Custom fuzzy

Lukas Möller requested to merge custom-fuzzy into staging

This MR uses a custom scoring function that should be optimized for our use case: it scores common abbreviations higher than other scoring methods would. DM, DiskMath, Diskre, and Math are all considered perfect matches, but iskrete, D*skrete, and Dath aren't. The DP table has a similar structure to the DP solution for Levenshtein distance, but each cell also keeps an additional metadata value: the last match that resulted in the optimal solution for that subproblem. Matches are tracked in the matrix as well, so that they can be displayed. Calculating a single score takes O(n·m) time, where n and m are the lengths of the two strings. This might sound bad at first, but it isn't in practice because there aren't that many different categories (let's fix that problem when we get to it). It also doesn't freeze completely when the search term is too long (which certain other MRs did).
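
For readers who want a feel for this kind of scoring, here is a minimal Python sketch. It is not the code from this MR. The name `fuzzy_score`, the weights `FULL` and `LOOSE`, the word-boundary rule, and the example category name "Diskrete Mathematik" are all assumptions made for illustration. Instead of storing the last match as metadata in each cell, the sketch keeps a second table, `ending`, for solutions that end in a match, which is a simpler way to reward runs of consecutive characters.

```python
from typing import List, Tuple

# Hypothetical weights: a query character earns FULL points when it matches
# at a word start or directly continues the previous match, LOOSE otherwise.
FULL, LOOSE = 2, 1
NONE = -1  # marks "no solution ends in a match at this cell"


def fuzzy_score(query: str, target: str) -> Tuple[float, List[int]]:
    """Return (score in [0, 1], matched indices into the lowercased target)."""
    q, t = query.lower(), target.lower()
    n, m = len(q), len(t)
    if n == 0:
        return 1.0, []  # assumption: an empty query matches everything

    # best[i][j]: most points for query[:i] against target[:j].
    # ending[i][j]: same, but query[i-1] must be matched to target[j-1].
    best = [[0] * (m + 1) for _ in range(n + 1)]
    ending = [[NONE] * (m + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if q[i - 1] == t[j - 1]:
                word_start = j == 1 or not t[j - 2].isalnum()
                ending[i][j] = best[i - 1][j - 1] + (FULL if word_start else LOOSE)
                if ending[i - 1][j - 1] != NONE:  # continues a run of matches
                    ending[i][j] = max(ending[i][j], ending[i - 1][j - 1] + FULL)
            # Skip a target character, skip a query character, or match here.
            best[i][j] = max(best[i][j - 1], best[i - 1][j], ending[i][j])

    # Walk back through the tables to recover which target characters matched.
    matches: List[int] = []
    i, j, in_run = n, m, False
    while i > 0 and j > 0:
        if in_run or best[i][j] == ending[i][j]:
            matches.append(j - 1)
            prev = ending[i - 1][j - 1]
            in_run = prev != NONE and ending[i][j] == prev + FULL
            i, j = i - 1, j - 1
        elif best[i][j] == best[i][j - 1]:
            j -= 1
        else:
            i -= 1
    matches.reverse()
    return best[n][m] / (FULL * n), matches


if __name__ == "__main__":
    for term in ("DM", "DiskMath", "Diskre", "Math", "iskrete", "D*skrete", "Dath"):
        print(term, fuzzy_score(term, "Diskrete Mathematik"))
```

Under these assumed weights, DM, DiskMath, Diskre, and Math score 1.0 against "Diskrete Mathematik", while iskrete, D*skrete, and Dath score below 1.0. Both loops touch every cell once, so the sketch also runs in O(n·m) time.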

Edited by Lukas Möller
