Fuzzy text completion algorithm?
May 10, 2008 9:44 AM Subscribe
I'm looking for a fuzzy text-matching algorithm for an autocomplete widget.
I have a data set of ~4000 short strings (< 60 chars) that needs to be attached to an AJAXy autocomplete widget. I'm looking for a text completion algorithm that supports some amount of slop, to catch basic typos.
The environment is Prototype/LowPro on top of Ruby, backed by a MySQL database. I can pre-compute and store metaphones/soundex/&c. either in the database or in memory, as this list will remain mostly static throughout application lifetime.
The bad news is this feature is not mission-critical, so anything like compiling a user-defined SQL function (e.g. Levenshtein) is out of scope. The good news is this feature isn't mission-critical, so I am only solving for the quickest route to a reasonable set of suggestions for a given input string -- this doesn't need to be bulletproof.
What would you do, given these constraints?
posted by sonofslim to computers & internet (9 answers total) 6 users marked this as a favorite
I'd do to the user input the same thing you do to your dataset - lowercase both, take out spaces and punctuation ... Soundex on both ends, etc.
posted by adipocere at 11:26 AM on May 10, 2008