Metaphone 3 is the latest generation of the Metaphone family of ‘phonetic encoding’ algorithms. A phonetic encoding algorithm takes a word, spelled correctly or incorrectly, or a name, and returns a ‘phoneticised’ key value that should be the same for all words that are pronounced similarly. Programmers can use this key to help the user find the word or name they are looking for even if they are not sure of the exact spelling. Both Metaphone and Double Metaphone are widely used in spell checkers, search engines, and genealogy sites.
Please keep in mind that Metaphone 3 is highly specialized to account for the peculiarities and inconsistencies of English pronunciation, and is designed to work with all letters available in the Windows-1252 code page. It works less well for other languages; versions of the Metaphone algorithm for other languages will be coming soon.
Metaphone 3 can be downloaded as OS-independent C++, Java, C#, PHP, Perl, or PL/SQL source code.
Metaphone 3 was developed using a database of over 100,000 words prepared with correct encodings to test against, and achieves 98% accuracy against that database. The result, if you read the source code carefully, amounts to a practically comprehensive description of variations and exceptions in the pronunciation of words in English.
Because of this increased accuracy, it became possible to allow the programmer more flexibility in determining how closely the encoding should match the pronunciation of the target words.
Metaphone and Double Metaphone encodings drop non-initial vowels, and also map voiced/unvoiced consonant pairs to the same coded symbol. (Some voiced/unvoiced consonant pairs in English are B/P, V/F, D/T, and Z/S.) This is useful in cases where adjacent vowels and consonants have been transposed, or where voiced and unvoiced consonants are so close together in sound that the user might confuse them. On the other hand, the result set may be too large, or the user might complain that the results don't 'sound' close enough to the search string they typed in.
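To make the trade-off concrete, here is a minimal Python sketch of the two simplifications just described. It is a toy, not the rule set of Metaphone, Double Metaphone, or Metaphone 3: the function name toy_key and the four-pair table are assumptions made for illustration, and real encoders apply many more pronunciation rules.

```python
# Toy illustration only: NOT the Metaphone, Double Metaphone, or Metaphone 3 rules.
VOWELS = set("AEIOUY")

# Fold each voiced consonant onto its unvoiced partner (the pairs named in the text).
VOICED_TO_UNVOICED = {"B": "P", "V": "F", "D": "T", "Z": "S"}


def toy_key(word: str) -> str:
    """Drop non-initial vowels and map voiced/unvoiced pairs to one symbol."""
    out = []
    for ch in word.upper():
        if not ch.isalpha():
            continue  # ignore digits, punctuation, and spaces
        if ch in VOWELS:
            if not out:  # nothing emitted yet, so this is the initial letter
                out.append(ch)
            continue  # every later vowel is dropped
        out.append(VOICED_TO_UNVOICED.get(ch, ch))
    return "".join(out)


print(toy_key("trail"), toy_key("trial"))  # transposed vowels give the same key
print(toy_key("bat"), toy_key("pad"))      # B/P and D/T collapse to the same key
```

In this sketch "trail" and "trial" share a key, which is the helpful case, but so do "bat" and "pad", which shows how the result set can grow larger than the user expects.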
Metaphone 3 allows this encoding, but adds the flexibility to set the encoding to include non-initial vowels, or to map voiced/unvoiced pairs to different encodings, or both. If you have prepared indexes for some combination of these more exact encodings as well as for the regular encoding, you now have much more freedom in returning results to the user. If there are very few matches, you might return the less exact matches according to the existing standard of fuzziness; if there are too many matches, or plenty of "exact" matches, you can return a more focussed set.
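The tiered strategy described above can be sketched in Python as follows. Everything here is hypothetical: toy_key is a stand-in encoder, not Metaphone 3, and the option names keep_vowels and split_voicing, the helpers build_index and tiered_lookup, and the min_results threshold are all invented for the example. The toy also keeps vowel letters verbatim in exact mode, which a real encoder need not do.

```python
from collections import defaultdict

# Toy stand-in encoder; NOT the Metaphone 3 rule set.
VOWELS = set("AEIOUY")
VOICED_TO_UNVOICED = {"B": "P", "V": "F", "D": "T", "Z": "S"}

# Hypothetical option sets for the two indexes.
FUZZY = {"keep_vowels": False, "split_voicing": False}
EXACT = {"keep_vowels": True, "split_voicing": True}


def toy_key(word, keep_vowels=False, split_voicing=False):
    """Encode a word; the two flags make the key progressively more exact."""
    out = []
    for ch in word.upper():
        if not ch.isalpha():
            continue
        if ch in VOWELS:
            if keep_vowels or not out:  # the initial vowel is always kept
                out.append(ch)
            continue
        out.append(ch if split_voicing else VOICED_TO_UNVOICED.get(ch, ch))
    return "".join(out)


def build_index(words, options):
    """Map each key to the list of words that produce it."""
    index = defaultdict(list)
    for word in words:
        index[toy_key(word, **options)].append(word)
    return index


def tiered_lookup(query, exact_index, fuzzy_index, min_results=1):
    """Prefer the focused match set; fall back to the fuzzy one if it is too small."""
    exact = exact_index.get(toy_key(query, **EXACT), [])
    if len(exact) >= min_results:
        return exact
    return fuzzy_index.get(toy_key(query, **FUZZY), [])


words = ["bat", "pat", "pad", "bit"]
exact_index = build_index(words, EXACT)
fuzzy_index = build_index(words, FUZZY)

print(tiered_lookup("bat", exact_index, fuzzy_index))  # enough exact matches
print(tiered_lookup("bot", exact_index, fuzzy_index))  # falls back to fuzzy matches
```

Keeping one index per encoding costs extra storage, but it lets the application decide at query time how fuzzy the returned set should be.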
We don’t know of any other phonetic encoding algorithm that does this.