Phonetic Algorithms

Learn to represent words by their sound to detect hidden similarities.

The German “Schwarz,” the English “Shvarts,” and the Russian “Шварц” are spelled differently, but they share a significant similarity—they sound alike. A phonetic algorithm aims to make their strings look alike by encoding a word’s sound.

The list of phonetic algorithms is long. Some work well for a single language, while others try to serve multiple. Some consist of just a handful of rules, and others of several hundred. Some respond with a single encoding, others with a whole array to account for ambiguity. The following three algorithms represent the spectrum for English-centric texts well:

  • Metaphone: It has a simple rule set limited to English pronunciation.

  • Double Metaphone: It focuses on English and frequently used words of foreign origin.

  • Beider-Morse: It focuses not just on English but several languages, including English.

We illustrate each by example using the abydos package.

Get hands-on with 1300+ tech skills courses.