Create list of words with similar spelling

720 Views Asked by At

Given a word (English or non-English), how can I construct a list of words (English or non-English) with similar spelling?

For example, given the word 'sira', some similar words are:

  • sirra
  • seira
  • siara
  • saira
  • shira

I'd prefer this to be on the verbose side, meaning it should generate as many words as possible.

Preferably in Python, but code in any language is helpful.

The Australian Business Register ABN lookup tool (a tool that finds business registration numbers based on search keywords) does a good job of this.

Thanks

2

There are 2 best solutions below

2
Sam Dean On

For searching in databases use "LIKE"

The query you'd want is

SELECT * FROM `testTable` WHERE name LIKE "%s%i%r%a%
3
sophros On

The thing you are looking for is provided by ispell (and family) of dictionaries. There is a relatively easy interface via hunspell library.

The actual data (dictionaries) you can download from here (among other places, like OpenOffice plugin pages).

There is an interface to get a number of similar words based on the edit distance suggested in the comment. Going with the example from GitHub:

>>> import hunspell
>>> hobj = hunspell.HunSpell('/usr/share/hunspell/en_US.dic', '/usr/share/hunspell/en_US.aff')
>>> hobj.spell('spookie')
False
>>> hobj.suggest('spookie')
['spookier', 'spookiness', 'spook', 'cookie', 'bookie', 'Spokane', 'spoken']