django-haystack with Elasticsearch: how can I get fuzzy (word similarity) search to work?

I'm using elasticsearch==2.4.1 and django-haystack==3.0 with Django==2.2 using an Elasticsearch instance version 2.3 on AWS.

I'm trying to implement a "Did you mean...?" using a similarity search.

I have this model:

from django.db import models
from safedelete.models import SafeDeleteModel


class EquipmentBrand(SafeDeleteModel):
    name = models.CharField(
        max_length=128,
        null=False,
        blank=False,
        unique=True,
    )

The following index:

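from haystack import fields
from haystack.indexes import Indexable, SearchIndex


class EquipmentBrandIndex(SearchIndex, Indexable):
    # Edge n-grams of the brand name are the searchable document text
    text = fields.EdgeNgramField(document=True, model_attr="name")

    def index_queryset(self, using=None):
        return self.get_model().objects.all()

    def get_model(self):
        return EquipmentBrand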

And I'm searching like this:

results = SearchQuerySet().models(EquipmentBrand).filter(content=AutoQuery(q))

When name is "Example brand", these are my actual results:

q='Example brand' -> Found
q='bra' -> Found
q='xam' -> Found
q='Exmple' -> *NOT FOUND*

I'm trying to get the last example to work, i.e. finding the item if the word is similar.

My goal is to suggest items from the database in case of typos.

What am I missing to make this work?

Thanks!

There is 1 solution below

ThoughtfulHacking

I don't think you want to be using EdgeNgramField. "Edge" n-grams, from the Elasticsearch Docs:

emits N-grams of each word where the start of the N-gram is anchored to the beginning of the word.

It's intended for autocomplete: it only matches strings that are prefixes of a word in the target. So, when the target document includes "example", the searches that work are "e", "ex", "exa", "exam", ...
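
To make the prefix behaviour concrete, here is a minimal pure-Python sketch of what an edge n-gram filter emits for a single word. The helper name edge_ngrams and the gram sizes (1 to 15) are assumptions for illustration only; this is not haystack's or Elasticsearch's actual implementation or defaults.

```python
def edge_ngrams(word, min_gram=1, max_gram=15):
    """Return the prefixes of `word` with lengths from min_gram to max_gram.

    Hypothetical helper that mimics the idea of an edge n-gram filter.
    """
    word = word.lower()
    upper = min(max_gram, len(word))
    return [word[:n] for n in range(min_gram, upper + 1)]


grams = edge_ngrams("Example")
print(grams)
print("exam" in grams)    # a prefix of the word, so it is among the indexed grams
print("exmple" in grams)  # not a prefix, so no indexed gram equals it
```

Every emitted gram starts at the first letter of the word, which is why a query with a dropped letter in the middle has nothing to line up with.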

"Exmple" is not one of those strings. Try using plain NgramField.
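
As a rough illustration of why plain n-grams are more forgiving, the sketch below compares the 3-grams of the indexed word with those of the misspelled query. The helper ngrams and the gram size of 3 are assumptions chosen for the example; whether a partial overlap like this counts as a hit in practice depends on the analyzer and the query operator your backend uses.

```python
def ngrams(word, n=3):
    """Return the set of all substrings of length n in `word` (lowercased).

    Hypothetical helper that mimics the idea of a plain n-gram tokenizer.
    """
    word = word.lower()
    return {word[i:i + n] for i in range(len(word) - n + 1)}


indexed = ngrams("Example")  # grams stored for the document
query = ngrams("Exmple")     # grams produced from the misspelled search term
shared = indexed & query     # grams the two have in common

print(sorted(indexed))
print(sorted(query))
print(sorted(shared))
```

Because the grams are not anchored to the start of the word, the part of "Exmple" after the typo still yields grams that also occur in "Example".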

Also, please consider upgrading. So much has been fixed and improved since ES 2.4.1.