More Like This query in Elasticsearch


I'm trying to build content-based recommendations on Amazon product data, which is stored in an Elasticsearch index named 'amazon_products'.

I followed https://elasticsearch-dsl.readthedocs.io/en/latest/search_dsl.html#more-like-this-query to run a More Like This (MLT) query with the Elasticsearch Python client, but when I try it I get no results.

Here is my code:

import os

os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.elasticsearch:elasticsearch-hadoop:7.7.1 pyspark-shell'

from elasticsearch import Elasticsearch
es = Elasticsearch()

from elasticsearch_dsl import Search
from elasticsearch_dsl.query import MoreLikeThis

dsl_search = Search(index='amazon_products').using(es)

input_product = 'Ginger'

content_dsl = dsl_search.query(MoreLikeThis(like=input_product, fields=['brand']))

response = content_dsl.execute()
print(response)

for hit in response:
    print(hit)

The response is empty ({}) even though 'Ginger' appears in the 'brand' field. Why does this happen?

After creating the amazon_products index and mapping the data into it, I can run ordinary search queries such as:

es.search(index="amazon_products", q="main_category:Refrigerators", size=3)

which works fine and returns the expected results.

But I don't understand why MLT doesn't work on my data. Can someone please help me resolve this? What should I do?

How else can I perform an MLT query in Python?


There is 1 solution below

shade27 On

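Add these two parameters and it should work. The defaults (min_term_freq = 2, min_doc_freq = 5) discard terms that are too rare, which leaves nothing to match for a short input like 'Ginger'.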

  • min_term_freq = 1
  • min_doc_freq = 1
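
The effect of the first threshold can be sketched with a toy model. The helper select_terms below is hypothetical and only mimics the idea behind min_term_freq with naive whitespace tokenization; real Elasticsearch analysis and term selection are more involved, so treat this as an illustration rather than what the server actually runs.

```python
from collections import Counter


def select_terms(text, min_term_freq=2):
    """Toy model: keep terms that occur at least min_term_freq times in the input.

    Assumption: lowercase + whitespace split stands in for the real analyzer.
    """
    counts = Counter(text.lower().split())
    return sorted(term for term, tf in counts.items() if tf >= min_term_freq)


# With the default threshold, the single word "Ginger" is dropped.
print(select_terms("Ginger"))
# Lowering the threshold to 1 keeps it as a query term.
print(select_terms("Ginger", min_term_freq=1))
```

In this simplified view, a one-word input never reaches a term frequency of 2, so no terms survive and the query has nothing to search for.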

For example, where s is an existing Search object:

s = s.query(MoreLikeThis(like={"_id": 3006}, fields=['title'], min_term_freq=1, min_doc_freq=1))
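
As for running MLT another way in Python, one option is to build the raw query body as a plain dictionary and send it with the low-level client. The following is a minimal sketch applied to the question's 'brand' field; build_mlt_body is a hypothetical helper, and the commented es.search call assumes a 7.x client like the one in the question.

```python
def build_mlt_body(like, fields, min_term_freq=1, min_doc_freq=1):
    """Build a more_like_this request body with lowered frequency thresholds.

    Hypothetical helper for illustration; it only assembles a dictionary.
    """
    return {
        "query": {
            "more_like_this": {
                "fields": list(fields),
                "like": like,
                "min_term_freq": min_term_freq,
                "min_doc_freq": min_doc_freq,
            }
        }
    }


body = build_mlt_body("Ginger", ["brand"])
print(body)

# Assuming `es` is the Elasticsearch() client from the question, the body
# could then be sent roughly like this:
# response = es.search(index="amazon_products", body=body)
```

Keeping the body as a dictionary makes it easy to print and compare against what elasticsearch-dsl generates via its to_dict() method.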