In Elasticsearch I'm running a multi_match query over three fields (field1, field2, field3). I now want to calculate the 75th percentile of the _score values with an aggregation (aggs). The calculation should take place within the Elasticsearch query itself.
query = {
    "size": 25,
    "query": {
        "multi_match": {
            "query": "keyphrase",
            "fields": ["field1", "field2", "field3"]
        }
    },
    "aggs": {
        "percentile_score": {
            "percentiles": {
                "field": "_score",
                "percents": [75.0]
            }
        }
    }
}
response = client.search(index=INDEX_NAME, body=query)
for hit in response["hits"]["hits"]:
    print(f"Score: {hit['_score']}")
Score: 9.517459
Score: 8.774883
...
Score: 5.489334
Score: 4.481924
response["aggregations"]["percentile_score"]["values"]["75.0"]
I expect the 75th percentile to be returned, but I only get the value None.
First of all, aggregations don't depend on the hits you get back. You can request 0, 10, 100 or 1000 hits and you will get exactly the same aggregation result every time. This is because aggregations are calculated over the entire result set (every document that matches the query), not just over the first 10 or 25 hits that you happen to retrieve.
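The second issue is that running a percentiles aggregation on _score as a field is not supported by Elasticsearch, because _score is computed at query time rather than stored as a document field. That is why you get None, and this is unlikely to change in the near future.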
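That said, aggregation scripts can read the _score variable, so a scripted variant of your aggregation may be worth trying. The following is only a sketch: the helper name build_score_percentile_query is made up for illustration, and it assumes your Elasticsearch version still accepts a script inside the percentiles aggregation. Keep in mind that it would still cover every matching document, not only the 25 returned hits.

```python
def build_score_percentile_query(keyphrase, fields, percents=(75.0,)):
    """Build a search body whose percentiles aggregation reads _score via a script.

    Hypothetical helper for illustration; not part of the Elasticsearch client.
    """
    return {
        "size": 25,
        "query": {
            "multi_match": {"query": keyphrase, "fields": list(fields)}
        },
        "aggs": {
            "percentile_score": {
                "percentiles": {
                    # Assumption: scripts are allowed in this aggregation
                    # on the Elasticsearch version in use.
                    "script": {"source": "_score"},
                    "percents": list(percents),
                }
            }
        },
    }


query = build_score_percentile_query("keyphrase", ["field1", "field2", "field3"])
```

The resulting dict can be passed to client.search exactly like the body in the question.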
I would love to suggest an alternative, but I don't know what you expect the 75th percentile of the _score of the first 25 hits to represent. In other words, what meaning are you trying to extract from this number?
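If it turns out that all you need is the 75th percentile of the scores of the hits you actually retrieved, one option is to compute it on the client after the search returns. The sketch below uses a hypothetical percentile helper with linear interpolation between ranks, and the scores list is made-up sample data standing in for [hit["_score"] for hit in response["hits"]["hits"]]. Elasticsearch's own percentiles are approximate, so the numbers may not match a server-side result exactly.

```python
def percentile(values, pct):
    """Return the pct-th percentile (0-100) using linear interpolation between ranks."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted list.
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Made-up sample scores; in practice collect them from the returned hits.
scores = [1.0, 2.0, 3.0, 4.0, 5.0]
p75 = percentile(scores, 75)
```

With this approach the size parameter of the query directly controls which scores go into the calculation.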