The docs in my index have the following fields:
{
"weight" : int
"tags" : string[]
}
tags is a list of strings, e.g. ["A", "B", "C", "D"]. Let's assume my index has the following data:
[
{
"weight": 1,
"tags": [
"B",
"C"
]
},
{
"weight": 2,
"tags": [
"A"
]
},
{
"weight": 3,
"tags": [
"B"
]
},
{
"weight": 4,
"tags": [
"A",
"C"
]
},
{
"weight": 5,
"tags": [
"C"
]
}
]
I have a param priority = ["A", "C"]. I want to fetch documents based on the priority list. Since "A" appears first in the list, the docs with tag "A" should appear first in the output. If two docs have the same tag, the doc with the greater weight should appear first. So the output should be:
[
{
"weight": 4,
"tags": [
"A",
"C"
]
},
{
"weight": 2,
"tags": [
"A"
]
},
{
"weight": 5,
"tags": [
"C"
]
},
{
"weight": 1,
"tags": [
"B",
"C"
]
}
]
Can we achieve this in Elasticsearch? I have also heard about Painless scripts. How could we use Painless scripts here, if at all?
The first thing you need to know is that the tags indexed in the tags array are not necessarily indexed in the same order as you specify them in the source. Usually, the lexical order prevails, and while it works with simple letters like A, B and C, your real tags might be different and not listed in lexical order. To sum up, you cannot count on the order of the tags list in order to boost certain documents relative to others.

Similarly, if you were to specify a terms clause in your query to give more importance to A over C (as in priority = ["A", "C"]), ES would not necessarily use that order to execute your query.

The solution I'm giving you below respects the conceptual ordering of your priority, by using a bool/should query, where the first element has a bigger boost factor than the second, the second has a bigger boost factor than the third, etc. In this case, we should boost A over C so I'm giving documents having tag A a boost of 2 and the ones with tag C a boost of 1. If you had three tags, you would start at 3, instead. This will properly boost the documents as per your desired priorities.

The next part is to account for documents having equal score, and for this we can simply sort by descending weight:
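A minimal sketch of such a request body, built here as a Python dict so it can be printed as JSON and pasted into a _search call. A few things in it are my assumptions rather than anything stated above: the tags field is assumed to be mapped as keyword (use tags.keyword if it is a text field with a keyword sub-field), the helper name build_priority_query is hypothetical, and each term is wrapped in constant_score so that a matching clause contributes exactly its boost instead of a relevance score scaled by the boost.

```python
import json


def build_priority_query(priority):
    """Build a search body that boosts earlier tags in `priority` more.

    The first tag gets boost len(priority), the next one len(priority) - 1,
    and so on down to 1 (hypothetical helper, not an Elasticsearch API).
    """
    n = len(priority)
    should = [
        {
            # Assumption: constant_score makes each matching clause add
            # exactly `boost` to the document score.
            "constant_score": {
                "filter": {"term": {"tags": tag}},  # assumes a keyword field
                "boost": n - i,
            }
        }
        for i, tag in enumerate(priority)
    ]
    return {
        "query": {
            "bool": {
                "should": should,
                # Already the default when only should clauses are present;
                # spelled out so that non-matching docs are clearly excluded.
                "minimum_should_match": 1,
            }
        },
        # Highest score first, then highest weight among equal scores.
        "sort": [{"_score": "desc"}, {"weight": "desc"}],
    }


body = build_priority_query(["A", "C"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request against the index. No Painless script is involved in this sketch; the ordering comes only from the boosts and the secondary sort on weight.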
The above query, when executed over your sample set of documents, would yield the results you expect:
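As a rough check of that claim, the ranking can be simulated locally in plain Python without a cluster. This is only a sketch under the assumption that each matching tag contributes exactly its boost to the score (2 for A, 1 for C) and that ties are broken by descending weight; the function name rank is hypothetical and real Elasticsearch scoring may differ if the clauses are not constant-score.

```python
docs = [
    {"weight": 1, "tags": ["B", "C"]},
    {"weight": 2, "tags": ["A"]},
    {"weight": 3, "tags": ["B"]},
    {"weight": 4, "tags": ["A", "C"]},
    {"weight": 5, "tags": ["C"]},
]


def rank(docs, priority):
    """Order docs by summed tag boost, then by descending weight."""
    n = len(priority)
    # First tag in the priority list gets the largest boost.
    boost = {tag: n - i for i, tag in enumerate(priority)}
    scored = []
    for doc in docs:
        score = sum(boost.get(tag, 0) for tag in set(doc["tags"]))
        if score > 0:  # docs matching no priority tag are not returned
            scored.append((score, doc))
    scored.sort(key=lambda pair: (-pair[0], -pair[1]["weight"]))
    return [doc for _, doc in scored]


ranked = rank(docs, ["A", "C"])
for doc in ranked:
    print(doc["weight"], doc["tags"])
# prints:
# 4 ['A', 'C']
# 2 ['A']
# 5 ['C']
# 1 ['B', 'C']
```

The document with only tag B is dropped because it matches neither priority tag. One caveat of summing linear boosts: with three priority tags boosted 3, 2 and 1, a document carrying only the second and third tags would score 3 and tie with a document carrying only the first, so the secondary sort on weight would then decide between them.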