Elasticsearch - Case insensitive aggregation not working


ES version - 7.17.7

I have an index on which I'm running an aggregation to get all values of a field that match a certain regex. The match should be case-insensitive, i.e.

new york should match New York, NEW YORK and New YORK.

I have added a lowercase_normalizer to the index so that the name is indexed in lowercase. This doesn't solve the issue, though: now I have to pass the regex in lowercase, otherwise it doesn't return the correct results.

I've created an index named blah with the following mapping:

{
  "blah": {
    "mappings": {
      "properties": {
        "name": {
          "type": "text",
          "fields": {
            "raw": {
              "type": "keyword",
              "normalizer": "lowercase_normalizer"
            }
          }
        }
      }
    }
  }
}

Index settings:

{
  "blah": {
    "settings": {
      "index": {
        "max_ngram_diff": "20",
        "analysis": {
          "normalizer": {
            "lowercase_normalizer": {
              "filter": [
                "lowercase"
              ],
              "type": "custom"
            }
          }
        }
      }
    }
  }
}
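
One way to check what this normalizer does to a value is the _analyze API, which accepts a normalizer parameter. A minimal sketch, assuming the body below is sent as GET blah/_analyze (the sample text is just one of the document names from this question):

```json
{
  "normalizer": "lowercase_normalizer",
  "text": "NEW YORK"
}
```

The response should contain a single token, new york, which is the form in which the name.raw terms are stored and therefore the form the aggregation sees.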

Documents inserted:

blah/_doc/1

{
  "name": "NEW YORK"
}

blah/_doc/2

{
  "name": "New Orleans"
}

blah/_doc/3

{
  "name": "New Hampshire"
}

The following aggregation query doesn't return the expected results. Query:

{
  "aggregations": {
    "autoComplete": {
      "terms": {
        "field": "name.raw",
        "include": "New.*",
        "order": [
          {
            "_term": "asc"
          }
        ]
      }
    }
  },
  "size": 0
}

Output

  "aggregations": {
    "autoComplete": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [

      ]
    }
  }

EDIT: Another downside is that if I run the aggregation after lowercasing the include value, the results are also normalised. Is it possible to have the original values returned?

There are 2 best solutions below

1
Rahul Prasad On

Try a search analyzer. For more details, see https://www.elastic.co/guide/en/elasticsearch/reference/7.17/search-analyzer.html

0
G0l0s On

It works if you change the include line to "include": "new.*" (the terms have already been lowercased by the normalizer) and the order line to "_key": "asc" in the query.
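
Put together, the changed query might look like the sketch below. The sub-aggregation named original is an addition that is not part of this answer: the name is hypothetical, and it uses top_hits to pull one document per bucket so that the unnormalized name can be read from _source, which may help with the EDIT in the question. The bucket keys themselves would still be lowercase, and if several documents differ only in case, only one spelling per bucket is shown.

```json
{
  "size": 0,
  "aggregations": {
    "autoComplete": {
      "terms": {
        "field": "name.raw",
        "include": "new.*",
        "order": [
          {
            "_key": "asc"
          }
        ]
      },
      "aggregations": {
        "original": {
          "top_hits": {
            "size": 1,
            "_source": {
              "includes": [
                "name"
              ]
            }
          }
        }
      }
    }
  }
}
```

With the three sample documents from the question, this should return the buckets new hampshire, new orleans and new york, each carrying one hit whose _source.name holds the value as it was originally indexed (for example NEW YORK). The include pattern still has to be lowercased on the client side before it is sent.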