Terms aggregation in the new Java client - is there a generic way to get the results?

When creating aggregations with the new Elasticsearch Java API client I ran into a problem. This is how I build my request. Some of my fields are of type string; others can be long, double, etc.

  // One terms aggregation per field, keyed by field name and sorted by bucket key.
  Map<String, Aggregation> termsAggregationMap = new HashMap<>();
  for (String fieldName : fieldNames)
        {
            Aggregation termsAggregation = new Aggregation.Builder()
                    .terms(new TermsAggregation.Builder().field(fieldName).order(
                            NamedValue.of("_key", SortOrder.Asc))
                            .build())
                    .build();

            termsAggregationMap.put(fieldName, termsAggregation);
        }

SearchRequest searchRequest = new SearchRequest.Builder()
                    .index(indexAliasName).size(0)
                    .aggregations(termsAggregationMap)
                    .build();

When I read the response, I found that I have to know the aggregation's key type: string, long, double, etc.

For example, to read a terms aggregation on a string field I need to use sterms:

Map<String, Aggregate> aggregationsMap = searchResponse.aggregations();
Aggregate termAggregation = aggregationsMap.get(aggregationName);
for (var entry : termAggregation.sterms().buckets().array())
{
    // do something with the bucket
    entry.key().stringValue();
    entry.docCount();
}

But to read long values I need to use the lterms method:

termAggregation.lterms().buckets().array()

Is there a generic way to do this, instead of checking the aggregation type every time, as was possible with the old client?

There is 1 solution below

G0l0s On

I explored your problem and couldn't find a more generic solution. The accessor methods lterms, sterms, dateHistogram and others are specific to each aggregate variant and cannot be called generically (the same goes for the isLterms, isSterms, isDateHistogram checks).

So my solution is an enumeration:

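import co.elastic.clients.elasticsearch._types.aggregations.Aggregate;
import co.elastic.clients.elasticsearch._types.aggregations.MultiBucketAggregateBase;

import java.util.Collections;
import java.util.List;

// Each constant knows how to recognise one Aggregate variant and unwrap it.
public enum AggregationKind {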
    TERMS_STRING {
        @Override
        boolean identify(Aggregate aggregation) {
            return aggregation.isSterms();
        }

        @Override
        MultiBucketAggregateBase<?> getSpecificAggregation(Aggregate aggregation) {
            return aggregation.sterms();
        }
    },
    TERMS_LONG {
        @Override
        boolean identify(Aggregate aggregation) {
            return aggregation.isLterms();
        }

        @Override
        MultiBucketAggregateBase<?> getSpecificAggregation(Aggregate aggregation) {
            return aggregation.lterms();
        }
    },
    DATE_HISTOGRAM {
        @Override
        boolean identify(Aggregate aggregation) {
            return aggregation.isDateHistogram();
        }

        @Override
        MultiBucketAggregateBase<?> getSpecificAggregation(Aggregate aggregation) {
            return aggregation.dateHistogram();
        }
    };

    abstract boolean identify(Aggregate aggregation);

    abstract MultiBucketAggregateBase<?> getSpecificAggregation(
            Aggregate aggregation);

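    // Returns the buckets of the first kind that matches, or an empty list
    // if the aggregate is none of the kinds declared above.
    public static List<?> getBucketList(Aggregate aggregation) {
        AggregationKind[] aggregationKinds = AggregationKind.values();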

        for (AggregationKind aggregationKind : aggregationKinds) {
            if (aggregationKind.identify(aggregation)) {
                MultiBucketAggregateBase<?> specificAggregation = aggregationKind
                        .getSpecificAggregation(aggregation);
                return specificAggregation.buckets().array();
            }
        }
        
        return Collections.emptyList();
    }
}

Usage:

SearchResponse<Void> response = elasticsearchClient.search(searchRequest, Void.class);
// Unchecked cast: only safe when the aggregated field is known to be a string field.
List<StringTermsBucket> buckets = (List<StringTermsBucket>) AggregationKind
        .getBucketList(response.aggregations().get(aggregationName));

As you can see, this solution applies to all Elasticsearch bucket aggregations, that is, to every direct or transitive subclass of MultiBucketAggregateBase, provided a matching constant is added to the enum.
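
One caveat with the usage above: getBucketList returns List<?>, so the cast to List<StringTermsBucket> is unchecked and only fails later if the field turns out to be numeric. A possible way around that is to inspect each bucket's runtime type instead of casting the whole list. The following is only a minimal, self-contained sketch of that idea. StringBucket, LongBucket, describe and describeAll are hypothetical stand-ins that mimic the shape of the client's bucket classes so the example runs without the Elasticsearch dependency; with the real client you would test against StringTermsBucket, LongTermsBucket and so on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins that only mimic the shape of the client's bucket
// classes; they are NOT the real Elasticsearch types.
class BucketKeySketch {

    static abstract class Bucket {
        final long docCount;

        Bucket(long docCount) {
            this.docCount = docCount;
        }
    }

    static final class StringBucket extends Bucket {
        final String key;

        StringBucket(String key, long docCount) {
            super(docCount);
            this.key = key;
        }
    }

    static final class LongBucket extends Bucket {
        final long key;

        LongBucket(long key, long docCount) {
            super(docCount);
            this.key = key;
        }
    }

    // Renders one bucket as "key=docCount", checking its runtime type
    // instead of relying on an unchecked cast of the whole list.
    static String describe(Object bucket) {
        if (bucket instanceof StringBucket) {
            StringBucket b = (StringBucket) bucket;
            return b.key + "=" + b.docCount;
        }
        if (bucket instanceof LongBucket) {
            LongBucket b = (LongBucket) bucket;
            return b.key + "=" + b.docCount;
        }
        throw new IllegalArgumentException("Unsupported bucket type: " + bucket.getClass());
    }

    // Applies describe to every element of an untyped bucket list.
    static List<String> describeAll(List<?> buckets) {
        List<String> result = new ArrayList<>();
        for (Object bucket : buckets) {
            result.add(describe(bucket));
        }
        return result;
    }

    public static void main(String[] args) {
        List<?> buckets = Arrays.asList(new StringBucket("red", 3), new LongBucket(42L, 7));
        System.out.println(describeAll(buckets)); // prints [red=3, 42=7]
    }
}
```

The calling code then works on a single normalized form (here a String) and never needs to know in advance whether a field produced string or long buckets; an unexpected bucket type fails immediately with a clear message.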