How can I return a tuple with multiple fields from a Combiner/Reducer/Aggregator function?


The Storm documentation states: "A CombinerAggregator returns a single tuple with a single field as output."

What should I do to return a tuple with multiple fields from a Combiner function?

I am creating an aggregate function that should aggregate two or more values from the input tuple and emit those aggregated values as two or more output fields.

I also want to carry some fields of the input tuple through to the output. How can I use a Combiner function to get the required output?

Input tuple to the CombinerAggregator function:

("a", "b", "c" , "d")

Required output tuple:

("a", "b", "newValue1", "newValue2", "newValue3")

In the past, I tried building a model object from the tuple's fields in the init() method of the CombinerAggregator and returning that object as the single output field. But I don't feel that's the right solution. Does the chainedAgg() function work well in this kind of situation?

Any help will be greatly appreciated.

Stig Rohde Døssing

I think you probably want to use the more general Aggregator interface.

From the link you posted:

The most general interface for performing aggregations is Aggregator, which looks like this:

public interface Aggregator<T> extends Operation {
    T init(Object batchId, TridentCollector collector);
    void aggregate(T state, TridentTuple tuple, TridentCollector collector);
    void complete(T state, TridentCollector collector);
}

Aggregators can emit any number of tuples with any number of fields.
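To make that concrete, here is a minimal sketch of an aggregator that emits one five-field tuple per batch. It is not tied to the Storm jars: `Collector` and the `List<Object>` tuple are simplified stand-ins I made up for `TridentCollector` and `TridentTuple`, and I am assuming that "c" and "d" are numeric and that the three new values are a count and two sums, since the question does not say what they should be.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MultiFieldAggregatorSketch {

    // Stand-in for Storm's TridentCollector; only emit() is modelled here.
    interface Collector {
        void emit(List<Object> values);
    }

    // Same three-method shape as the Aggregator interface quoted above,
    // with TridentTuple simplified to a plain List<Object>.
    interface Aggregator<T> {
        T init(Object batchId, Collector collector);
        void aggregate(T state, List<Object> tuple, Collector collector);
        void complete(T state, Collector collector);
    }

    // Mutable per-batch state: pass-through fields plus running aggregates.
    static class State {
        Object a;
        Object b;
        long count; // hypothetical newValue1
        long sumC;  // hypothetical newValue2
        long sumD;  // hypothetical newValue3
    }

    static class MultiFieldAggregator implements Aggregator<State> {
        @Override
        public State init(Object batchId, Collector collector) {
            return new State();
        }

        @Override
        public void aggregate(State state, List<Object> tuple, Collector collector) {
            // Remember "a" and "b" from the first tuple so they can be passed through.
            if (state.count == 0) {
                state.a = tuple.get(0);
                state.b = tuple.get(1);
            }
            state.count++;
            state.sumC += ((Number) tuple.get(2)).longValue();
            state.sumD += ((Number) tuple.get(3)).longValue();
        }

        @Override
        public void complete(State state, Collector collector) {
            if (state.count == 0) {
                return; // empty batch: emit nothing
            }
            // A single output tuple carrying five fields.
            collector.emit(Arrays.<Object>asList(
                    state.a, state.b, state.count, state.sumC, state.sumD));
        }
    }

    // Calls init, aggregate (once per tuple) and complete for one batch.
    static List<List<Object>> run(List<List<Object>> batch) {
        List<List<Object>> out = new ArrayList<>();
        Collector collector = out::add;
        Aggregator<State> agg = new MultiFieldAggregator();
        State state = agg.init("batch-1", collector);
        for (List<Object> tuple : batch) {
            agg.aggregate(state, tuple, collector);
        }
        agg.complete(state, collector);
        return out;
    }

    public static void main(String[] args) {
        List<List<Object>> batch = Arrays.asList(
                Arrays.<Object>asList("x", "y", 1, 10),
                Arrays.<Object>asList("x", "y", 2, 20));
        System.out.println(run(batch));
    }
}
```

In a real topology you would probably extend Storm's BaseAggregator instead of these stand-ins and declare the five output field names in the aggregate(...) call on the stream. If "a" and "b" are really grouping keys, it may be simpler to groupBy them first: as far as I understand the Trident docs, the output of a grouped aggregation is the grouping fields followed by whatever the aggregator emits, so the aggregator would then only need to emit the three new values.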