I do a short query to get some data from MongoDB using MongoDB Java Driver v 3.2. I am using Bucket and Unwind aggregation to make my work and result a bit meaningful. I made a filter based on a specific id field (IdManfc) and a specific signal field (mysignal). I aim to have a result using bucket aggregation to group my data based on some boundaries(hour by hour timestamp). I want to show the id and signal in my result and that is why I used a traditional SQL trick (min(id), because I filtered only one id) and (min(signal), because I filtered only one signal) to include my id and signal in the result. Now I want to change a bit my theory, I want to imagine that I only have one id but many signal and I want to include them in my result in a way that id is one but information for each signal is different. Or what if I have many Ids and many signals and I want to include them in my result. Does anyone have any idea?
Some hint on what id and signal means in my code, id refers to a machine in a factory and signal means the type of work that machine does. Let's say a computer in the factory has the id of 12345 and it does 3 types of work which are identified by signals, one signal is printing(signal name X), the second signal is scanning(Y), the third signal is not working(Z). Now I want to make a bucket specifying machine-id(12345) and its type of work grouping hour by hour and calculating the min and max of the time spent( which is the value) for each signal.
Bson startTimeBucket = match( gte("lastTimestamp",startTime));
Bson endTimeBucket= match(lte("lastTimestamp",finishTime));
Bson idFilterBucket = match(eq("id", IdManfc));
Bson unwindBucket = unwind("$signals");
Bson filterSignalUnBucket = match(eq("signals.signal", mysignal));
Bson Bucket=bucket("$lastTimestamp", timeBucketList, new BucketOptions()
.defaultBucket("sum")
.output(min("id","$id"),
min("signal","$signals.signal"),
avg("Average",Document.parse("{$toDouble: \"$signals.value\"}")),
stdDevSamp("STD",Document.parse("{$toDouble: \"$signals.value\"}")),
max("Max",Document.parse("{$toDouble: \"$signals.value\"}")),
min("Min",Document.parse("{$toDouble: \"$signals.value\"}"))
)
);
List<Document> resultbuckt = coll.aggregate(asList(startTimeBucket,
endTimeBucket,
idFilterBucket,
unwindBucket,
filterSignalUnBucket,
Bucket
)).into(new ArrayList<Document>());
for (Document Document : resultbuckt) {
System.out.println(Document);
My data in the database: enter image description here
A corresponding JSON:
{"_id":{"$oid":"5db06b6880bd68ca32a853bc"},"disconnected":false,"id":"144-12","lastTimestamp":{"$numberLong":"1571842909626"},"manufacturer":"gmb","modelName":"GMB_Mod1","signals":[{"event":"changed","signal":"InterfaceType.Moulds.Mould_1.TemperatureZones.TemperatureZone_4.ActualTemperature","stability":"unstable","timestamp":{"$numberLong":"1571842907576"},"type":"DOUBLE","value":"175.0","writable":"read","written":false},{"event":"changed","signal":"InterfaceType.InjectionUnits.InjectionUnit_1.TemperatureZones.TemperatureZone_3.ActualTemperature","stability":"unstable","timestamp":{"$numberLong":"1571842907576"},"type":"DOUBLE","value":"35.0","writable":"read","written":false}],"tC":{"$numberInt":"10"}}
{"_id":{"$oid":"5db069c480bd68ca32a841ec"},"disconnected":false,"id":"144-12","lastTimestamp":{"$numberLong":"1571842499744"},"manufacturer":"gmb","modelName":"GMB_Mod1","signals":[{"event":"changed","signal":"InterfaceType.InjectionUnits.InjectionUnit_1.TemperatureZones.TemperatureZone_1.ActualTemperature","stability":"unstable","timestamp":{"$numberLong":"1571842496543"},"type":"DOUBLE","value":"65.0","writable":"read","written":false}],"tC":{"$numberInt":"10"}}
2 lines of my result which is the same signal and different time(because i group the result using bucket and time boundaries:
Document{{_id=1574434800859, id=144-12, signal=signal 1, Average=69.86274509803921, STD=1.2003267528991806, Max=71.0, Min=68.0}}
Document{{_id=1578444800859, id=144-12, signal=signal 1, Average=69.86274509803921, STD=1.2003267528991806, Max=72.0, Min=69.0}}
In my code i specifically filtered one signal and then to see it in my result i use the min(signal). what I expect to see from the result is something similar to what I already got. but I would like to have also the results for other signals within that time bounderies("_id), and remove the filter for only one signal, i expect something like below:
Document{{_id=1574434800859, id=144-12, signal=**the name of 1st signal**, Average=69.86274509803921, STD=1.20035267528991806, Max=80.0, Min=68.0}}
Document{{_id=1574434800859, id=144-12, signal=**the name 2nd signal**, Average=69.86274509803921, STD=12.2003267528991806, Max=95.0, Min=68.0}}
Document{{_id=1574434800859, id=144-12, signal=**the name of 3rd signal**, Average=69.86274509803921, STD=0.2003267528991806, Max=85.0, Min=68.0}}
Does anyone have any idea?
Thank you