I am trying to write a Flux scheduled task in InfluxDB to sample monitoring data points over a period of time and obtain percentiles. Here is the code I have so far:
from(bucket: "dev-10s")
    |> range(start: -30m)
    |> filter(fn: (r) => r._measurement == "monitor")
    |> group(columns: ["metrics", "instanceId"])
    // 80th percentile of each group
    |> quantile(q: 0.8)
    // relabel the result and stamp it with the end of the window
    |> map(fn: (r) => ({ r with _measurement: "80%", _field: "value", _time: r._stop }))
    |> to(bucket: "dev-30m", org: "rainge")
However, I need every percentile from 10 to 100 in steps of 10. If I have to call the quantile function separately for each percentile, the same data gets processed ten times, which is too expensive. I would like to calculate all of the percentiles in a single pass, similar to the following C++ code:
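#include <algorithm>
#include <cmath>
#include <vector>
using namespace std;

// True when a >= b, treating differences below 1e-8 as equal.
bool big_or_equal(double a, double b) {
    return (a > b) || (fabs(a - b) < 1e-8);
}

// Sorts the data once, then walks it a single time and records the value
// at which each cumulative fraction 0.1, 0.2, ..., 1.0 is first reached.
vector<double> calc_percentile(vector<double> data) {
    const vector<double> percent({0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0});
    vector<double> result;
    int data_count = data.size();
    int cur_count = 0;
    sort(data.begin(), data.end());
    for (auto i : data) {
        ++cur_count;
        double cur_percent = cur_count * 1.0 / data_count;
        // With fewer than ten points, one value can satisfy several thresholds.
        while (result.size() < percent.size() &&
               big_or_equal(cur_percent, percent[result.size()])) {
            result.push_back(i);
        }
    }
    return result;
}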
I am a beginner in Flux and have limited knowledge of functional programming. I tried to carry the C++ approach over directly, but Flux does not have a for loop. When I used the map function as a substitute for the loop (I'm not sure this is the right approach), the anonymous function inside map was unable to modify an external counter. I have read the InfluxDB documentation and even tried ChatGPT, but to no avail.
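To show what I mean without the counter: as far as I can tell, the same nearest-rank result can be computed by sorting once and reading each percentile from a calculated position, so that every output depends only on the sorted data and its own percentile. Below is a minimal sketch of that idea in C++. The name `percentiles_by_index` is made up for illustration, and I am assuming nearest-rank percentiles (rank = ceil(pct * n / 100)); it should agree with `calc_percentile` when there are at least ten points. I do not know whether Flux can express this directly.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Nearest-rank percentiles (10, 20, ..., 100) without a running counter.
// Hypothetical helper: sorts once, then maps each percentile to the value
// at its computed rank. Returns an empty vector when there is no data.
std::vector<double> percentiles_by_index(std::vector<double> data) {
    const std::vector<int> percents{10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
    std::vector<double> result;
    if (data.empty()) {
        return result;
    }
    std::sort(data.begin(), data.end());
    const std::size_t n = data.size();
    result.reserve(percents.size());
    // Each output is a pure function of (sorted data, pct); no shared state is mutated.
    std::transform(percents.begin(), percents.end(), std::back_inserter(result),
                   [&data, n](int pct) {
                       // Integer ceiling of pct * n / 100 gives the 1-based rank.
                       const std::size_t rank =
                           (static_cast<std::size_t>(pct) * n + 99) / 100;
                       return data[rank - 1];
                   });
    return result;
}
```

Integer arithmetic is used for the rank so that no floating-point tolerance such as the 1e-8 in `big_or_equal` is needed.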
Thank you in advance for your assistance!