Instant vector operations on Prometheus range vectors


Is there any way to perform an instant vector operation on a range vector?

For instance, if we have count_containers and we want to count the fraction of the time where we have at least one, we might try to do:

avg(avg_over_time(count_containers[1h] > 0))

But this can't be done because count_containers[1h] is a range vector. Is there any sort of map(...) operation that can be performed on instant vectors in range vectors? Or a way to defer building a range vector from count_containers?

markalex On

Is there any way to perform an instant vector operation on a range vector?

No. Prometheus doesn't allow anything like this.

But you can apply a range selector to something other than a plain vector selector by using subquery syntax. In your example it would be something like:

avg(avg_over_time((count_containers > 0)[1h:15s]))

Note that in this case you must include : in the range selector to indicate that it is a subquery.

In this example I used a resolution of 15s, meaning the inner expression is evaluated at 15-second steps across the 1h range. Adjust this to your needs, depending on the precision required, the scrape interval, and so on. The resolution can also be omitted while keeping the colon, as in [1h:]; in that case the global evaluation_interval is used.


we want to count the fraction of the time where we have at least one, we might try to do:

Your attempted query, even if it were supported, would not have produced what you wanted. Instead, it would calculate the average number of containers over the moments when that number was positive, because > 0 filters out non-matching samples rather than turning them into 0.

To calculate the fraction of time during which the number of containers was positive, use the following:

avg(avg_over_time( (count_containers > bool 0)[1h:15s] ))

The expression count_containers > bool 0 returns 1 if the number of containers is positive, and 0 otherwise.
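To make the difference concrete, here is a minimal Python sketch that mimics the two comparison modes on a handful of made-up sample values. The sample list and the function names are purely illustrative assumptions; this is not how Prometheus evaluates queries internally, only a model of the arithmetic.

```python
# Hypothetical values of count_containers at successive subquery steps.
samples = [0, 0, 3, 5, 0, 2]


def avg_with_filter(values):
    """Model of avg_over_time((count_containers > 0)[...]).

    A plain comparison acts as a filter: samples that fail it are dropped,
    and the surviving samples keep their original values.
    """
    kept = [v for v in values if v > 0]
    return sum(kept) / len(kept)


def avg_with_bool(values):
    """Model of avg_over_time((count_containers > bool 0)[...]).

    With the bool modifier every sample is kept and mapped to 1 or 0.
    """
    flags = [1 if v > 0 else 0 for v in values]
    return sum(flags) / len(flags)


print(avg_with_filter(samples))  # mean container count while positive
print(avg_with_bool(samples))    # fraction of steps with at least one container
```

With these values the filtered version averages only 3, 5 and 2, while the bool version reports that half of the steps had at least one container.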

valyala On

If you want to calculate the share of time during which count_containers was positive over the last hour, it is better to use the following PromQL query:

sum(sum_over_time((count_containers >bool 0)[1h:15s]))
  /
sum(count_over_time(count_containers[1h:15s]))

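This query uses the following PromQL features: subqueries, the bool modifier for comparison operators, and the sum_over_time and count_over_time functions.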

Note that the avg(avg_over_time(...)) query may return unexpected results, because an average of per-series averages does not generally equal the average over all samples.
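A small Python sketch of the average-of-averages effect, assuming two hypothetical series whose 0/1 values stand for the output of count_containers > bool 0. The series contents and names are invented for illustration; the second series has fewer samples, as might happen if it only appeared partway through the window.

```python
# Hypothetical 0/1 results of `count_containers > bool 0` for two series.
series_a = [1, 1, 1, 1]  # four samples in the window, all positive
series_b = [0, 1]        # only two samples in the window


def mean(values):
    return sum(values) / len(values)


# Model of avg(avg_over_time(...)): average each series, then average those.
avg_of_avgs = mean([mean(series_a), mean(series_b)])

# Model of sum(sum_over_time(...)) / sum(count_over_time(...)):
# pool all samples, so each sample carries equal weight.
pooled = (sum(series_a) + sum(series_b)) / (len(series_a) + len(series_b))

print(avg_of_avgs)
print(pooled)
```

The two results differ because the first form weights every series equally regardless of how many samples it has, while the second weights every sample equally.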

P.S. The query above assumes that the interval between raw samples of a single time series is 15 seconds; see the 15s after the colon in the square brackets. This interval is known as scrape_interval in the Prometheus ecosystem. If your data has a different scrape interval, adjust the value in square brackets accordingly. Otherwise the query results will be incorrect.

P.P.S. In MetricsQL, the PromQL-like query language I work on, the query can be simplified with the share_gt_over_time function:

share_gt_over_time(count_containers[1h], 0)