How to get log volume/storage size by label with Loki?


Background: I have various services, managed by different teams, sending logs to a single Loki instance. Logs carry a label service that corresponds to the name of the service. We are also running Prometheus and Grafana.

Goal: Each team has an operating budget, and I would like to split the cost of operating the Loki instance + storage cost based on how much log data was generated by each service, and deduct that from each team's budget.

What I have: I have used the LogQL query count by(service) (rate({environment="live"} [24h])) with the 'Instant' query type in Grafana to get the total number of log lines, grouped by the service label.
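Side note: I believe count by(service) (rate(...)) actually counts the number of streams per service rather than log lines, so a per-service line count is probably more directly expressed with count_over_time. A minimal sketch with the same labels:

sum by(service) (count_over_time({environment="live"} [24h]))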

Problem: Some services' logs are tiny whilst others are large. Attributing costs based on the number of log lines is inaccurate.

Question: Is there a LogQL query to get the total size of logs in bytes, grouped by the service label? If not, any suggestions for other ways to attribute costs of the logs to each team?


There are 2 answers below

dayuloli (accepted answer)

@markalex's answer put me on the right track, and I found this post, which also helped. Turns out bytes_over_time is what I was looking for:

sum by(service) (bytes_over_time({environment="live"} [24h]))

bytes_over_time counts the amount of bytes used by each log stream for a given range.
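Building on this, the byte totals can be turned into per-team shares of the bill by dividing each service's bytes by the overall total. A sketch, assuming LogQL's PromQL-style vector matching (the actual cost multiplication would happen outside Loki, e.g. in Grafana or a spreadsheet):

100 *
  sum by(service) (bytes_over_time({environment="live"} [24h]))
/ ignoring(service) group_left
  sum(bytes_over_time({environment="live"} [24h]))

This returns each service's percentage of the total log volume over the last 24 hours.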

markalex

To get the total length of all lines stored in Loki over some period of time, you can use this query:

sum by (service) (
    sum_over_time(
        {environment="live"}
        | label_format length=`{{ __line__  | len }}`
        | unwrap length
    [24h])
)

Here:

  • label_format length=`{{ __line__ | len }}` adds a label length whose value is the length of the whole log line.
  • unwrap length unwraps this label into a value for aggregation with sum_over_time.

Please note that while this query provides the result you described, it's only an estimate of how resources are divided between services. It doesn't account for the number of series per service, and I believe each series has its own storage overhead.
*I don't know how (or even if) that overhead can be accounted for from a query or by any other means.
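The series count itself can at least be estimated: in a metric query each stream surfaces as one series, so counting series per service approximates the number of streams active in the window. A sketch under that assumption:

count by (service) (bytes_over_time({environment="live"} [24h]))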