I'm working with Loki/Grafana and trying to modify a query that currently identifies systems considered "offline." A system is deemed offline if it has sent a log within the last 7 days but not within the last 5 minutes. I want to change this logic so that the baseline is a fixed list rather than the last 7 days of logs: the query should display the systems that are on the list but haven't sent a log in the last 5 minutes.
The fixed list of systems is provided as a JSON file through promtail to Loki/Grafana like this:
{
"allSystems": ["mrs2-14", "mrs2-11", "mrs2-103"],
"totalCount": 3
}
And I can query this JSON data in Grafana using the following query:
{job="mrs_system_info"} |= ``
What I need is to cross-reference this static list of systems with the logs to determine which ones haven't reported in the last 5 minutes. The current query I'm using to find "offline" systems is:
count by(system) (count_over_time({job="mrs_error_list"} |~ "" [7d]))
unless
count by(system) (count_over_time({job="mrs_error_list"} |~ "" [5m]))
How can I alter this query to check against the allSystems array from my JSON data instead of the last 7 days of logs?
I expect the final output to show a list of systems that are in the allSystems array but have not sent logs in the past 5 minutes.
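To make the expected result concrete, here is the logic I'm after written as a small Python sketch. It is only an illustration of the set difference, not a LogQL query; the function name `offline_systems` and the sample list of recently active systems are made up for the example.

```python
import json


def offline_systems(system_info_line, recently_active):
    """Return systems listed in allSystems that are not in recently_active.

    system_info_line: the JSON log line from the mrs_system_info job.
    recently_active: system names seen in the last 5 minutes (hypothetical input).
    """
    all_systems = json.loads(system_info_line)["allSystems"]
    active = set(recently_active)
    # Keep the order of the fixed list; drop every system that reported recently.
    return [name for name in all_systems if name not in active]


info = '{"allSystems": ["mrs2-14", "mrs2-11", "mrs2-103"], "totalCount": 3}'
print(offline_systems(info, ["mrs2-11"]))  # ['mrs2-14', 'mrs2-103']
```

In other words, the fixed list should take the place of the left-hand side of the `unless` in my current query.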
Any suggestions on how to correctly implement this logic with Loki/Grafana would be greatly appreciated.
I used the solution from markalex, which allowed me to keep using the same query.