In spunk how to retrieve only first or last event of each row present in inputlookup file?

460 Views Asked by At

In splunk I'm running below query:

Considering I've following data present in 20230922_id.csv

id_
123
234
345
456
index=1234 application_name="app_name_xyz" app_region=apac "Total time to process request" 
| search [| inputlookup 20230922_id.csv 
  | rename id_ as search 
  | format ]

This above query turned into below one (which I'm expecting based on splunk guide)

index=1234 application_name="app_name_xyz" app_region=apac "Total time to process request" 
| search 123 or 234 or 345 or 456

Currently for each of the above row from CSV file there are 3 events. So the total event count is coming as 12 which is correct.

Is there any way that each of the data from row (which is 123,234,345,456) should just return 1 event? So that total event becomes only 4.

I've tried some below solutions but it's not working:

index=1234 application_name="app_name_xyz" app_region=apac "Total time to process request" 
| search [| inputlookup 20230922_id.csv 
  | rename id_ as search 
  | head 1 
  | format ]

Below one is obvious that it will return just 1 event as I'm doing head on the complete query.

index=1234 application_name="app_name_xyz" app_region=apac "Total time to process request" 
| search [| inputlookup 20230922_id.csv 
  | rename id_ as search  
  | format ] 
|head 1

Could you please let me know if there is any syntax that could solve my above problem?

If there is not a way to get first/last from each matching row, is there any way to get info as below ?

id_ count
123 2
234 3
345 4
456 2
2

There are 2 best solutions below

1
Sheheryar Obaid On

Have you tried dedup?

| dedup id_

This will give you only the latest entry for each id_. Resulting in one result per id_, which if i understood your question correctly is what you are after.

3
warren On

If I understand your question correctly, you have a lookup table with a field named id_ that you want to match in your event data (I think you have a field in said data that matches, correct?)

And then you want to only get the first or last match in your data where the field's value matches what is in the lookup table.

If that is a correct understanding of your use case, this will do it:

index=1234 application_name="app_name_xyz" app_region=apac "Total time to process request" 
    [| inputlookup 20230922_id.csv ]
| stats max(_time) as _time by id_