Logstash Grok custom URIPATHPARAM


How can I split the value matched by URIPATHPARAM in a grok filter?

Here are my grok patterns:

  grok {
     match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int} (?:%{IP:backend_ip}:%{NUMBER:backend_port:int}|-) %{NUMBER:request_processing_time:float} %{NUMBER:backend_processing_time:float} %{NUMBER:response_processing_time:float} (?:%{NUMBER:elb_status_code:int}|-) (?:%{NUMBER:backend_status_code:int}|-) %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int} \"(?:%{WORD:verb}|-) (?:%{GREEDYDATA:request}|-) (?:HTTP/%{NUMBER:httpversion}|-( )?)\" \"%{DATA:userAgent}\"( %{NOTSPACE:ssl_cipher} %{NOTSPACE:ssl_protocol})?"]
  }

  grok {
     match => [ "request", "%{URIPROTO:http_protocol}://(?:%{USER:user}(?::[^@]*)?@)?(?:%{URIHOST:refhost})?(?:%{URIPATHPARAM:uri_param})?" ]
  }

Sample values that end up in uri_param:

/a1/post/abcxyz/data/adfs/
/partner/uc/article/adafdf?adfaf

I want to capture the first three path segments of the above URLs in a separate field, e.g.:

/a1/post/abcxyz

/partner/uc/article


There are 2 best solutions below

1
mohdasha On BEST ANSWER
  grok {
     match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:loadbalancer} %{IP:client_ip}:%{NUMBER:client_port:int} (?:%{IP:backend_ip}:%{NUMBER:backend_port:int}|-) %{NUMBER:request_processing_time:float} %{NUMBER:backend_processing_time:float} %{NUMBER:response_processing_time:float} (?:%{NUMBER:elb_status_code:int}|-) (?:%{NUMBER:backend_status_code:int}|-) %{NUMBER:received_bytes:int} %{NUMBER:sent_bytes:int} \"(?:%{WORD:verb}|-) (?:%{GREEDYDATA:request}|-) (?:HTTP/%{NUMBER:httpversion}|-( )?)\" \"%{DATA:userAgent}\"( %{NOTSPACE:ssl_cipher} %{NOTSPACE:ssl_protocol})?"]
  }

  grok {
     match => [ "request", "%{URIPROTO:http_protocol}://(?:%{USER:user}(?::[^@]*)?@)?(?:%{URIHOST:refhost})?(?:%{URIPATHPARAM:uri_param})?" ]
  }
  if [uri_param] {
    mutate {
      split => { "uri_param" => "/" }
      add_field => {
        "uri_param_1" => "%{[uri_param][1]}"
        "uri_param_2" => "%{[uri_param][2]}"
        "uri_param_3" => "%{[uri_param][3]}"
      }
    }
  }
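
A note on why the indices start at 1 rather than 0: the path begins with "/", so splitting on "/" leaves an empty string in the first slot. The sketch below only illustrates that indexing. Python is used as a stand-in (Logstash's mutate filter is implemented in Ruby), and the helper name `first_segments` is made up for this example.

```python
def first_segments(path, count=3):
    """Return the first `count` pieces of `path` after splitting on "/"."""
    parts = path.split("/")
    # parts[0] is "" because the path starts with "/",
    # so the real segments begin at index 1.
    return parts[1:count + 1]


samples = ["/a1/post/abcxyz/data/adfs/", "/partner/uc/article/adafdf?adfaf"]
for sample in samples:
    print(sample, "->", first_segments(sample))
```

For both sample values the three pieces at indices 1 to 3 are the ones the question asks for. A path with fewer than three segments would leave some of the `%{[uri_param][N]}` references without a value, so it may be worth guarding against that case.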

Alternatively, you could capture these three segments directly in grok, like this:

  grok {
     match => [ "request", "%{URIPROTO:http_protocol}://(?:%{USER:user}(?::[^@]*)?@)?(?:%{URIHOST:refhost})?(?:/%{WORD:uri_param_1}/%{WORD:uri_param_2}/%{WORD:uri_param_3}/%{GREEDYDATA:other_params})?" ]
  }

As you asked, to join them again you can use a mutate filter:

 mutate {
   add_field => { "uri_param" => "/%{[uri_param_1]}/%{[uri_param_2]}/%{[uri_param_3]}/%{[other_params]}"}
 }

I hope that works. Test it out and let me know whether it worked for you.

0
JustAnotherProgrammer On

Use the grok pattern below on the uri_param field:

%{THREESTRINGS:newField}

where the custom pattern for THREESTRINGS is:

THREESTRINGS \/\b\w+\b\/\b\w+\b\/\b\w+\b
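
To sanity-check the expression before wiring it into Logstash, it can be run against the two sample values from the question. This is only a rough check in Python: grok uses the Oniguruma regex engine rather than Python's `re`, and the helper name `first_three` is hypothetical, but `\w`, `\b` and the escaped slash behave the same way for these inputs.

```python
import re

# Same expression as the THREESTRINGS custom pattern above.
THREESTRINGS = re.compile(r"\/\b\w+\b\/\b\w+\b\/\b\w+\b")


def first_three(uri_param):
    """Return the leading three-segment prefix, or None if nothing matches."""
    # search() is used because grok patterns are not anchored by default either.
    match = THREESTRINGS.search(uri_param)
    return match.group(0) if match else None


print(first_three("/a1/post/abcxyz/data/adfs/"))        # /a1/post/abcxyz
print(first_three("/partner/uc/article/adafdf?adfaf"))  # /partner/uc/article
```

Note that `\w+` only covers letters, digits and underscores, so a path whose first three segments contain characters such as "-" or "." would not match this pattern as written.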