Creating a bash script to match multiline patterns in log files


I'm trying to automate some time-consuming log-checking tasks by building a system that I can reuse for other purposes.

I have a logfile, for example:

...multiline ACTION Text where all is good...
ERR-101 Something is wrong
ERR-201 Something is wrong with QASDASDASD
INFO-524 Something was wrong 
WARN-484 Check line 23
...multiline ACTION Text where all is good...
ERR-101 Something is wrong
ERR-201 Something is wrong with PPOYOYOY
INFO-524 Something was wrong
WARN-484 Check line 23
INFO-524 This is it

I'm creating a check-error.template file:

# This is the template file
ERR-101 Something is wrong
ERR-201 Something is wrong with <TEXT_VAR>
INFO-524 Something was wrong
WARN-484 Check line <NUMBER_VAR>
<?>INFO-524 This is it</?>

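Lines starting with # are comments. Text surrounded by <?> and </?> is optional (e.g. a line that exists only in the last block, so blocks both with and without it should match).
<TEXT_VAR> and <NUMBER_VAR> are placeholders that will be matched by a regexp for text and for a number, respectively.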

If an error block matches the template, I know it can be ignored; I want to log it separately and remove it from the logfile.
I'd rather not depend on anything more advanced (perl, other regexp helpers), because making sure it exists in every environment would be a problem, so I'm currently trying to do this with grep -P.
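
Since portability is the concern here, it may be worth probing for grep -P itself, because some grep builds (for example BSD or BusyBox variants) may not support it. A minimal sketch, where the variable name grep_pcre is only illustrative:

```shell
#!/usr/bin/env bash
# Probe whether this grep accepts -P (PCRE) together with -z.
# The probe feeds one NUL-terminated record containing a newline.
if printf 'a\nb\0' | grep -Pzq 'a\nb' 2>/dev/null; then
    grep_pcre=yes
else
    grep_pcre=no   # fall back to another approach on this machine
fi
echo "grep -Pz usable: $grep_pcre"
```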

The following function reads a template file and converts it to a regexp pattern:

    function template2variable {
        local file=$1
        local var_name=$2
        local template pattern
        template=$(sed '/^#/d' "$file")            # drop comment lines
        pattern="${template//\\/\\\\}"             # escape \ as \\
        pattern="${pattern//\"/\\\"}"              # escape "
        # Placeholder names must match the ones used in the template file
        pattern="${pattern//<TEXT_VAR>/([[:alnum:]_]+)}"
        pattern="${pattern//<NUMBER_VAR>/([[:digit:]]+)}"
        pattern="${pattern//$'\n'/\\n}"            # newline -> literal \n for PCRE
        # [?] keeps ? literal; a bare ? is a single-character wildcard here
        pattern="${pattern//<[?]>/(}"
        pattern="${pattern//<\/[?]>/)?}"
        printf -v "$var_name" '%s' "$pattern"
    }

    # Pass the NAME of the variable to fill, not its (empty) value
    template2variable "check-error.template" error_template

Matching template with:

grep -Pzo "${error_template}" "$logfile"

Doing so, I get back all the template lines I wished.

However, when I try to work with the grep output, the usual options do not behave as I expected:
using -n, every match is reported on line 1
using -c, I get a count of 1
using -v, the output is empty

It seems like the match has returned as one giant result instead of several I can iterate over.
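
A minimal sketch of what seems to be happening, assuming GNU grep with PCRE support. With -z the whole file is a single NUL-terminated record, which would explain the counts above, while -o still writes each match separately, terminated by a NUL byte. The sample data and the variable names (record_count, matches) are illustrative only:

```shell
#!/usr/bin/env bash
# Hypothetical sample data, modelled on the minified example.
logfile=$(mktemp)
printf '%s\n' 'Line 1' 'Dave is right' 'Sharon is great' \
              'Line 2' 'Dave is right' 'Aharon is nice' > "$logfile"

# Pattern a template "Line <NUMBER_VAR>" + "Dave is right" would produce.
pattern='Line ([[:digit:]]+)\nDave is right'

# -c counts matching records, and with -z the file is one record.
record_count=$(grep -Pzc "$pattern" "$logfile")
echo "records matched: $record_count"

# Each -o match ends with a NUL byte, so read with a NUL delimiter.
matches=()
while IFS= read -r -d '' m; do
    matches+=("$m")
done < <(grep -Pzo "$pattern" "$logfile")

for i in "${!matches[@]}"; do
    printf 'match[%d]\n%s\n' "$i" "${matches[i]}"
done

rm -f "$logfile"
```

Reading with `read -d ''` keeps the newlines inside each match intact, so every array element holds one complete multiline block.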

What am I doing wrong?
Suggestions for improvement?

Summary:
I want to define a "template": a paragraph of text to be matched inside another text file (the logfile). A match occurs only if the whole paragraph is found. Some lines of the template contain a placeholder for text or a number (e.g. "Line <NUMBER_VAR>" matches "Line 1", "Line 2", etc.). To do this with bash/grep, I convert the template to a regexp.

How can I iterate over the matches, and how can I create a new logfile without them?

Thank you

Adding a minified example:

Logfile:

    Line 1
    Dave is right
    Sharon is great
    Line 2
    Dave is right
    Aharon is nice

If the template file is:

    # Template file
    Line <NUMBER_VAR>
    Dave is right

The script should read the template and search for it inside the logfile, so that I can iterate over the matches and get:

    match[0]
        Line 1
        Dave is right
    match[1]
        Line 2
        Dave is right
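
One possible way to get both the list of matches and a log without them, sketched in pure bash with the `[[ =~ ]]` operator instead of grep. This is not the grep -P approach from the question: bash uses POSIX extended regexps, so the newline has to be a real newline in the pattern rather than \n. The variable names (log, regex, matches, cleaned) are illustrative, and the log is inlined here only to keep the sketch self-contained:

```shell
#!/usr/bin/env bash
# Sample log from the minified example (normally read from the logfile).
log=$'Line 1\nDave is right\nSharon is great\nLine 2\nDave is right\nAharon is nice\n'

# ERE form of the template; newlines are literal characters here.
# Note: the pattern is not anchored to the start of a line.
regex=$'Line [[:digit:]]+\nDave is right\n'

matches=()
cleaned=$log
# Assumes the regex never matches an empty string (the loop would not end).
while [[ $cleaned =~ $regex ]]; do
    m=${BASH_REMATCH[0]}        # text of the leftmost match
    matches+=("$m")
    cleaned=${cleaned/"$m"/}    # quoted, so the text is removed literally
done

for i in "${!matches[@]}"; do
    printf 'match[%d]\n%s' "$i" "${matches[i]}"
done
printf '%s' "$cleaned"          # the log without the matched blocks
```

The matched blocks could be appended to a side log and `$cleaned` written to a new logfile. This holds the whole log in memory, so it is probably only suitable for moderately sized files.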

Answer by Eric Marceau:

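The script does not need extra punctuation such as semicolons; the behaviour you see comes from the -z option. With -z, grep uses the NUL byte rather than the newline as the record separator, so a logfile without NUL bytes is one single record. That is why -n reports every match on line 1, -c counts 1, and -v prints nothing: the only record there is has matched.

With -o, each match is still written separately and terminated by a NUL byte, so you can iterate over the output by reading it with a NUL delimiter, for example with read -d '' or, in bash 4.4 and later, mapfile -d ''.

Also, the line

local pattern="${template//\\/\\\\}"

looks odd but appears to be valid: it doubles every backslash in the template so that the regexp matches it literally.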