find string in file from last saved position - PowerShell


I wrote a simple PowerShell script to find a specific string in a given text file, using the Select-String cmdlet:

$SearchResult = $null

$FilePath = "c:\temp\app\events.log"

$FindString = "error"

$SearchResult = Select-String $FilePath -Pattern $FindString | Select -Last 1



if ($SearchResult -ne $null) {

    Write-Host "The file: ""$FilePath"", contains the string: '$FindString'."

} 
else {

    Write-Host "The file: ""$FilePath"", does not contain the string: '$FindString'."
}

While the above snippet works, it searches the whole file on every run, so it keeps reporting a match found earlier even after new lines have been written to the target file.

I'm trying to find a way to save the last position/line the script read, and to start from that position/line on the next run of the script.

While reading the Select-String help page, I found the -Skip switch, but it takes a number of lines to skip rather than a last position/line to start from.

So it seems that I need to create a text file that holds the last position/line of each run using Out-File, and read it with Get-Content at the start of the next run, but I'm not sure how to find the last position/line in each run.

In Select-String, I can use:

$LineNumber = $SearchResult | Select -Expand LineNumber

to retrieve the line number of the match, but that works only if the string exists in the text file.

How can I find the last position/line in each iteration, and start from it, in the next iteration?


There are 2 best solutions below

5
J P On BEST ANSWER

You could use something like this. It uses your Out-File idea: it compares the line number saved by the previous run with the line number of the most recent $SearchResult. If the new number is greater, it reports the match and saves the number; otherwise it reports that there is no new occurrence.

$SearchResult = $null

$FilePath = "c:\temp\app\events.log"
$LastLine = "c:\temp\app\LineNumber.txt"

If([System.IO.File]::Exists($LastLine)){

    # The LineNumber.txt file exists.
    $StartLine = Get-Content -Path $LastLine
}
Else {

    # The file does not exist. Assign 0 to the $StartLine variable.
    $StartLine = "0"
}

$FindString = "error"

$SearchResult = Select-String $FilePath -Pattern $FindString | Select -Last 1

if ($SearchResult -ne $null) {

    $LineNumber = $SearchResult.LineNumber

    If($LineNumber -gt $StartLine) {

        # The line number is greater than the last recorded occurrence.
        Write-Host "The file: ""$FilePath"", contains the string: '$FindString'."

        # Write the new line number to the state file.
        $LineNumber | Out-File -FilePath $LastLine
    }
    Else {
        
        # The line number is not greater.
        Write-Host "There was not a new occurrence of the pattern: $FindString."
    }

} 
else {

    Write-Host "The file: ""$FilePath"", does not contain the string: '$FindString'."
}
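
As a rough illustration of how the script above behaves across runs, here is a minimal sketch that wraps the same logic in a function and exercises it against temporary files. The function name Test-NewOccurrence, its parameters, and the demo file names are hypothetical and not part of the original answer. Like the script, it records the line number of the last match only, not the last line read.

```powershell
# Hypothetical helper (not part of the answer): returns $true only when the
# last match sits on a later line than the one recorded in the state file.
function Test-NewOccurrence {
    param(
        [Parameter(Mandatory)] [string] $FilePath,
        [Parameter(Mandatory)] [string] $StatePath,
        [Parameter(Mandatory)] [string] $FindString
    )

    # A missing state file means nothing has been reported yet.
    [int] $startLine = 0
    if (Test-Path -LiteralPath $StatePath) {
        $startLine = [int] (Get-Content -LiteralPath $StatePath -TotalCount 1)
    }

    $lastMatch = Select-String -LiteralPath $FilePath -Pattern $FindString |
        Select-Object -Last 1

    if ($null -ne $lastMatch -and $lastMatch.LineNumber -gt $startLine) {
        # Remember the line of this match for the next run.
        Set-Content -LiteralPath $StatePath -Value $lastMatch.LineNumber
        return $true
    }
    return $false
}

# Demo against temporary files (assumed names, created just for this sketch).
$log   = Join-Path ([System.IO.Path]::GetTempPath()) 'events-demo.log'
$state = Join-Path ([System.IO.Path]::GetTempPath()) 'events-demo.state'
Remove-Item -LiteralPath $log, $state -ErrorAction Ignore
Set-Content -LiteralPath $log -Value 'info: started', 'error: disk full'

$firstRun  = Test-NewOccurrence -FilePath $log -StatePath $state -FindString 'error'
$secondRun = Test-NewOccurrence -FilePath $log -StatePath $state -FindString 'error'
Add-Content -LiteralPath $log -Value 'error: disk still full'
$thirdRun  = Test-NewOccurrence -FilePath $log -StatePath $state -FindString 'error'

Remove-Item -LiteralPath $log, $state
```

In this demo the second call should find nothing new, because the file has not changed since the first call; the third call should report the appended line.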
3
mklement0 On
  • Indeed, Select-String offers no way to specify a starting position within an input file (note that it is Select-Object, not Select-String, that has a -Skip parameter).

  • Therefore, you'll need to read the input file line by line first, and skip the desired number of lines with Select-Object -Skip, before piping to Select-String.

    • Note that while Get-Content does read files line by line, in a streaming fashion, it is also painfully slow,[1] which is why the System.IO.File.ReadLines .NET API is used below instead.
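
To make the pipeline described in these points concrete, here is a minimal sketch run against a temporary sample file. The file name, its contents, and the variable names are assumptions made for illustration. It also shows one detail worth knowing: when Select-String receives strings from the pipeline, the LineNumber it reports counts the strings it received, so the number of skipped lines has to be added back to get the line number within the file.

```powershell
# Hypothetical sample data in a temporary file; not the asker's real log.
$demoPath = Join-Path ([System.IO.Path]::GetTempPath()) 'skip-demo.log'
Set-Content -LiteralPath $demoPath -Value 'error: old', 'info: started', 'info: still running', 'error: new'

# Pretend a previous run already searched the first 2 lines.
$linesAlreadySearched = 2

# Stream the lines, drop the ones already searched, then search the rest.
$newMatches =
  [System.IO.File]::ReadLines($demoPath) |
  Select-Object -Skip $linesAlreadySearched |
  Select-String -Pattern 'error'

# LineNumber is relative to the strings Select-String received,
# so add the offset to recover the line number within the file.
$absoluteLineNumbers = @(
  $newMatches | ForEach-Object { $_.LineNumber + $linesAlreadySearched }
)

Remove-Item -LiteralPath $demoPath
```

Only the match on the fourth line is returned here; the older match on the first line falls inside the skipped range.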

Note:

  • The following assumes that you want to find all matches for your search pattern, either (a) when first run, in the initial content, or, (b), in subsequent runs, in the content that was added since the previous run.
    Therefore, Select -Last 1 is not used below.

# IMPORTANT: Because .NET methods are used below, be sure to use a *full path*.
$FilePath = 'c:\temp\app\events.log'

# Define a path to a file in which the state is recorded after each run.
$stateFilePath = "$HOME\.numberOfLastLineSearched.txt"

# Read the last recorded state, if present.
[int] $numberOfLastLineSearched = 
  (Get-Content -ErrorAction Ignore -LiteralPath $stateFilePath)

$findString = "error"

# Read the file line by line, skipping all previously read lines,
# and search only among the non-skipped ones.
$searchResults = 
  [System.IO.File]::ReadLines($FilePath) |
  Select-Object -Skip $numberOfLastLineSearched |
  Select-String -Pattern $findString

# The file's current content has been searched in full, 
# whether or not a match was found, so the line after the *last* 
# one is the one to start searching from next time.
# Therefore, save the number of the last line (which equals the count
# of lines in the file) in the state file.
# Note: This solution for counting the number of lines is much faster
#       than (Get-Content $FilePath).Count
[System.Linq.Enumerable]::Count([System.IO.File]::ReadLines($FilePath)) > $stateFilePath

# Determine if at least one match was found.
$patternWasFound = [bool] $searchResults

# Print a result status message.
Write-Host @"
The file: "$FilePath", 
searching from line $($numberOfLastLineSearched+1)
$(('DOES NOT', 'DOES')[$patternWasFound]) contain the string: '$findString'$(
  if ($patternWasFound) { ",`nnamely {0} time(s)" -f $searchResults.Count }
).
"@

[1] The reason is that it decorates each line emitted with metadata, which hurts performance - see the bottom section of this answer for details.
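
The script above reads the log file twice: once to search it and once to count its lines. If lines are appended between those two reads, they could be counted without ever having been searched. The following is a minimal single-pass sketch that counts lines while it searches, so the recorded count matches exactly what was read. It is a variation, not the answer's own method: it swaps Select-String for the -match operator and emits plain custom objects, and the sample file and variable names are assumptions.

```powershell
# Hypothetical sample log in a temporary file.
$demoPath = Join-Path ([System.IO.Path]::GetTempPath()) 'single-pass-demo.log'
Set-Content -LiteralPath $demoPath -Value 'error: old', 'info: started', 'error: new'

$linesAlreadySearched = 1   # stands in for the value read from the state file
$findString = 'error'

$lineNumber = 0
$searchResults = foreach ($line in [System.IO.File]::ReadLines($demoPath)) {
    $lineNumber++
    # Lines at or before the saved count were handled by an earlier run.
    if ($lineNumber -gt $linesAlreadySearched -and $line -match $findString) {
        [pscustomobject] @{ LineNumber = $lineNumber; Line = $line }
    }
}

# $lineNumber now equals the number of lines read in this pass,
# which is the value to store in the state file for the next run.
$linesToRecord = $lineNumber

Remove-Item -LiteralPath $demoPath
```

Because the results are custom objects rather than MatchInfo instances, any later code that relies on MatchInfo-specific properties would need adjusting.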