Get the newly added data from a txt file every $X minutes

I have a command that continually appends IPs (I don't know how many) to a text file called ips.txt:

shodan stream --alert=all --datadir=. --compresslevel=0 >> ips.txt

Every hour, I want to get only the data newly added to ips.txt and run some bash operations on it, since I will leave this command running in the background.

How can I automatically get only the data added to this file since the previous check?

Answer by Ed Morton:

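$ cat tst.sh
#!/usr/bin/env bash

ipsFile="$1"

# On exit or interrupt, kill every process in this process group,
# including the background shodan() stand-in below.
trap 'trap - SIGTERM && kill 0' SIGINT SIGTERM EXIT

# Stand-in for the real shodan command: writes one line per second.
shodan() { ( while :; do date; sleep 1; done; ) & }

shodan > "$ipsFile"

endWc=0
while :; do
    sleep 3
    begWc=$(( endWc + 1 ))            # first line not yet processed
    endWc=$(wc -l < "$ipsFile")       # number of complete lines now
    if (( endWc == begWc - 1 )); then
        continue                      # nothing new since the last pass
    fi
    if (( endWc > 0 )); then
        if (( endWc < begWc )); then
            begWc=1                   # file shrank: start over from line 1
        fi
        newWc=$(( endWc - begWc + 1 ))
        echo "----- $begWc -> $endWc = $newWc lines"
        tail -n +"$begWc" "$ipsFile" |
        head -n "$newWc"              # plain count; "+N" is not portable for head
    fi
done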

$ ./tst.sh ips.txt
----- 1 -> 3 = 3 lines
Wed Nov  8 08:20:04 CST 2023
Wed Nov  8 08:20:05 CST 2023
Wed Nov  8 08:20:06 CST 2023
----- 4 -> 6 = 3 lines
Wed Nov  8 08:20:08 CST 2023
Wed Nov  8 08:20:09 CST 2023
Wed Nov  8 08:20:10 CST 2023
----- 7 -> 8 = 2 lines
Wed Nov  8 08:20:12 CST 2023
Wed Nov  8 08:20:13 CST 2023
----- 9 -> 11 = 3 lines
Wed Nov  8 08:20:14 CST 2023
Wed Nov  8 08:20:15 CST 2023
Wed Nov  8 08:20:17 CST 2023
----- 12 -> 14 = 3 lines
Wed Nov  8 08:20:18 CST 2023
Wed Nov  8 08:20:19 CST 2023
Wed Nov  8 08:20:21 CST 2023
Terminated

$ cat ips.txt
Wed Nov  8 08:20:04 CST 2023
Wed Nov  8 08:20:05 CST 2023
Wed Nov  8 08:20:06 CST 2023
Wed Nov  8 08:20:08 CST 2023
Wed Nov  8 08:20:09 CST 2023
Wed Nov  8 08:20:10 CST 2023
Wed Nov  8 08:20:12 CST 2023
Wed Nov  8 08:20:13 CST 2023
Wed Nov  8 08:20:14 CST 2023
Wed Nov  8 08:20:15 CST 2023
Wed Nov  8 08:20:17 CST 2023
Wed Nov  8 08:20:18 CST 2023
Wed Nov  8 08:20:19 CST 2023
Wed Nov  8 08:20:21 CST 2023
Wed Nov  8 08:20:22 CST 2023
Wed Nov  8 08:20:23 CST 2023

The shodan() function mimics what your shodan command does, and the trap kills the shodan() child process on exit, so that we can test the main part of the code: the final loop.

I'm using wc specifically because it counts the newlines in its input, so if shodan has written a partial line at the end of ips.txt, that line won't get picked up until a later loop iteration, once it's complete. That's also why I'm piping the tail output to a subsequent head: to ensure we get just those complete lines, even if the file grows between the wc and the tail. I'm also using tail | head rather than, say, a single awk command. Note that with -n +N, tail still has to read from the start of the file to find line N, so any speed advantage over awk would come from tail and head doing less work per line, not from reading from the end of the file.
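
To see why counting newlines with wc -l leaves out a trailing partial line, here is a minimal sketch. The print_range helper and the demo file contents are made up for illustration and are not part of tst.sh.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of tst.sh): print lines beg..end of a file.
print_range() {
    local file=$1 beg=$2 end=$3
    tail -n +"$beg" "$file" | head -n "$(( end - beg + 1 ))"
}

demo=$(mktemp)
# Two complete lines plus an unfinished third line (no trailing newline).
printf 'line1\nline2\npartial' > "$demo"

# wc -l counts newline characters, so the unfinished line is not counted.
complete=$(wc -l < "$demo")

# tail alone would include "partial"; head caps the output at the
# number of complete lines.
print_range "$demo" 1 "$complete"

rm -f "$demo"
```

Running it prints line1 and line2 but not partial, because wc -l reports 2 for this file.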

The if (( endWc < begWc )) block starts printing from the first line again if the file shrank; that may or may not be what you want in that situation. You'll also want to add a bit of code to handle the case where the file is empty (if (( endWc > 0 )) fails) on any iteration, and think about whether there are any other rainy-day scenarios to cover.
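
One way to make those rainy-day cases explicit is to classify how the line count changed before deciding what to do. This is only a sketch: classify_change and handle_change are hypothetical names, and the echo messages are placeholders for whatever you actually want to happen in each case.

```shell
#!/usr/bin/env bash

# Hypothetical helper: classify the change between the line count seen on
# the previous pass (prev) and the line count seen now (cur).
classify_change() {
    local prev=$1 cur=$2
    if   (( cur == 0 ));    then echo empty
    elif (( cur < prev ));  then echo shrunk
    elif (( cur == prev )); then echo unchanged
    else                         echo grown
    fi
}

# Example dispatch; each action is a placeholder to replace with your own.
handle_change() {
    case $(classify_change "$1" "$2") in
        empty)     echo "file is empty, nothing to do" ;;
        shrunk)    echo "file shrank, restarting from line 1" ;;
        unchanged) echo "no new lines" ;;
        grown)     echo "process lines $(( $1 + 1 )) to $2" ;;
    esac
}
```

For example, handle_change 3 7 reports that lines 4 to 7 are new, while handle_change 5 2 reports that the file shrank.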

I interrupted ./tst.sh to stop it. The cat ips.txt output has a couple of extra lines because shodan() writes once per second, so the main "every 3 seconds" loop (obviously change that to sleep 3600, or whatever else you like, to sleep for an hour) hadn't processed all the lines at that point. The important thing is that no lines are missing from the output prior to that.
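
If you would rather not keep a loop sleeping for an hour, the same idea can be sketched with the last processed line count stored in a file, so that each call picks up where the previous one stopped. This is an assumption-laden variant, not what tst.sh does: emit_new_lines and the state file are hypothetical, and restarting from line 1 after the file shrinks is just one possible policy.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the complete lines appended to "file" since
# the previous call, remembering the line count in "state" between calls.
emit_new_lines() {
    local file=$1 state=$2 prev=0 cur

    # Load the count saved by the previous call, if any.
    [[ -s $state ]] && prev=$(< "$state")

    # The arithmetic expansion strips the padding some wc versions add.
    cur=$(( $(wc -l < "$file") ))

    # Assumed policy: if the file shrank, start again from line 1.
    (( cur < prev )) && prev=0

    if (( cur > prev )); then
        tail -n +"$(( prev + 1 ))" "$file" | head -n "$(( cur - prev ))"
    fi

    # Save the new count for the next call.
    echo "$cur" > "$state"
}

# Usage (paths are placeholders):
#   emit_new_lines ips.txt ips.txt.count | your_processing_command
```

A script that calls this function once could then be run every hour by a scheduler such as cron instead of sleeping in a loop.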