I'm doing some testing with flock and pkill for a test.sh script that I'm calling from cron and I ran into something I don't understand.
The test.sh is scheduled as a * * * * * job in cron. It's a very simple script that, for testing purposes, writes a timestamp to a file and then sleeps for 5 minutes. This is to confirm that flock is working and preventing multiple instances of the same script from running at once.
This part is working well as I only see one timestamp showing up per 5 minutes despite the test.sh being scheduled to run every minute.
Now, as an extra safety measure, I want to kill test.sh (because the script I actually want to use sometimes appears to hang while syncing some files to S3 using the AWS CLI).
So I figured pkill would be the easiest, as it doesn't require modifying anything in my existing script.
If I run pkill -9 -f test.sh it says the process is killed. Running ps aux | grep test.sh, I indeed don't see any test.sh processes anymore.
However, as cron is supposed to run test.sh every minute, I expect that after killing the process it would start again in less than a minute.
However it appears that the script doesn't actually restart until the sleep period is over.
So the script initially runs at e.g. 12:00, and sleep will last until 12:05. If I kill the script at 12:02, I expect it to run again at 12:03, but it doesn't actually run again until 12:05, which is in line with the sleep period.
Why is this happening? Also, if pkill is not recommended, is there any other way to kill my processes after a certain amount of time? Preferably without having to edit the original script.
See the following example:
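A minimal sketch of such a setup, assuming hypothetical temporary paths and a 3-second sleep standing in for the 5 minutes. The embedded script is laid out so that its own lines 1, 2 and 7 are the ones described below; the surrounding commands only drive it the way cron and pkill would, and assume flock (util-linux) and pkill (procps) are installed.

```shell
#!/bin/sh
# Hypothetical demo: all paths and durations here are assumptions.
demo=$(mktemp -d)

# The script under discussion. $demo is expanded while the file is written.
cat > "$demo/test.sh" <<EOF
exec 9> "$demo/lock"
flock -n 9 || exit 1

# Record a timestamp, then pretend to work.
date >> "$demo/log"

sleep 3
EOF

sh "$demo/test.sh" &           # first run: takes the lock, then sleeps
sleep 1
pkill -9 -f "$demo/test.sh"    # kills the shell running test.sh, not its sleep child
sleep 1

sh "$demo/test.sh"             # second run while the orphaned sleep is still alive
status=$?                      # expected to be 1: flock -n could not get the lock
echo "run while sleep is alive exited with status $status"

sleep 3                        # give the orphaned sleep time to finish
sh "$demo/test.sh"             # the lock is free again, so this run proceeds
echo "timestamps logged: $(wc -l < "$demo/log")"
```

The second run is refused even though no test.sh process exists at that point, which matches what the question observes with cron.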
Line 1 opens FD 9 on the lockfile. Line 2's flock sets a lock on the FD. Line 7's sleep inherits the FD and keeps it locked. When you pkill the .sh script, it will not kill sleep, so the FD is still locked until sleep finishes. So, to clean up, you need to kill all running processes started after flock. flock(1) uses flock(2), and according to flock(2):