I was wondering if there is a better way to get some job statistics (such as cputime, walltime, memory usage, etc.) in a PBS job script once the job completes. In my current setup, I have this line at the end of my PBS script:
qstat -f "${PBS_JOBID}"
The problem is that if the job fails or gets killed for some reason, this line won't get executed. Please let me know what other options I can use.
I greatly appreciate any help or advice, thanks!
You may find the tracejob script useful. It is available in PBS-derivative batch scheduling systems. tracejob takes one argument, the JOB_ID, and one option, -n days, which indicates how many days back it should look in the log files for relevant stats.
Note on split submission and server hosts
Note that tracejob works only if the logs are accessible on the host where it is invoked. On some installations, the PBS server runs on one host, job submissions are performed on another, and the log files are stored on a file system local to the PBS server. In this case tracejob would not work when run from the submission host.
Example
qstat fails since the job has completed, while tracejob works.
You can redirect stderr to /dev/null when executing tracejob to avoid the repeated messages it otherwise writes there.
In the above logs the information that is not relevant to the question was replaced with capitalized words.
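
To make the stderr redirection and the -n option concrete, here is a minimal sketch of a wrapper around tracejob. The function name job_stats and its argument order are hypothetical (they are not part of PBS), and the fallback of 1 day is just the value this sketch picks when no day count is given. It assumes tracejob is on the PATH and that the PBS logs are readable on the host where it runs.

```shell
#!/bin/sh
# job_stats: hypothetical helper, not part of PBS.
# Prints tracejob's report for one job and discards whatever tracejob
# writes to stderr.
# Usage: job_stats JOB_ID [DAYS]
job_stats() {
    job_id=${1:-}
    days=${2:-1}   # days of logs to search; 1 is this sketch's fallback

    if [ -z "$job_id" ]; then
        echo "usage: job_stats JOB_ID [DAYS]" >&2
        return 2
    fi

    # tracejob only helps where the PBS logs are accessible, so bail out
    # early if the command is not even available on this host.
    if ! command -v tracejob >/dev/null 2>&1; then
        echo "job_stats: tracejob not found on this host" >&2
        return 127
    fi

    # -n DAYS: how many days back to look; 2>/dev/null hides stderr output.
    tracejob -n "$days" "$job_id" 2>/dev/null
}
```

Something like job_stats "$JOB_ID" 3 could then be run from an interactive shell on the PBS server host after the job has finished, which also sidesteps the problem of the last line of the job script never being reached.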