A colleague of mine wanted to run a FORTRAN program that takes files as arguments and outputs their ordering (best first) against some biophysicochemical criterion. What he needed was the 10 best results.
While the files themselves are not big, the problem is that he got bash: /home/progs/bin/ardock: Argument list too long, so I created symlinks with 6-digit names to the files and used those as arguments, which worked ;-)
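The trick, roughly (the /data/inputs path below is a placeholder for the real directory):

mkdir -p /tmp/shortnames && cd /tmp/shortnames || exit 1
n=0
for f in /data/inputs/*.dat; do                # placeholder path
    ln -s "$f" "$(printf '%06d' "$n")"         # 6-digit names: 000000, 000001, ...
    n=$((n + 1))
done
/home/progs/bin/ardock * | head -n 10 |
    xargs -n1 readlink                         # map the short names back to the originals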
Now, if the number of files is really too huge for the above trick to work, what can you possibly do to get the 10 best out of all of them? Do you have to rank the files chunk by chunk and then compare the best of each chunk against the best of the others, with something like this?
#!/bin/bash
best10() { ardock "$@" | head -n 10; }
export -f best10
find . -name '*.dat' -exec bash -c 'best10 "$@"' _ {} + |
xargs bash -c 'best10 "$@"' _ |
xargs bash -c 'best10 "$@"' _ |
xargs bash -c ... | ... | ...
The problem here is that the number of xargs stages required is not known in advance, so how can you turn this into a loop?
Note: as the program outputs a linefeed-delimited stream of filepaths, I know that xargs can potentially break on tricky filenames. Don't worry about that here; you can consider the filenames to be alphanumeric.
Maybe something like this (untested); the idea is to keep feeding the survivors back through "best 10 per batch" until 10 or fewer names remain:
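#!/bin/bash
# Keep the 10 best of every batch, then feed the survivors back in,
# until 10 or fewer names remain.  Assumes alphanumeric filenames and
# bash >= 4 (for globstar); printf is a bash builtin, so building the
# list is not limited by ARG_MAX.
best10() { ardock "$@" | head -n 10; }
export -f best10

shopt -s nullglob globstar
candidates=$(printf '%s\n' ./**/*.dat)    # same files as the find above

while :; do
    # xargs splits the newline-delimited list into batches small enough
    # to execute and keeps the 10 best of each batch.
    candidates=$(printf '%s\n' "$candidates" | xargs bash -c 'best10 "$@"' _)
    # Once 10 or fewer names survive, the last pass was a single ardock
    # call, so its output is already the final ranking, best first.
    (( $(printf '%s\n' "$candidates" | wc -l) <= 10 )) && break
done

printf '%s\n' "$candidates"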
That assumes your file names don't contain newlines, since your existing calls to head and xargs would fail if they did. It also assumes you're using the shell builtin printf rather than an external version of it, so it won't have an ARG_MAX issue.