I'm building a small program to compare file contents of two directories. These directories SHOULD mirror each other but that may not be the case if someone forgot to copy files to the second directory.
So this program will check to see if all of the files were copied from Dir1 to Dir2.
What I'm having a problem with is the speed at which the files are counted. Currently, I'm using:
find /Dir1/ -type f \( -name '*.CR2' -o -name '*.NEF' \)
Both patterns are needed because the files may come from either a Canon (.CR2) or a Nikon (.NEF) camera.
In a situation where we have thousands of files to count it takes way too long.
What is the fastest way to accomplish this?
To find the files in Dir1 that are missing from Dir2, use rsync with the options -n, -i and -r.
-n is a dry run, so it won't actually copy anything. -i prints a line for each file that would need to be copied. And -r makes it recursive.
You can test whether there were any changes by using the shell's -z test to check for empty output.
If you want to check the other direction as well, you can add --delete, which also lists the files that exist only in Dir2.
By default, rsync uses the file size and modification time to determine whether two files are the same. If timestamps may differ, you can use the --size-only option to check only that the file sizes match, which is very fast.
If you need to limit the search to certain file extensions or build more complicated queries, see man rsync. It is an extremely configurable tool.
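As a rough sketch of how the pieces above could fit together, the functions below wrap the dry run in two small helpers. The function names list_missing and in_sync are made up for this example, and the --include/--exclude filter for the two raw formats plus -m (skip directories that end up empty after filtering) are my own assumption about how to restrict the comparison to the extensions from the question; check them against man rsync before relying on them.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, paths are placeholders.

# Print one itemized line per .CR2/.NEF file under $1 that is missing
# from $2 or has a different size there. Nothing is copied (-n).
# The trailing slashes make rsync compare the directory contents.
list_missing() {
    rsync -nirm --size-only \
        --include='*/' --include='*.CR2' --include='*.NEF' --exclude='*' \
        "$1/" "$2/"
}

# Exit status 0 when the dry run prints nothing, i.e. nothing to copy.
in_sync() {
    [ -z "$(list_missing "$1" "$2")" ]
}

# Example call (placeholder paths):
#   in_sync /Dir1 /Dir2 && echo "Dir2 has every raw file from Dir1"
```

The patterns are case-sensitive, just like the -name tests in the find command from the question, so files ending in .cr2 or .nef would need their own --include entries.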