Is there a way to create a tar archive that will only contain file names but omit the actual file data?
The intent is to create a hierarchical 'mirror' of a drive that will only contain the directory structure and file names (preferably with sizes) but omit the actual file data.
The purpose is to generate an inventory of what is on a disk, i.e. something that would be better and faster than the output of ls -R -S -l / but possibly in a less verbose format.
I am aware that the question is about [mis-]using tar for something that it is not meant to be used for, but would like to investigate all options and push the limits of what is possible.
One possible option I'm experimenting with is creating a RAM tmpfs filesystem (to avoid writing to disk unnecessarily and to increase the speed), using lndir (from the xutils-dev package) to mirror the entire subtree with symlinks (lndir /media/usb1 /ramtmpfs), and then running tar -cf usb1-filelist.tar /ramtmpfs. One limitation I'm running into with this approach is RAM size, which is easily exceeded with large subtrees even when only creating symlinks. Is there a better/saner way, possibly something that tar can do on its own?
Following the hint from @CharlesDuffy, here is a Python implementation using both tarfile (for .tar.gz) and zipfile (for .zip). It takes the folder to 'archive' as the 1st arg and the name of the resulting archive as the 2nd.
Filling with zeros is only needed in order to display the correct original file size. Omitting it will speed up the operation significantly, since compressing zeros is extra overhead, especially when files are huge.
Create TAR file with fake zero-filled files
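A minimal sketch of the tarfile variant, assuming Python 3 on a POSIX system. The names ZeroReader, make_fake_tar and the fill_zeros switch are made up for this illustration. Real metadata is read with gettarinfo(), but the file body is replaced by a stream of zeros, so nothing is read from the source files.

```python
import os
import tarfile


class ZeroReader:
    """Minimal file-like object that yields `size` zero bytes on demand."""

    def __init__(self, size):
        self.remaining = size

    def read(self, n=-1):
        if n is None or n < 0 or n > self.remaining:
            n = self.remaining
        self.remaining -= n
        return bytes(n)


def make_fake_tar(src_dir, tar_path, fill_zeros=True):
    """Write a .tar.gz holding the names and metadata under src_dir, but no data."""
    src_dir = os.path.abspath(src_dir)
    base = os.path.dirname(src_dir)
    with tarfile.open(tar_path, "w:gz") as tar:
        for root, dirs, files in os.walk(src_dir):
            for name in dirs + files:
                path = os.path.join(root, name)
                arcname = os.path.relpath(path, base)
                try:
                    info = tar.gettarinfo(path, arcname)
                except OSError:
                    continue  # unreadable entry: skip it
                if info is None:
                    continue  # unsupported type, e.g. a socket
                if info.isreg() and fill_zeros:
                    # Keep the real size; the body is zeros, which gzip shrinks well.
                    tar.addfile(info, ZeroReader(info.size))
                else:
                    if info.isreg():
                        info.size = 0  # no body at all: faster, but sizes are lost
                    tar.addfile(info)
```

For example, make_fake_tar("/media/usb1", "usb1-filelist.tar.gz") would produce an archive whose listing shows the original sizes, while passing fill_zeros=False skips the zero stream and records every regular file as empty.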
Create ZIP file with fake zero-filled files
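A comparable sketch for zipfile, assuming Python 3.8+ (for the strict_timestamps argument). make_fake_zip, CHUNK and fill_zeros are again illustrative names. Each regular file becomes a deflated run of zeros of the original length, written in 1 MiB chunks so memory use stays flat.

```python
import os
import zipfile

CHUNK = bytes(1024 * 1024)  # 1 MiB of zeros, reused for every write


def make_fake_zip(src_dir, zip_path, fill_zeros=True):
    """Write a .zip holding the names and sizes under src_dir, but no data."""
    src_dir = os.path.abspath(src_dir)
    base = os.path.dirname(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(src_dir):
            for name in dirs + files:
                path = os.path.join(root, name)
                arcname = os.path.relpath(path, base)
                try:
                    # strict_timestamps=False tolerates mtimes before 1980.
                    zinfo = zipfile.ZipInfo.from_file(
                        path, arcname, strict_timestamps=False)
                except OSError:
                    continue  # unreadable entry or broken symlink: skip it
                if zinfo.is_dir():
                    zf.writestr(zinfo, b"")
                    continue
                # from_file() defaults to "stored"; ask for deflate explicitly.
                zinfo.compress_type = zipfile.ZIP_DEFLATED
                remaining = zinfo.file_size if fill_zeros else 0
                with zf.open(zinfo, "w") as dest:
                    while remaining > 0:
                        n = min(remaining, len(CHUNK))
                        dest.write(CHUNK[:n])
                        remaining -= n
```

As with the TAR sketch, fill_zeros=False trades the size information for speed: the entries are still listed, but every file shows up with length 0.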
Alternatives
Given that the original purpose of the tar was to generate an 'inventory' of a specific subtree for archival purposes that won't include the actual data, tar is not the best tool for this.
A more suitable tool for this could be GNU find or tree; for example, to create a JSON file with a directory listing of /media/usb0 that will include modification dates and file sizes:
Alternative approach with find to create a TSV file (implies GNU find, not BSD/macOS find; use gfind on a Mac after installing it with brew install findutils):
Now files.tsv can be imported into e.g. an SQLite database:
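For readers without GNU find, the TSV-then-SQLite route can be approximated in Python. This is only a sketch that swaps os.walk in for find. The column layout (path, size, mtime), the table name files, and the function names write_tsv and import_tsv are assumptions made for this example, not the layout of the original files.tsv.

```python
import csv
import os
import sqlite3
from datetime import datetime, timezone


def write_tsv(root, tsv_path):
    """Walk root and write one path<TAB>size<TAB>mtime row per entry."""
    with open(tsv_path, "w", newline="", encoding="utf-8",
              errors="replace") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        for dirpath, dirs, files in os.walk(root):
            for name in dirs + files:
                path = os.path.join(dirpath, name)
                try:
                    st = os.lstat(path)  # do not follow symlinks
                except OSError:
                    continue  # entry vanished or is unreadable
                mtime = datetime.fromtimestamp(
                    st.st_mtime, timezone.utc).isoformat(timespec="seconds")
                writer.writerow([path, st.st_size, mtime])


def import_tsv(tsv_path, db_path):
    """Load the TSV produced by write_tsv into an SQLite table named files."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS files "
            "(path TEXT, size INTEGER, mtime TEXT)")
        with open(tsv_path, newline="", encoding="utf-8") as src:
            rows = csv.reader(src, delimiter="\t")
            conn.executemany("INSERT INTO files VALUES (?, ?, ?)", rows)
        conn.commit()
    finally:
        conn.close()
```

Calling write_tsv("/media/usb0", "files.tsv") followed by import_tsv("files.tsv", "files.db") gives a database that can be queried for, say, the largest files with SELECT path, size FROM files ORDER BY size DESC LIMIT 20.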