The Python documentation for numpy.savez, which saves an .npz file, says:
The .npz file format is a zipped archive of files named after the variables they contain. The archive is not compressed and each file in the archive contains one variable in .npy format. [...]
When opening the saved .npz file with load a NpzFile object is returned. This is a dictionary-like object which can be queried for its list of arrays (with the .files attribute), and for the arrays themselves.
My question is: what is the point of numpy.savez?
Is it just a more elegant version (shorter command) to save multiple arrays, or is there a speed-up in the saving/reading process? Does it occupy less memory?
There are two parts to the answer.
I. NPY vs. NPZ
As we already read from the doc, the .npy format is: [...]
And .npz is only a [...]
So, .npz is just a ZipFile containing multiple “.npy” files. This ZipFile can be either compressed (by using np.savez_compressed) or uncompressed (by using np.savez).
It's similar to a tarball archive on Unix-like systems: a tarball can be just an uncompressed archive containing other files, or a compressed archive when combined with one of various compression programs (gzip, bzip2, etc.).
II. Different APIs for binary serialization
NumPy also provides different APIs to produce these binary files:
np.save --> Save an array to a binary file in NumPy .npy format
np.savez --> Save several arrays into a single file in uncompressed .npz format
np.savez_compressed --> Save several arrays into a single file in compressed .npz format
np.load --> Load arrays or pickled objects from .npy, .npz or pickled files
If we skim the source code of NumPy, under the hood: [...]
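The archive mode can also be checked from the outside, without reading the NumPy source. The following is a minimal sketch, not taken from the original answer: the array names a and b, the file names, and the temporary directory are all illustrative assumptions. It opens both kinds of .npz file with the standard zipfile module and looks at the compression type recorded for each member.

```python
import os
import tempfile
import zipfile

import numpy as np

# Illustrative arrays; the keyword names "a" and "b" are arbitrary.
a = np.arange(10)
b = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "plain.npz")
    packed_path = os.path.join(tmp, "packed.npz")

    np.savez(plain_path, a=a, b=b)
    np.savez_compressed(packed_path, a=a, b=b)

    # An .npz file is an ordinary zip archive, so zipfile can read it.
    with zipfile.ZipFile(plain_path) as zf:
        member_names = sorted(zf.namelist())
        plain_types = {info.compress_type for info in zf.infolist()}
    with zipfile.ZipFile(packed_path) as zf:
        packed_types = {info.compress_type for info in zf.infolist()}

    # np.load returns a dictionary-like NpzFile for .npz input.
    with np.load(plain_path) as data:
        loaded_keys = sorted(data.files)
        a_roundtrip = data["a"]

print("archive members:", member_names)
print("savez stores members uncompressed:", plain_types == {zipfile.ZIP_STORED})
print("savez_compressed deflates members:", packed_types == {zipfile.ZIP_DEFLATED})
print("keys seen by np.load:", loaded_keys)
```

Each member of the archive is a .npy file named after the keyword it was saved under, and the only difference between the two archives is the zip compression method.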
Then back to the question:
If you use np.savez, there is no compression on top of the .npy format; you just get a single archive file, which is convenient for managing multiple related arrays.
If you use np.savez_compressed, the file takes less space on disk, at the cost of more CPU time spent on compression (i.e. saving is a bit slower).
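To see the disk-space trade-off concretely, here is a rough sketch under stated assumptions: the data is an array of one million zeros, chosen only because it compresses extremely well, and the file names are made up for illustration. Real data will usually compress far less.

```python
import os
import tempfile

import numpy as np

# Highly compressible illustrative data: one million float64 zeros.
zeros = np.zeros(1_000_000)

with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "plain.npz")
    packed_path = os.path.join(tmp, "packed.npz")

    np.savez(plain_path, zeros=zeros)
    np.savez_compressed(packed_path, zeros=zeros)

    plain_size = os.path.getsize(plain_path)
    packed_size = os.path.getsize(packed_path)

    # Both files load back to the same array; only the on-disk size differs.
    with np.load(packed_path) as data:
        same_contents = np.array_equal(data["zeros"], zeros)

print(f"raw array data:   {zeros.nbytes} bytes")
print(f"savez:            {plain_size} bytes")
print(f"savez_compressed: {packed_size} bytes")
print("contents identical after loading:", same_contents)
```

The uncompressed archive is slightly larger than the raw array data because of the .npy header and zip bookkeeping, while the compressed one is much smaller for this particular input. For data with little redundancy (random floats, for example), expect the gain to be small.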