What is the equivalent of `posix_fallocate()` on Windows?


NTFS does support sparse files, but I want to make sure the files I have to write to (which might have been created, set as sparse, and partially filled by another application) are fully allocated, so that I won't get an error due to lack of space when writing to the middle of such a file at a later time (i.e. if out-of-space errors are going to happen, they should happen now).

Is there a WinAPI function to ensure a sparse file is fully allocated (preferably atomically), like we have posix_fallocate() in POSIX systems? If not, how do I preallocate it?

I don't think these are duplicates:

lvella On

Following the link from this documentation page, I could think of three ways of pre-allocating the sparse ranges of a file, but none of them is atomic the way posix_fallocate() is. I was hoping someone could point to an existing solution in the WinAPI.

Here they are:

Just copy the file

Copy the full file to another file, delete the old file, then rename the copy to the original name. This approach has the drawback of always being slow, as it has to read and write the whole file, and it can temporarily take up to twice the file's size on disk.

It could be improved a little by checking the FILE_ATTRIBUTE_SPARSE_FILE attribute first: if the flag is not set, the file can't be sparse and the operation can be skipped.
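A minimal sketch of that check, assuming the attribute word comes from GetFileAttributesW() (GetFileInformationByHandle() would also work). The helper names `may_be_sparse` and `file_may_be_sparse` are made up for illustration, and the flag value is repeated outside Windows only so the helper compiles anywhere:

```c
#include <assert.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Values as defined in the Windows SDK headers, repeated here only so
   that the helper below also compiles on other platforms. */
#define FILE_ATTRIBUTE_SPARSE_FILE 0x00000200u
#define INVALID_FILE_ATTRIBUTES    0xFFFFFFFFu
#endif

/* Hypothetical helper: returns 1 if a file with these attributes may be
   sparse, 0 if it cannot be. An invalid attribute word (the query failed)
   is treated as "may be sparse", so the caller does not skip the work. */
int may_be_sparse(uint32_t attrs)
{
    if (attrs == INVALID_FILE_ATTRIBUTES)
        return 1;
    return (attrs & FILE_ATTRIBUTE_SPARSE_FILE) != 0;
}

#ifdef _WIN32
/* Hypothetical wrapper: queries the attributes of `path` and applies the test. */
int file_may_be_sparse(const wchar_t *path)
{
    return may_be_sparse(GetFileAttributesW(path));
}
#endif
```

If the helper returns 0, the copy can be skipped, since the file has no unallocated ranges to fill.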

Copy the file in place

Open the file twice, once for reading and once for writing, and alternate between reading from one handle and writing to the other, until the whole file has been rewritten. The performance is as bad as the first solution, but at least it doesn't take more space than the full file size.

This (maybe) can be improved by reading and writing only one byte per cluster (if you know the cluster size), because the whole cluster has to be allocated anyway. Allocated clusters will keep their old contents, and new clusters will be automatically filled with the default value. I say maybe because writing one byte or one full cluster is the same for the NTFS layer, so it might not be worth the extra system calls to fseek() the file.
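The one-byte-per-cluster idea could be sketched with plain C stdio as below. This is only an illustration under some assumptions: it uses a single read/write stream instead of two handles, the function name `touch_each_cluster` is made up, the cluster size is supplied by the caller, and offsets are assumed to fit in a `long` (for files over 2 GiB on Windows, _fseeki64() or the native file API would be needed instead):

```c
#include <assert.h>
#include <stdio.h>

/* Rewrites the first byte of every cluster with the value it already has,
   so the file contents stay the same while each touched cluster gets
   written to. Returns 0 on success, -1 on any I/O error. */
int touch_each_cluster(const char *path, long cluster_size)
{
    FILE *f;
    long size, off;
    int c, rc = -1;

    assert(cluster_size > 0);

    f = fopen(path, "r+b");
    if (f == NULL)
        return -1;

    /* Find the file size by seeking to the end. */
    if (fseek(f, 0, SEEK_END) != 0)
        goto done;
    size = ftell(f);
    if (size < 0)
        goto done;

    for (off = 0; off < size; off += cluster_size) {
        if (fseek(f, off, SEEK_SET) != 0)
            goto done;
        c = fgetc(f);
        if (c == EOF)
            goto done;
        /* C requires a repositioning call between a read and a write
           on the same stream, so seek back before writing. */
        if (fseek(f, off, SEEK_SET) != 0)
            goto done;
        if (fputc(c, f) == EOF)
            goto done;
    }
    rc = 0;

done:
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}
```

An out-of-space condition would surface here as a failed write or a failed fclose(), which is the point of doing the pass up front.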

Write zeros to the sparse region

As suggested in the comments of the question, you can use FSCTL_QUERY_ALLOCATED_RANGES to figure out the ranges where the file is allocated, and write zeros to the space between them. Actually, I've read somewhere that the default read value for unallocated ranges is not necessarily zero, so, to be safe, in my implementation I read one byte from one of those regions and use this value to write back to the spaces between allocations.

Again, only one byte per cluster is sufficient.
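Finding "the space between" the allocated ranges is simple bookkeeping, sketched below in portable C. It is not my actual implementation: `range_t` and `find_holes` are hypothetical names, `range_t` just mirrors the FileOffset and Length fields of the FILE_ALLOCATED_RANGE_BUFFER entries that FSCTL_QUERY_ALLOCATED_RANGES returns, and the input is assumed to be sorted by offset and non-overlapping:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the two fields of a FILE_ALLOCATED_RANGE_BUFFER entry. */
typedef struct {
    int64_t offset;
    int64_t length;
} range_t;

/* Given the allocated ranges of a file (sorted by offset, non-overlapping)
   and the file size, stores the unallocated gaps in `holes` and returns how
   many were found. `holes` must have room for n_alloc + 1 entries. */
size_t find_holes(const range_t *alloc, size_t n_alloc,
                  int64_t file_size, range_t *holes)
{
    size_t i, n = 0;
    int64_t pos = 0; /* end of the region covered so far */

    assert(file_size >= 0);

    for (i = 0; i < n_alloc; i++) {
        int64_t end = alloc[i].offset + alloc[i].length;

        /* Anything between the covered region and this range is a hole. */
        if (alloc[i].offset > pos) {
            holes[n].offset = pos;
            holes[n].length = alloc[i].offset - pos;
            n++;
        }
        if (end > pos)
            pos = end;
    }

    /* Trailing hole between the last allocated range and end of file. */
    if (pos < file_size) {
        holes[n].offset = pos;
        holes[n].length = file_size - pos;
        n++;
    }
    return n;
}
```

Each hole would then get one byte of the fill value written per cluster it covers.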

Depending on how much of the file is allocated, the performance can be much better than the other methods.