What Happens to POSIX File Lock When the Underlying File is Overwritten


Title says it all. Suppose I have a file which multiple processes want to modify using my library. A traditional means of preventing corruption is to use flock or similar to place an advisory lock on the file. Each process attempts to open the file and acquire the lock; blocking or erroring if the lock cannot be obtained.

Now, suppose I want to insert a line into the middle of the file. The canonical, "safe" way to do this is to write a second file, which is then moved over the first once all of the writing is done.

In the above case, what happens to any existing file locks on the original file? Are they preserved since the underlying struct dirent doesn't change (excluding the creation / modification timestamps)? What happens to any other process blocked on flock when this happens? The flock manpage is silent on these topics, unfortunately.

Finally, if the above operations result in the lock being lost, what is a good means of preventing concurrent modification in lieu of flock?

2

There are 2 best solutions below

0
Barmar On BEST ANSWER

Locks are associated with the inode, not the filename. So if you have locks on file1, and then do mv file1.new file1, the inode of the old file1 is unchanged (if there are other hard links, they still reference the original file), and the filename file1 now points to the inode of file1.new.

As a result, all the file locks still persist on the original file. Even if there are no filenames referring to it, the file contents continue to exist as long as there are any open file handles pointing to it.
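A minimal sketch of this behaviour, assuming Linux and Python's standard `fcntl` module. The function name and the temporary file names are hypothetical, chosen only to mirror the `file1` / `file1.new` example above. It locks the original file, renames a replacement over it, and then checks that the held descriptor still refers to the old inode while the name now resolves to a new, unlocked one.

```python
import fcntl
import os
import tempfile


def lock_survives_rename():
    """Return (inode held by the locker, inode now at the name, second lock granted)."""
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "file1")
    replacement = os.path.join(workdir, "file1.new")
    for path in (target, replacement):
        with open(path, "w") as f:
            f.write("data\n")

    # Take an exclusive advisory lock on the original file1.
    holder = os.open(target, os.O_RDWR)
    fcntl.flock(holder, fcntl.LOCK_EX)

    # Equivalent of `mv file1.new file1`: the name now points at another inode.
    os.rename(replacement, target)

    # Opening by name reaches the new inode, which nobody has locked.
    newcomer = os.open(target, os.O_RDWR)
    try:
        fcntl.flock(newcomer, fcntl.LOCK_EX | fcntl.LOCK_NB)
        granted = True
    except BlockingIOError:
        granted = False

    result = (os.fstat(holder).st_ino, os.fstat(newcomer).st_ino, granted)
    os.close(newcomer)
    os.close(holder)
    return result
```

The practical consequence is that two processes can each hold an "exclusive" lock at the same time, one on the old inode and one on the new, so a lock taken on the file being replaced does not serialize writers that open it by name afterwards.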

6
Luis Colorado On

What Happens to POSIX File Lock When the Underlying File is Overwritten

Nothing happens: locking is unrelated to writing. File locks on Linux and other POSIX-like systems are advisory, not mandatory, so every program that shares the file must agree to use the locking routines and wait for the resource (that is, the file, or with fcntl(2) record locks the segment of it, that you want to protect from simultaneous access by other threads or processes). This means you have to acquire the lock before you write to the file. If a program writes without acquiring the lock, or otherwise ignores the convention, your data can be corrupted.

Documentation on file locking can be found in the flock(2) system call manual page. I consulted it on my FreeBSD system, but a manual page should be installed on your system too. If it is not, ask your system administrator, or another forum, how to install the manual pages.
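To illustrate what "advisory" means in practice, here is a small sketch, assuming Linux and Python's `fcntl.flock`. The function name is hypothetical. One descriptor holds an exclusive lock while a second, non-cooperating writer opens the same file and writes to it without asking for the lock; the kernel does not stop it.

```python
import fcntl
import os
import tempfile


def advisory_lock_does_not_block_writes():
    """Write to a file through a second handle while it is exclusively locked."""
    fd, path = tempfile.mkstemp()
    try:
        # The "well-behaved" participant takes the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)

        # A writer that ignores the convention is not prevented from writing.
        with open(path, "w") as rogue:
            rogue.write("written without the lock\n")

        with open(path) as f:
            return f.read()
    finally:
        os.close(fd)  # closing the descriptor also drops the lock
        os.unlink(path)
```

The lock only has an effect on processes that also call `flock` (or a compatible routine) before touching the file.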

By the way, each read/write system call on a disk file is guaranteed to be atomic: the kernel locks the inode for the whole call (from the start of the system call until it returns to user mode in the process), so no other process can read or write the same file at the same time. This is very important for fully deterministic file updates. There is therefore no need to lock a file to do a single write, as long as the call is not cut short and returns the full byte count.

Another issue you ask about is inserting data into a file. POSIX has no provision for this. You can only (over)write a file's data, extend the file past its end (by writing), or truncate it (to zero length, or partially to some length); there is no support for inserting or deleting data between two bytes of the file. What do editors do, then? They simply write a copy of the file, with the new lines in place, and use it to replace the original.
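The copy-and-replace approach can be sketched as follows, assuming Python's standard library. The helper name `insert_line` and its parameters are hypothetical, not part of any existing API. The temporary file is created in the same directory so that the final rename stays on one filesystem and is atomic.

```python
import os
import tempfile


def insert_line(path, line_number, text):
    """Insert `text` as a new line before the 0-based `line_number` of `path`.

    If the file has fewer lines than `line_number`, the text is appended.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        inserted = False
        with os.fdopen(fd, "w") as tmp, open(path) as src:
            for index, existing in enumerate(src):
                if index == line_number:
                    tmp.write(text + "\n")
                    inserted = True
                tmp.write(existing)
            if not inserted:
                tmp.write(text + "\n")
            # Push the data to disk before the name is switched over.
            tmp.flush()
            os.fsync(tmp.fileno())
        # Atomically make the name refer to the new file (a new inode).
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that `mkstemp` creates the file with owner-only permissions, so a real implementation would probably want to copy the original file's mode and ownership as well. Because the name ends up pointing at a new inode, any `flock` held on the old file does not carry over to the new one.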

Locks are maintained by the kernel as long as some process holds them. Normally the lock keeps a count of references to the processes that own it (and to those waiting for it), and that count is decremented when a process terminates, so the kernel releases the lock once no process holds it. This protects against resource leaks in a controlled way.
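A rough sketch of that lifetime rule, assuming Linux, `os.fork`, and `fcntl.flock` (the function name is hypothetical). A child process takes the lock and exits without ever unlocking; the parent observes that the lock is busy while the child is alive and free once the child is gone.

```python
import fcntl
import os
import tempfile


def try_lock(fd):
    """Attempt a non-blocking exclusive lock; report whether it was granted."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def lock_released_when_holder_exits():
    """Return (granted while the child holds the lock, granted after it exits)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    ready_r, ready_w = os.pipe()  # child -> parent: "I hold the lock"
    done_r, done_w = os.pipe()    # parent -> child: "you may exit now"

    pid = os.fork()
    if pid == 0:
        try:
            os.close(ready_r)
            os.close(done_w)
            child_fd = os.open(path, os.O_RDWR)
            fcntl.flock(child_fd, fcntl.LOCK_EX)
            os.write(ready_w, b"x")
            os.read(done_r, 1)
        finally:
            os._exit(0)  # exit without an explicit unlock

    os.close(ready_w)
    os.close(done_r)
    os.read(ready_r, 1)
    parent_fd = os.open(path, os.O_RDWR)
    while_held = try_lock(parent_fd)

    os.write(done_w, b"x")
    os.waitpid(pid, 0)
    after_exit = try_lock(parent_fd)

    for descriptor in (parent_fd, ready_r, done_w):
        os.close(descriptor)
    os.unlink(path)
    return while_held, after_exit
```

Under these assumptions the first attempt is refused and the second succeeds, because the kernel drops the child's lock when its last descriptor for the file is closed at exit.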

How the kernel locks resources like files or inodes (or even disk blocks) is described in the kernel internals. Any book on kernel internals covers the subject:

  • The Design of the UNIX Operating System, by Maurice J. Bach. This is an old resource, but everything it says about locking is useful for understanding how locking works. Very affordable, and a good introduction to the subject. It is described as the source Linus Torvalds used when he started writing his Linux kernel from scratch. You will learn a lot from this book.
  • The Design and Implementation of the FreeBSD Operating System, by Marshall Kirk McKusick. The latest edition is a very good source on current locking technologies. It describes how real multithreaded, multiprocessor systems use locking techniques to handle access to shared resources efficiently.
  • Any book describing Linux kernel internals. Linux Device Drivers, by Alessandro Rubini, is a good starting point. This book is more focused on teaching you how to write device drivers and modules, so it shows you the resources available for in-kernel locking and requires you to dig into the functions described to see exactly how things are done. It takes more time to learn, but the knowledge you acquire is more precise.