How to read, truncate and write a locked file without unlocking it?

Question

111 Views Asked by sahgasdvsadgv At 27 January 2024 at 14:14

Consider a file that is edited by multiple processes tens or hundreds of times per second. Since two or more processes can race for write access to the file, some mechanism is needed to ensure that only one process accesses the file at a time.

As I understand it, holding the file open from fopen (or open) until fclose will do the job: these functions guarantee that only one process accesses the file.

The problem is that the file has to be truncated after it has been opened, because it must be read first and rewritten afterwards. With two separate fopen calls, cross-process safety is obviously not guaranteed.

This answer recommends calling freopen after fopen and before fclose. However, according to the freopen documentation in the Linux man-pages:

If pathname is not a null pointer, freopen() shall close any file descriptor associated with stream.

Currently, the only solution I see is to create a companion file alongside the one I need to access, and to lock that companion file instead (although the real file would surely be locked as well) for the whole period in which the real file is read, closed, truncated, written and closed.

Would such a solution at least guarantee safety? Are there any better solutions?

As users explained to me in the comments, fopen does not prevent a file from being opened by another process at all. Instead, flock has to be called in addition.

So the question is: can I lock the file, read, truncate, write and then unlock it, or should I use the companion-file solution I described above?

Specific summary question

As I understand it, flock takes the descriptor of the file to be locked as its argument (specifically a descriptor from open, not from fopen).
But, as I understand it, reading, truncating and writing a file requires opening, closing and opening the file again.
So, given the second point, if I lock the file before I call close, will it still be locked after the close call? If close unlocks it automatically, the lock makes no sense. If it does not, how do I unlock it after all the manipulations?

There are 2 best solutions below

KamilCuk On 27 January 2024 at 15:23

I will ignore everything about freopen and fopen, as I suspect they are unrelated to your problem. If you are interested specifically in freopen behavior, which might be a fun topic, consider asking a separate question.

No, fopen gives no guarantee about concurrency. You can fopen and open files in parallel as many times as you want.

Each process is an instance of a parser. They all store data to different files. But there is one file that just contains information about how many items each parser has stored (consider it just an array string), and each process (instance of the parser) updates the value at its specific array position.

Great. The simplest way to achieve data consistency is to use locking. There is no file locking in standard C, so use operating system mechanisms, in this case those of Linux or POSIX.

I would do it like this: whenever a parser needs to read or write the file, it first flocks the file, reads or modifies it, and then unlocks it so that other processes can use it. Read man 2 flock; there is also a flock command-line utility, documented in man 1 flock. Each process may open() the file in advance to protect against rename or unlink, or to reduce overhead.
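A minimal sketch of that lock, modify, unlock cycle on a descriptor that stays open. It assumes, purely for illustration, that the shared file is a flat array of fixed-width `uint32_t` counters with one slot per parser. The question does not specify the layout, and the names `open_counter_file` and `bump_slot` are made up here.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open the shared file once, early in the process. */
int open_counter_file(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Hypothetical helper: add `delta` to counter number `slot`, assuming the
 * file holds an array of uint32_t values. Returns 0 on success, -1 on error. */
int bump_slot(int fd, unsigned slot, uint32_t delta)
{
    uint32_t value = 0;
    off_t offset = (off_t)slot * (off_t)sizeof value;
    int status = -1;

    if (flock(fd, LOCK_EX) != 0)           /* blocks until the lock is free */
        return -1;

    ssize_t got = pread(fd, &value, sizeof value, offset);
    if (got == 0)                          /* slot lies past the end of file */
        value = 0;
    if (got == 0 || got == (ssize_t)sizeof value) {
        value += delta;
        if (pwrite(fd, &value, sizeof value, offset) == (ssize_t)sizeof value)
            status = 0;
    }

    if (flock(fd, LOCK_UN) != 0)           /* explicit unlock, fd stays open */
        status = -1;
    return status;
}
```

Because `pread()` and `pwrite()` take an explicit offset, the descriptor's own file position never matters, which keeps the critical section short.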

**John Bollinger** · Accepted Answer · 2024-01-27T15:53:52.007000

You write,

Consider a file that is edited by multiple processes tens or hundreds of times per second. Since two or more processes can race for write access to the file, some mechanism is needed to ensure that only one process accesses the file at a time.

and you clarify that

It should be strictly P1read-P1truncate-P1write -> P2read-P2truncate-P2write, not P1read-P2read-P1truncate-P1write-P2truncate-P2write

That is, you want to serialize units of read(+truncate)+write access to the file.

As I understand it, holding the file open from fopen (or open) until fclose will do the job: these functions guarantee that only one process accesses the file.

Absolutely not. Neither C nor POSIX makes any such guarantee. In practice, it is not typically true on POSIX systems, and it is not true on Linux in particular. You might see such behavior on Windows, however, which could be where you got the idea.

If you must use a physical file for this purpose then you will need to perform some sort of locking around access to it. Provided that the file is on a local filesystem, the most natural locking mechanism would be flock(). This implements an advisory locking system, meaning that it applies only to processes that explicitly gate their access via flock(), but that does not appear to be an issue for your planned use. flock() has the advantage that if a process dies while holding the lock (with the result that the underlying open file description is closed) then the lock is automatically released. flock() has the disadvantage that it will not work if the file is replaced -- only if processes always modify it in place.
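The following sketch illustrates two of those properties on a local Linux filesystem: a second open of the same file is refused the lock while the first holds it, and closing the first descriptor releases the lock without any explicit unlock. The function names and the use of `LOCK_NB` (to probe instead of block) are choices made for this illustration only.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Try to take an exclusive flock() lock without blocking.
 * Returns 1 if acquired, 0 if the lock is held through another open file
 * description, and -1 on any other error. */
int try_lock_exclusive(int fd)
{
    if (flock(fd, LOCK_EX | LOCK_NB) == 0)
        return 1;
    return (errno == EWOULDBLOCK) ? 0 : -1;
}

/* Two independent open() calls give two open file descriptions, so they
 * contend for the lock much as two separate processes would.
 * Returns 1 if the observed behaviour matches that expectation. */
int demo_lock_contention(const char *path)
{
    int a = open(path, O_RDWR | O_CREAT, 0600);
    int b = open(path, O_RDWR);
    if (a < 0 || b < 0) {
        if (a >= 0) close(a);
        if (b >= 0) close(b);
        return -1;
    }

    int first  = try_lock_exclusive(a);    /* expected 1: lock acquired   */
    int second = try_lock_exclusive(b);    /* expected 0: already locked  */
    close(a);                              /* closing releases the lock   */
    int third  = try_lock_exclusive(b);    /* expected 1: now available   */
    close(b);

    return first == 1 && second == 0 && third == 1;
}
```

Nothing here stops a process that never calls `flock()` from writing to the file, which is what "advisory" means in practice.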

Another alternative would be a process-shared mutex or semaphore or a named semaphore, which the processes would use (again, cooperatively) to ensure the needed serialization. This has the advantage that it could still work if the file is altogether replaced instead of modified in-place, but the possible disadvantage that if one of the participating processes dies while holding the mutex / semaphore locked then the whole system gets jammed. (But if you use System V semaphores instead of POSIX semaphores then that shortcoming can be addressed.)
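As a rough sketch of the named-semaphore variant, every cooperating process could wrap its file access in something like the function below. The semaphore name passed in, the helper names and the callback shape are all assumptions made for this example.

```c
#include <fcntl.h>
#include <semaphore.h>

/* Run `critical(arg)` while holding a named POSIX semaphore that is used
 * as a cross-process mutex. `name` must start with '/', for example
 * "/parser_counts" (a made-up name). Returns 0 on success, -1 on error. */
int with_named_semaphore(const char *name, void (*critical)(void *), void *arg)
{
    /* The initial value 1 only takes effect in the process that actually
     * creates the semaphore; later callers just attach to it. */
    sem_t *sem = sem_open(name, O_CREAT, 0600, 1);
    if (sem == SEM_FAILED)
        return -1;

    int status = -1;
    if (sem_wait(sem) == 0) {      /* an EINTR failure is not retried here */
        /* If the process dies inside critical(), the semaphore stays at 0
         * and every other participant blocks: the jam described above. */
        critical(arg);
        status = sem_post(sem);
    }
    sem_close(sem);
    return status;
}

/* Example callback for illustration: increments the int that `arg` points to. */
void increment_int(void *arg)
{
    ++*(int *)arg;
}
```

In real use the callback would perform the read, truncate and write of the shared file; `sem_unlink()` removes the name once no process needs it any more.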

But what if you don't use a physical file? What you're describing sounds like it will require moderately high I/O bandwidth, and it will do a great deal of rewriting the same sectors of the underlying physical medium. Why do you want to hit a disk so hard? Or to deal with slow file I/O? You probably would be better off using shared memory (still protected by mutex or semaphore). Or you could implement a data broker service that runs on the system and mediates access to the data.

Addendum - response to additional questions about `flock`

As I understand it, flock takes the descriptor of the file to be locked as its argument (specifically a descriptor from open, not from fopen).

Sort of. flock() requires a file descriptor (from open) to identify the file to lock. The file is locked, not the file descriptor, but that lock is associated with the file descriptor (more precisely, with the open file description that the descriptor refers to).

And if you want to use stdio functions to read and write the file, then you can get a FILE * from the file descriptor via fdopen(). The file descriptor is still underneath, so that will not interfere with the locking. If you do this, however, then you should be sure to close via fclose()ing the stream from fdopen, not directly close()ing the file descriptor underneath.

But, as I understand it, reading, truncating and writing a file requires opening, closing and opening the file again.

That is incorrect.

If the data to be written are the same length then you don't need to explicitly truncate at all. After reading the file, rewind()ing to the beginning or fseek()ing or lseek()ing there, then writing, then closing will have the desired effect.

If the data to be written might be shorter then you can ftruncate() the file to any length (including 0) without first closing it.
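Both cases can be covered by one descriptor-level routine: seek to the start, write the new data, then use `ftruncate()` (the variant that takes an open descriptor rather than a path) to cut the file to the new length. This is only a sketch; `replace_contents` and `demo_replace` are hypothetical names, and the caller is assumed to hold the lock already.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Overwrite the whole content of an already-open descriptor with `len`
 * bytes from `buf`, without closing it. Returns 0 on success, -1 on error. */
int replace_contents(int fd, const char *buf, size_t len)
{
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)   /* back to the beginning */
        return -1;

    size_t done = 0;
    while (done < len) {                       /* write() may be partial */
        ssize_t n = write(fd, buf + done, len - done);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }

    /* Cut off whatever remains of older, longer content. When old and new
     * lengths are equal this call changes nothing. */
    return ftruncate(fd, (off_t)len);
}

/* Demo: write a long value, replace it with a shorter one, and return the
 * resulting file size (or -1 on error). */
long demo_replace(const char *path)
{
    const char *old_text = "a fairly long old value\n";
    const char *new_text = "short\n";
    long size = -1;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (replace_contents(fd, old_text, strlen(old_text)) == 0 &&
        replace_contents(fd, new_text, strlen(new_text)) == 0)
        size = (long)lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}
```

Truncating after the write, rather than before, means the file is never observed empty by a reader that ignores the lock, at the cost of briefly containing a mix of new data and an old tail.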

So, given the second point, if I lock the file before I call close, will it still be locked after the close call?

No, and it doesn't need to be (see above). Moreover, it is a convenience that you just need to close the file, not explicitly unlock it.

Outline

  // ...

  // open the file for reading and writing, creating it if necessary
  int fd = open(path_to_file, O_RDWR | O_CREAT, 0600);

  // acquire an exclusive lock on the file, blocking if necessary
  int result = flock(fd, LOCK_EX);

  // Wrap the file descriptor in a stream
  FILE *file = fdopen(fd, "r+");

  // ... read the file ...

  // Rewind to the beginning
  rewind(file);

  // Truncate if necessary. ftruncate() takes the descriptor;
  // truncate() would need a path instead:
  result = ftruncate(fd, 0);

  // ... write new data ...

  // close the file and underlying file descriptor, flushing any
  // buffered output to it.  The lock is hereby released.
  result = fclose(file);

  // done

Note that for clarity and simplicity, I have omitted testing for or handling errors from the various functions. Production code does not have that luxury.
