I need to read a sector from the physical disk, but without using the system cache.
I tried this:
import os

disk_path = "/dev/sdc"

try:
    disk_fd = os.open(disk_path, os.O_RDONLY | os.O_DIRECT)
    os.lseek(disk_fd, 12345 * 4096, os.SEEK_SET)
    buffer = os.read(disk_fd, 4096)
finally:
    if disk_fd: os.close(disk_fd)
But I get an error:
Traceback (most recent call last):
  File "/home/marus/direct.py", line 8, in <module>
    buffer = os.read(disk_fd, 4096)
OSError: [Errno 22] Invalid argument
On Windows I know that there are alignment requirements for unbuffered file reads, but I don't know what the rules are on Linux. What could be wrong here? I ran the script with sudo.
Edit:
If I remove the os.O_DIRECT flag, everything works fine...
Update: I preallocated an aligned buffer like this:
buffer_address = ctypes.create_string_buffer(buffer_size + sector_size)
buffer_offset = (ctypes.addressof(buffer_address) + sector_size - 1) & ~(sector_size - 1)
buffer = ctypes.string_at(buffer_offset, buffer_size)
...but now how can I use this buffer with os.read()?
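See man 2 open (the notes on O_DIRECT) and man 2 read (the EINVAL entry): with O_DIRECT, the file offset, the transfer size, and the address of the user buffer must all be suitably aligned, otherwise read fails with EINVAL, which is the Errno 22 you are seeing.
E.g. both the file position and the buffer in memory should be aligned to 512 bytes. You can control the file position with lseek, but aligning the read buffer in Python requires a different approach, because os.read allocates its own buffer.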
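A quick way to sanity-check these alignment rules is sketched below. The helper names `is_aligned` and `buffer_address` are hypothetical, and the 512-byte `SECTOR_SIZE` is an assumption taken from the figure above; the real logical block size depends on the device.

```python
import ctypes
import mmap


def is_aligned(value, alignment):
    """True if `value` is a multiple of `alignment` (a power of two)."""
    return value & (alignment - 1) == 0


def buffer_address(buf):
    """Memory address of a writable buffer such as an mmap object."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


SECTOR_SIZE = 512  # assumed logical block size, not queried from the device

offset = 12345 * 4096           # file position used in the question
page_buf = mmap.mmap(-1, 4096)  # anonymous mapping, starts on a page boundary

print(is_aligned(offset, SECTOR_SIZE))                    # True
print(is_aligned(len(page_buf), SECTOR_SIZE))             # True
print(is_aligned(buffer_address(page_buf), SECTOR_SIZE))  # True
```

The offset and the transfer size in the question already pass this check, which suggests the buffer address is the part that os.read leaves outside your control.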
See https://bugs.python.org/issue5396 for the details and remedy.
Here's another discussion: Direct I/O in Python with O_DIRECT
Here is what works for me: read into a page-aligned buffer allocated with mmap instead of calling os.read. Done that way, the script prints the contents of the requested block.
This is how it works:
First, 4k of memory is allocated using mmap. The returned memory is aligned to the start of a memory page, and the page size is usually 4k as well.
os.read cannot read into an existing buffer, so the trick is to create a file object from the OS file descriptor. The resulting object, f, is similar to what you get from a normal open: it has read, write and close methods.
Then readinto is called on f, which reads from the file object into the preallocated block of memory.
The final step is to extract the data from the mmapped memory block.
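Putting the steps above together, a minimal sketch might look like the following. It is an illustration under assumptions rather than a definitive implementation: the function name `read_block_direct` and its `direct` switch are hypothetical, the 4096-byte block size is taken from the question, and os.O_DIRECT is only available on platforms that define it (Linux here).

```python
import mmap
import os


def read_block_direct(path, block_index, block_size=4096, direct=True):
    """Read one block from `path` into a page-aligned buffer.

    Hypothetical helper for illustration. Passing direct=False drops
    O_DIRECT so the same code path can be tried on an ordinary file.
    """
    flags = os.O_RDONLY
    if direct:
        flags |= os.O_DIRECT  # Linux-specific: bypass the page cache
    fd = os.open(path, flags)
    try:
        # Anonymous mmap memory starts on a page boundary, which satisfies
        # the buffer-alignment requirement of O_DIRECT.
        buf = mmap.mmap(-1, block_size)
        try:
            # The file offset must be aligned as well.
            os.lseek(fd, block_index * block_size, os.SEEK_SET)
            # Unbuffered file object, so readinto() reads straight into
            # our buffer; closefd=False keeps ownership of fd here.
            with os.fdopen(fd, "rb", buffering=0, closefd=False) as f:
                n = f.readinto(buf)
            return buf[:n]  # slicing copies the bytes out of the mapping
        finally:
            buf.close()
    finally:
        os.close(fd)
```

For the disk in the question this would be called as `read_block_direct("/dev/sdc", 12345)`, run with enough privileges to open the device.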