Why is my benchmark over-estimating read throughput?


I have some C# read-throughput benchmarks for an application library I work on. The results seem suspiciously high, and I'm trying to understand what I can do to make them more accurate.

I first measure how fast my test system can simply read and throw away data. The idea is that this baseline value should approach the hardware read throughput limit and provide a sanity check on my benchmarking approach. Here is the code for the baseline.

// Baseline: read the file sequentially and discard the data.
void ReadAndThrowAway(System.IO.FileStream stream, byte[] buf)
{
    int bytesRead;
    do
    {
        // Stream.Read may return fewer bytes than requested even before
        // end-of-file, so loop until it reports 0 bytes rather than
        // stopping at the first short read.
        bytesRead = stream.Read(buf, 0, buf.Length);
    }
    while (bytesRead > 0);
}
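
To turn this loop into a throughput figure, the surrounding measurement looks roughly like the sketch below: time the loop and divide the bytes read by the elapsed seconds. This is a simplified stand-in for the harness in the repo; the file path and buffer size are placeholders, not my actual values.

using System;
using System.Diagnostics;
using System.IO;

class BaselineThroughput
{
    static void Main()
    {
        // Placeholder path and buffer size; the real benchmark uses its own values.
        const string path = "/tmp/testfile.bin";
        var buf = new byte[1024 * 1024];

        using var stream = new FileStream(path, FileMode.Open, FileAccess.Read);

        long totalBytes = 0;
        var sw = Stopwatch.StartNew();

        int bytesRead;
        while ((bytesRead = stream.Read(buf, 0, buf.Length)) > 0)
        {
            totalBytes += bytesRead;   // data is discarded, only counted
        }

        sw.Stop();

        // Throughput in GB/s (decimal gigabytes, matching the SSD spec sheet).
        double gbPerSec = totalBytes / 1e9 / sw.Elapsed.TotalSeconds;
        Console.WriteLine($"{totalBytes} bytes in {sw.Elapsed.TotalSeconds:F2} s = {gbPerSec:F2} GB/s");
    }
}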

The test hardware is an Apple SSD AP1024N in a 2019 16" touch-bar MacBook Pro, which from what I have read has a throughput of about 2.8 GB/s. However, when I run the above code I get a throughput of around 3.4 GB/s, and when I run it multi-threaded I get throughputs over 5 GB/s.

I suspected this could be because the file is cached in RAM, so to prevent caching I use a 32 GB test file. My thinking is that, since I only have 16 GB of RAM, complete benchmark runs should always be forced to read the file from disk again.

Is it possible that my Apple SSD really is this fast? Is this SSD PCIe 3.0 x4? Is my benchmark over-reporting throughput somehow, and if so, how?

A full write-up and the source code can be found on GitHub.

1 Answer

dynamicbutter:

Resolved. I just realized that the test only exceeds 3.4 GB/s when I turn off file caching with F_NOCACHE. This makes sense: I'm turning caching off while a large part of the 32 GB test file is still sitting in memory, and it stays there because nothing is overwriting the file cache anymore. Subsequent tests are faster because a big chunk of their data is already cached in RAM.
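
For anyone hitting the same issue: on macOS, per-file caching can be turned off with fcntl(fd, F_NOCACHE, 1) on the file descriptor. Below is a rough sketch of how that might be invoked from C# via P/Invoke. The simplified fcntl signature and the F_NOCACHE value (48, taken from the macOS headers) are assumptions, and the variadic calling convention can differ on Apple Silicon, so treat this as illustrative rather than a drop-in helper. Note that setting the flag does not evict data that is already cached, which is exactly why my earlier runs stayed fast.

using System;
using System.IO;
using System.Runtime.InteropServices;

static class NoCache
{
    // Simplified P/Invoke for the variadic libc fcntl(); assumed adequate
    // for commands that take a single int argument on Intel Macs.
    [DllImport("libc", SetLastError = true)]
    private static extern int fcntl(int fd, int cmd, int arg);

    // F_NOCACHE from <sys/fcntl.h>; the value 48 is assumed here.
    private const int F_NOCACHE = 48;

    // Ask the kernel not to cache reads/writes on this stream's descriptor.
    public static void Disable(FileStream stream)
    {
        int fd = (int)stream.SafeFileHandle.DangerousGetHandle();
        if (fcntl(fd, F_NOCACHE, 1) == -1)
            throw new IOException($"fcntl(F_NOCACHE) failed, errno={Marshal.GetLastWin32Error()}");
    }
}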