I have audio data and I am not sure what the best way is to store it as a matrix. I have 4 large files of recordings from acoustic sensors; each file contains the data of 4 interleaved channels. I am using Qt C++ to process this data. My current approach uses a QVector of QVectors to store the data:
QVector<QVector<int>> buffer(16); // 4 * 4 : numberOfChannels * numberOfFiles
for (int i = 0; i < 4; i++) {
    QFile file(fileList[i]); // fileList is a QList of QStrings containing the 4 file paths
    if (file.open(QIODevice::ReadOnly)) {
        int k = 0; // running sample index within this file
        while (!file.atEnd()) {
            QByteArray sample = file.read(depth / 8); // depth is 24 here, so 3 bytes per sample
            int integerSample = convertByteArrayToIntFunction(sample);
            buffer[4 * i + (k % 4)].append(integerSample); // column = file offset + channel
            k++;
        }
    }
}
The end result is a matrix of 16 columns like the one below (f: file, c: channel):
f1c0 | f1c1 | f1c2 | f1c3 | f2c0 | f2c1 | ... | f4c2 | f4c3
But this approach takes ages for large files of a few gigabytes. I am wondering whether there is a more efficient way to do this and save a lot of time. I have read that I can split the reading into chunks, but it is still not clear to me how. Thanks in advance.
There are two obvious antipatterns in your code.
The first one is not pre-sizing your QVectors. This means that every so often a call to append will notice that the vector's storage is full, which triggers an allocation of memory for a larger vector and then a copy of the vector's contents before the append can complete. You know in advance how many samples are in each file, so you can use QVector::reserve to allocate the right amount up front and avoid this behavior.
Secondly, you are calling file.read() for every sample. This means you are repeatedly paying the cost of retrieving data (although buffering will alleviate this a bit) and that of allocating a QByteArray. Instead, read a huge chunk of the file at once and then loop over that.
You can play around with the chunk size, for example starting from 1'000'000, to see if there is a better value, and you can probably gain a few percent more performance by passing convertByteArrayToIntFunction a const char *, but more readable is probably better.
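Both changes can be sketched together roughly as follows. This is a minimal sketch in standard C++ rather than Qt so that it stays self-contained: std::vector::reserve and std::ifstream::read play the roles of QVector::reserve and QFile::read. The names decodeSample24 and readInterleavedFile are hypothetical, the decoder assumes signed little-endian 24-bit samples (the question does not state the byte order), and the chunk size is counted in frames of 4 samples, which is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for convertByteArrayToIntFunction, taking a
// const char * instead of a byte array. ASSUMPTION: samples are signed,
// 24 bits wide, and stored in little-endian byte order.
inline std::int32_t decodeSample24(const char *p)
{
    const std::int32_t b0 = static_cast<unsigned char>(p[0]);
    const std::int32_t b1 = static_cast<unsigned char>(p[1]);
    const std::int32_t b2 = static_cast<unsigned char>(p[2]);
    std::int32_t v = b0 | (b1 << 8) | (b2 << 16);
    if (v & 0x800000)
        v -= 0x1000000; // sign-extend from 24 to 32 bits
    return v;
}

// Reads one interleaved file and appends its samples to
// columns[firstColumn] .. columns[firstColumn + channels - 1].
// Trailing bytes that do not form a complete frame are ignored.
bool readInterleavedFile(const std::string &path,
                         std::vector<std::vector<std::int32_t>> &columns,
                         std::size_t firstColumn,
                         std::size_t channels = 4,
                         std::size_t framesPerChunk = 1'000'000)
{
    constexpr std::size_t bytesPerSample = 3; // depth / 8 with depth == 24
    const std::size_t frameBytes = bytesPerSample * channels;

    // Open at the end so the file size is known before reading.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        return false;
    const std::size_t fileBytes = static_cast<std::size_t>(in.tellg());
    in.seekg(0);

    // Fix 1: reserve the full capacity once instead of growing on append.
    const std::size_t totalFrames = fileBytes / frameBytes;
    for (std::size_t c = 0; c < channels; ++c) {
        auto &column = columns[firstColumn + c];
        column.reserve(column.size() + totalFrames);
    }

    // Fix 2: read many frames per call and decode them from memory.
    std::vector<char> chunk(std::min(framesPerChunk, totalFrames) * frameBytes);
    std::size_t remaining = totalFrames;
    while (remaining > 0) {
        const std::size_t frames = std::min(remaining, framesPerChunk);
        const std::size_t wanted = frames * frameBytes;
        in.read(chunk.data(), static_cast<std::streamsize>(wanted));
        if (static_cast<std::size_t>(in.gcount()) != wanted)
            return false; // short read
        const char *p = chunk.data();
        for (std::size_t f = 0; f < frames; ++f) {
            for (std::size_t c = 0; c < channels; ++c) {
                columns[firstColumn + c].push_back(decodeSample24(p));
                p += bytesPerSample;
            }
        }
        remaining -= frames;
    }
    return true;
}
```

For the four files in the question, this would be called once per file with firstColumn set to 4 * i on a vector of 16 columns. The inner loop only advances a pointer through memory that is already loaded, so the per-sample cost of a read call and a temporary byte array goes away.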