I'm a student and I'm pretty new to C++ and security. I've been given an assignment about checking for a signature/magic number in a file, and I'm having a little trouble speeding up the read time.
My idea is to read the file in binary mode using ifstream, store its data in a vector, then translate it to a hexadecimal string. Finally, I'll check whether the given signature exists in the hex string.
In theory this works fine, except that the whole process of allocating the vector's memory, reading the file and converting its data takes ages. The reading part alone takes 44ms.
How can I improve this? Here is my code:
UINT CheckForSignature(CString source, CString dest_path) {
    // source is the hex string to find in the file, dest_path is the path of the file to check
    ifstream file(dest_path, ios::binary);
    if (file.is_open()) {
        // check the size of the file
        file.seekg(0, ios::end);
        int iFileSize = file.tellg();
        // if the file size exceeds 50MB, skip it
        if (iFileSize > 50000000) {
            // return -1, meaning the file exceeds 50MB and does not need to be checked
            return -1;
        }
        // read the file and store its data in a hex string
        file.seekg(0, ios::beg);
        vector<char> memblock(iFileSize);               // 18ms to allocate memory
        file.read(((char*)memblock.data()), iFileSize); // 44ms to read the file
        ostringstream ostrData;
        // that adds up to a total of 62ms
        // if I also count the time needed to translate the whole memblock
        // then this will be long as hell
        // need to improve this
        for (int i = 0; i < memblock.size(); i++) {
            int z = memblock[i] & 0xff;
            ostrData << hex << setfill('0') << setw(2) << z;
        }
        string strDataHex = ostrData.str();
        string strHexSource = (CT2A)source;
        if (strDataHex.find(strHexSource) != string::npos) {
            // return 1, meaning the signature exists in the file
            return 1;
        }
        else {
            // return 0, meaning the signature is not in the file
            return 0;
        }
    }
}
I'm open to all help and suggestions about solutions and code improvement. Thank you very much!
There are much more performant ways to read and examine file content.
Here is one naive, simple way, just as an example.
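For illustration, a minimal sketch of what a buffered approach could look like, not necessarily the code that was timed in this answer. It assumes the signature arrives as a hex string (as in the question), decodes that short string into bytes once, and then scans the file in fixed-size chunks instead of converting the whole file to hex. The names `hexToBytes`, `fileContains` and the `chunkSize` parameter are hypothetical, and the return convention (1 found, 0 not found, -1 on error) only loosely mirrors the question's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: decode a hex string such as "4d5a" into raw bytes.
// Returns false if the length is odd or a non-hex character is found.
bool hexToBytes(const std::string& hex, std::vector<unsigned char>& out) {
    if (hex.size() % 2 != 0) return false;
    auto nibble = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    };
    out.clear();
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int hi = nibble(hex[i]);
        const int lo = nibble(hex[i + 1]);
        if (hi < 0 || lo < 0) return false;
        out.push_back(static_cast<unsigned char>(hi * 16 + lo));
    }
    return true;
}

// Hypothetical helper: look for `needle` in the file, reading chunkSize bytes
// at a time. Returns 1 if found, 0 if not found, and -1 if the file cannot be
// opened or the needle is empty.
int fileContains(const std::string& path,
                 const std::vector<unsigned char>& needle,
                 std::size_t chunkSize = 1 << 16) {
    assert(chunkSize > 0);
    if (needle.empty()) return -1;
    std::ifstream file(path, std::ios::binary);
    if (!file) return -1;

    // Keep the last needle.size() - 1 bytes of each chunk so that a match
    // straddling two chunks is still seen.
    const std::size_t overlap = needle.size() - 1;
    std::vector<unsigned char> buf(overlap + chunkSize);
    std::size_t carried = 0;  // bytes carried over from the previous chunk

    while (true) {
        file.read(reinterpret_cast<char*>(buf.data() + carried),
                  static_cast<std::streamsize>(chunkSize));
        const std::size_t got = static_cast<std::size_t>(file.gcount());
        if (got == 0) break;  // end of file (or read error)

        const std::size_t valid = carried + got;
        const auto end = buf.begin() + static_cast<std::ptrdiff_t>(valid);
        if (std::search(buf.begin(), end, needle.begin(), needle.end()) != end)
            return 1;

        // Move the tail of this chunk to the front for the next round.
        carried = std::min(overlap, valid);
        if (carried > 0)
            std::memmove(buf.data(), buf.data() + (valid - carried), carried);
    }
    return 0;
}
```

A call might look like `std::vector<unsigned char> sig; if (hexToBytes("4d5a", sig)) result = fileContains("some_file.bin", sig);`, where the file name is a placeholder. Searching raw bytes avoids building a hex string twice the size of the file, and the fixed buffer keeps memory use independent of the file size.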
I created a 51 MB file with "0000" at the end (and removed the size limit):
(Showing the last two lines.)
Running your code (20 runs):
Running my example (20 runs):
With a 5 MB file:
Yours:
Example:
Script to compile and run (you can try several buffer sizes for my example):
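Buffer sizes can also be compared from inside C++ rather than from a script. The following is only a rough sketch under assumptions: `readInChunks`, `averageMs` and `compareBufferSizes` are hypothetical names, and the chunked reader is a stand-in for whichever scan is actually being measured. Results depend heavily on the disk, the OS file cache and the compiler flags, so repeated runs are averaged.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the scan being measured: reads the whole file in
// chunks of bufSize bytes and returns the number of bytes read.
// Returns 0 if the file cannot be opened or bufSize is 0.
std::size_t readInChunks(const std::string& path, std::size_t bufSize) {
    if (bufSize == 0) return 0;
    std::ifstream file(path, std::ios::binary);
    std::vector<char> buf(bufSize);
    std::size_t total = 0;
    while (file) {
        file.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        total += static_cast<std::size_t>(file.gcount());
    }
    return total;
}

// Average wall-clock time in milliseconds over `runs` calls of fn().
template <typename Fn>
double averageMs(Fn fn, int runs) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < runs; ++i) fn();
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / runs;
}

// Print the average time per run for a few candidate buffer sizes.
void compareBufferSizes(const std::string& path, int runs) {
    const std::size_t sizes[] = {4096, 65536, 1048576};
    for (std::size_t size : sizes) {
        const double ms = averageMs([&] { readInChunks(path, size); }, runs);
        std::cout << size << " bytes: " << ms << " ms per run\n";
    }
}
```

Calling `compareBufferSizes("some_file.bin", 20)` (the file name is a placeholder) would mirror the 20-run comparison above. `std::chrono::steady_clock` is used because it is monotonic, which makes it better suited to measuring intervals than the system clock.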