I want to implement a custom InputStream that calculates hashes (MD5, SHA-1, etc.) and is then wrapped in another InputStream that encrypts the data. The goal is to retrieve the encrypted data later and throw an exception if the hashes no longer match.
The use case is pretty simple:
upload file -> stream contents -> calculate hashes -> encrypt -> save the encrypted data
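To make the intended pipeline concrete, here is a rough sketch of it built only from JDK classes (java.security.DigestInputStream wrapped in javax.crypto.CipherInputStream). The class name UploadPipelineSketch, the choice of SHA-1 with AES/CTR/NoPadding, the in-memory "storage" and the key/IV handling are all assumptions for illustration, not my real setup:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: hash the plaintext while it streams through, then encrypt it.
final class UploadPipelineSketch {

    /** Holds the SHA-1 of the plaintext and the encrypted bytes that would be stored. */
    static final class Result {
        final byte[] sha1OfPlaintext;
        final byte[] encrypted;

        Result(byte[] sha1OfPlaintext, byte[] encrypted) {
            this.sha1OfPlaintext = sha1OfPlaintext;
            this.encrypted = encrypted;
        }
    }

    private static Cipher aesCtr(int mode, byte[] key, byte[] iv) throws GeneralSecurityException {
        // AES/CTR is an assumption; key and iv are expected to be 16 bytes each here.
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher;
    }

    static Result hashThenEncrypt(InputStream upload, byte[] key, byte[] iv)
            throws IOException, GeneralSecurityException {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        Cipher cipher = aesCtr(Cipher.ENCRYPT_MODE, key, iv);
        // The in-memory buffer stands in for the real storage backend.
        ByteArrayOutputStream stored = new ByteArrayOutputStream();
        try (InputStream hashed = new DigestInputStream(upload, sha1);
             InputStream encrypted = new CipherInputStream(hashed, cipher)) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = encrypted.read(chunk)) != -1) {
                stored.write(chunk, 0, n);
            }
        }
        // The digest is complete only after the stream has been read to the end.
        return new Result(sha1.digest(), stored.toByteArray());
    }

    /** Decrypts and throws if the plaintext no longer matches the stored hash. */
    static byte[] decryptAndVerify(byte[] encrypted, byte[] key, byte[] iv, byte[] expectedSha1)
            throws IOException, GeneralSecurityException {
        byte[] plain = aesCtr(Cipher.DECRYPT_MODE, key, iv).doFinal(encrypted);
        byte[] actual = MessageDigest.getInstance("SHA-1").digest(plain);
        if (!MessageDigest.isEqual(actual, expectedSha1)) {
            throw new IOException("Hash mismatch: stored data was modified or corrupted");
        }
        return plain;
    }
}
```

This version decrypts everything in memory before verifying, which is only meant to show the order of the steps; a real implementation would presumably stream the download as well.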
After some digging, I found this simple implementation, which also uses an internal buffer (this seems to be required to get reasonable performance). My first attempt was to do everything in a single InputStream implementation, but I failed horribly.
So instead I am just calculating the hashes on the fly for now, but even that does not work properly. This is my current implementation:
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.util.List;

public class HashingInputStream extends InputStream {

    private static final int BUFFER_LENGTH = 16;

    private final List<MessageDigest> digests;
    private final InputStream inputStream;
    private final byte[] buffer = new byte[BUFFER_LENGTH];
    private int writeIndex, readIndex;
    private boolean eof = false;

    public HashingInputStream(InputStream inputStream, List<MessageDigest> digests) {
        this.inputStream = inputStream;
        this.digests = digests;
    }

    @Override
    public int read() throws IOException {
        if (eof)
            return -1; // why??
        if (readIndex == writeIndex) {
            if (writeIndex == buffer.length) {
                writeIndex = readIndex = 0;
            }
            // read bytes into buffer
            int bytesRead = 0;
            while (bytesRead == 0) {
                bytesRead = readBytesIntoBuffer();
            }
            // if no more data could be read in, return -1 and mark stream as finished
            if (bytesRead == -1) {
                eof = true;
                return -1;
            } else {
                // update hashes
                for (MessageDigest digest : digests) {
                    digest.update(buffer, 0, bytesRead);
                }
            }
        }
        return 255 & buffer[readIndex++];
    }

    private int readBytesIntoBuffer() throws IOException {
        int bytesRead = inputStream.read(buffer, writeIndex, buffer.length - writeIndex);
        writeIndex += bytesRead;
        return bytesRead;
    }

    @Override
    public void close() throws IOException {
        inputStream.close();
    }
}
When bytesRead is -1 and the underlying stream is therefore finished, this stream should be finished as well, but read() is somehow executed once more, which is why I needed the eof flag. As if that were not annoying enough, the calculated hashes differ as well.
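Two details in my code might be related, though I have not confirmed that they are the only causes. InputStream.read(byte[], int, int) returns -1 at end of stream, so `writeIndex += bytesRead` moves the index backwards at EOF, and a caller that invokes read() again after the first -1 would then get stale buffer bytes. Also, `digest.update(buffer, 0, bytesRead)` always starts at offset 0, although the bytes were written at writeIndex. For comparison, here is a minimal sketch without an internal buffer, based on java.io.FilterInputStream; the class name MultiDigestInputStream is made up for illustration:

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.util.List;

// Hypothetical sketch: feeds every byte that is actually read into all digests.
final class MultiDigestInputStream extends FilterInputStream {

    private final List<MessageDigest> digests;

    MultiDigestInputStream(InputStream in, List<MessageDigest> digests) {
        super(in);
        this.digests = digests;
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        // -1 means end of stream; it is passed through and never hashed.
        if (b != -1) {
            for (MessageDigest digest : digests) {
                digest.update((byte) b);
            }
        }
        return b;
    }

    @Override
    public int read(byte[] target, int off, int len) throws IOException {
        int n = in.read(target, off, len);
        // Hash exactly the region that was filled: same offset, same length.
        if (n > 0) {
            for (MessageDigest digest : digests) {
                digest.update(target, off, n);
            }
        }
        return n;
    }
}
```

Because the bulk read is overridden, callers that read in chunks never fall back to the byte-by-byte path, so no extra buffer is needed for performance. The sketch does not handle skip() or mark()/reset(); bytes skipped that way would not be hashed correctly.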
The calling side is straightforward:
MessageDigest md5 = DigestUtils.getMd5Digest();
MessageDigest sha1 = DigestUtils.getSha1Digest();
try (HashingInputStream uploadInputStream = new HashingInputStream(file.getInputStream(), List.of(md5, sha1))) {
    storage.persist(uploadInputStream, upload.getId());
    upload.setMd5Hash(DigestUtils.md5Hex(md5.digest()));
    upload.setSha1Hash(DigestUtils.sha1Hex(sha1.digest()));
}
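One thing I am unsure about on the calling side: as far as I understand the Commons Codec Javadoc, DigestUtils.md5Hex(byte[]) computes the MD5 digest of its argument and returns that as hex, so passing md5.digest() to it would hash the hash instead of just hex-encoding it. The following JDK-only sketch shows the difference; DigestHexSketch and its methods are hypothetical names, and I assume Commons Codec's Hex.encodeHexString(byte[]) would do the same job as toHex here:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: hex-encoding a finished digest vs. hashing it a second time.
final class DigestHexSketch {

    /** Encodes bytes as lowercase hex without hashing them again. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    /** MD5 of the data, hex-encoded: what I actually want to store. */
    static String md5HexOfData(byte[] data) throws NoSuchAlgorithmException {
        return toHex(MessageDigest.getInstance("MD5").digest(data));
    }

    /** MD5 of the MD5 digest, hex-encoded: what hashing the digest again yields. */
    static String md5HexOfDigest(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] firstDigest = md5.digest(data); // digest() also resets the instance
        return toHex(md5.digest(firstDigest));
    }
}
```

For any input, md5HexOfData and md5HexOfDigest should return different strings, which would be one way for correct streaming code to still end up with unexpected hash values.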
Thanks for helping out! Is there any chance I completely misunderstood something?