Java InputStream: copy to a file and calculate a hash at the same time


Here are my two code snippets:

public class Uploader {

  private static final String SHA_256 = "SHA-256";

  public String getFileSHA2Checksum(InputStream fis) throws IOException {
    try {
      MessageDigest sha256Digest = MessageDigest.getInstance(SHA_256);
      return getFileChecksum(sha256Digest, fis);
    } catch (NoSuchAlgorithmException e) {
      return "KO";
    }
  }

  public void transferTo(InputStream fis) throws IOException {
    FileUtils.copyInputStreamToFile(fis, file2);
  }
}

My code uses this class as:

Is it possible to copy the stream to a file and calculate the checksum at the same time, in a single pass while the InputStream is open?

Mark Rotteveel On BEST ANSWER

You can use the DigestInputStream to calculate a hash while reading from a stream. That is, you wrap the original input stream with a DigestInputStream and read through the DigestInputStream. While reading the data, the message digest is automatically updated, and you can retrieve the digest after you read the entire stream.

Alternatively, you can use DigestOutputStream to calculate a hash while writing to a stream. In a similar vein, you wrap the destination output stream with a DigestOutputStream and write through the DigestOutputStream.

A quick and dirty example:

var inputFile = Path.of("D:\\Development\\data\\testdata-csv\\customers-1000.csv");
var outputFile = Files.createTempFile("testoutput", ".dat");
var md = MessageDigest.getInstance("SHA-256");
try (var in = new DigestInputStream(Files.newInputStream(inputFile), md);
     var out = Files.newOutputStream(outputFile)) {
    in.transferTo(out);
} finally {
    Files.deleteIfExists(outputFile);
}

System.out.println(HexFormat.of().formatHex(md.digest()));
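The DigestOutputStream alternative mentioned above can be sketched the same way. This is only an illustration, not part of the original answer: the class name DigestOutputExample and the helper copyAndHash are hypothetical, and it assumes Java 9+ for InputStream.transferTo.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestOutputExample {

    // Hypothetical helper: copies `in` to `target` and returns the raw
    // SHA-256 digest of the bytes that were written.
    static byte[] copyAndHash(InputStream in, Path target) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("Expected SHA-256 to be supported", e);
        }
        // Every byte written through the wrapper also updates the digest
        try (OutputStream out = new DigestOutputStream(Files.newOutputStream(target), md)) {
            in.transferTo(out);
        }
        return md.digest();
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("digest-output", ".dat");
        try {
            InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
            byte[] digest = copyAndHash(in, target);
            System.out.println(digest.length + " digest bytes, "
                    + Files.size(target) + " bytes copied");
        } finally {
            Files.deleteIfExists(target);
        }
    }
}
```

Hashing on the output side can be convenient when the code that reads the input is not under your control but the destination stream is.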

In terms of your existing code, you could do something like:

public String transferAndHash(InputStream in) throws IOException {
    try {
        var md = MessageDigest.getInstance("SHA-256");
        try (var digestIn = new DigestInputStream(in, md)) {
            transferTo(digestIn);
        }
        return HexFormat.of().formatHex(md.digest());
    } catch (NoSuchAlgorithmException e) {
        // all recent Java versions are required to support SHA-256
        throw new AssertionError("Expected SHA-256 to be supported", e);
    }
}

(NOTE: HexFormat was introduced in Java 17; if you're using an earlier version, you'll need an alternative.)
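For versions before Java 17, one possible alternative is a small hand-written hex encoder. The sketch below is an assumption about what such a replacement might look like, not something from the original answer; the class name HexFallback and the method toHex are hypothetical. It produces lowercase output, matching the default of HexFormat.of().

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class HexFallback {

    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    // Hypothetical helper: encodes each byte as two lowercase hex characters.
    static String toHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;               // treat the byte as unsigned
            out[2 * i] = HEX_DIGITS[v >>> 4];      // high nibble
            out[2 * i + 1] = HEX_DIGITS[v & 0x0F]; // low nibble
        }
        return new String(out);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(toHex(digest));
    }
}
```

If Apache Commons Codec is already on the classpath, Hex.encodeHexString(byte[]) does the same job without custom code.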

Keijack On

If you want to read the input stream only once, you can manually read the bytes from the input stream and write them to your file yourself, updating the digest with each chunk as you go.

    private static final String SHA_256 = "SHA-256";
    private static final int BUFFER = 1024;

    public String writeToFile(InputStream fis, File outputFile) throws IOException {
        try (var fout = new FileOutputStream(outputFile); fis) {
            MessageDigest sha256Digest = MessageDigest.getInstance(SHA_256);
            var buffer = new byte[BUFFER];
            int bytesRead;
            // Read each chunk once, then feed it to both the digest and the file
            while ((bytesRead = fis.read(buffer)) >= 0) {
                sha256Digest.update(buffer, 0, bytesRead);
                fout.write(buffer, 0, bytesRead);
            }
            var checksum = sha256Digest.digest();
            // Hex comes from Apache Commons Codec (org.apache.commons.codec.binary.Hex)
            return Hex.encodeHexString(checksum);
        } catch (NoSuchAlgorithmException e) {
            return "KO";
        }
    }
Miss Chanandler Bong On

Here is an example using DigestInputStream based on Mark Rotteveel's suggestion (which I liked):

Path input = Paths.get("input_file");
Path output = Paths.get("output_file");
MessageDigest algorithm = MessageDigest.getInstance("SHA-256");
try (InputStream is = Files.newInputStream(input);
     DigestInputStream hashingStream = new DigestInputStream(is, algorithm)) {
    Files.copy(hashingStream, output);
}
byte[] digest = algorithm.digest();
// this line uses Apache Commons Codec to show the hex representation of the byte[]
String hash = Hex.encodeHexString(digest);