How to calculate SHA-256 checksum of S3 file content in memory

I'm looking to compute the sha256 checksum of an image file located in S3 without writing it to disk.

I have tried the code below:

  const data = (
    await getS3().getObject({ Bucket: bucketName, Key: fileName }).promise()
  ).Body.toString('utf-8');

  logger.info(`File data: ${data}`);
  const sha256 = createHash('sha256');
  sha256.update(data);
  const hex = sha256.digest('hex');
  logger.info(`SHA256 HEX: ${hex}`);

The value of hex matches the output of browser-based tools (e.g. https://emn178.github.io/online-tools/sha256_checksum.html) when I use a basic .txt file, but when I use an image file (.png), I get a different value. Any ideas what I could be doing wrong?

I also tried using

const data = (
    await getS3().getObject({ Bucket: bucketName, Key: fileName }).promise()
  ).Body.toString('binary');

but the SHA-256 is still different.

I think maybe I need to use S3.getObject(params).createReadStream() but I don't know why that would make a difference.

There is 1 best solution below

Cole Omni On

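I eventually solved this by using a read stream instead of the promise version of getObject. The stream emits raw Buffer chunks, so the hash is computed over the file's exact bytes. Hashing a string produced by Body.toString() instead hashes the UTF-8 re-encoding of that string rather than the original bytes, which is harmless for plain ASCII text but corrupts binary data such as a PNG: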

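  import { createHash } from 'crypto';

  // Wrapper name is illustrative; S3 is an AWS SDK v2 S3 client
  // (the question obtains one via getS3()).
  function sha256OfS3Object(bucketName, fileName) {
    return new Promise((resolve, reject) => {
      const hash = createHash('sha256');
      const stream = S3.getObject({ Bucket: bucketName, Key: fileName }).createReadStream();

      // Each chunk is a Buffer, so the hash sees the bytes unmodified.
      stream.on('data', (d) => hash.update(d));
      stream.on('end', () => {
        const digest = hash.digest('hex');
        logger.debug(`SHA256: [${digest}]`);
        resolve(digest);
      });
      stream.on('error', reject);
    });
  }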
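
For anyone wondering why the stream makes a difference: as far as I can tell it is not the stream itself but the fact that the data never gets converted to a string. Here is a minimal sketch that reproduces the mismatch without S3. The `sha256Hex` helper and the `pngBody` / `textBody` buffers are my own stand-ins for the `Body` returned by getObject, not anything from the SDK.

```javascript
const { createHash } = require('crypto');

// Hypothetical helper: hashes whatever it is given (Buffer or string).
// When given a string, Hash.update() encodes it as UTF-8 first.
function sha256Hex(data) {
  return createHash('sha256').update(data).digest('hex');
}

// Stand-in for a binary Body: the 8-byte PNG file signature.
const pngBody = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
// Stand-in for a plain-text Body.
const textBody = Buffer.from('hello', 'utf-8');

// ASCII text survives the Buffer -> string -> UTF-8 round trip unchanged.
const textMatches = sha256Hex(textBody) === sha256Hex(textBody.toString('utf-8'));

// Binary data does not: 0x89 is not valid UTF-8, and as a 'binary' (latin1)
// character it is re-encoded as two bytes when the string is hashed.
const pngUtf8Matches = sha256Hex(pngBody) === sha256Hex(pngBody.toString('utf-8'));
const pngBinaryMatches = sha256Hex(pngBody) === sha256Hex(pngBody.toString('binary'));

console.log({ textMatches, pngUtf8Matches, pngBinaryMatches });
```

So, assuming Body is a Buffer (which I believe is the case for the v2 SDK under Node.js), passing Body straight to sha256.update() without calling toString() should also give the correct checksum. The stream version has the advantage of not holding the whole object in memory at once.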