Why is the Protobuf blob heavier than the JSON equivalent?


I'm trying to use protobuf to speed up data transfers between my front end and back end.

As a POC, I tried to load a JSON file, turn it into a protobuf buffer, and save the result in a new file.

But it turns out that the new file is heavier than the JSON one. Did I do something wrong?

Here are my files:

// input.proto

syntax = "proto3";

message MyData {
    repeated float a = 1;
    repeated float b = 2;
    repeated float c = 3;
    float d = 4;
    repeated float e = 5;
}

// index.mjs

import protobuf from 'protobufjs';
import fs from 'fs';

protobuf.load('./input.proto', (err, root) => {
    // Fail early if the .proto file could not be loaded or parsed
    if (err)
        throw err;

    const payload = JSON.parse(fs.readFileSync('./input.json', {encoding: 'utf8'}));

    var Message = root.lookupType("MyData");

    var errMsg = Message.verify(payload);
    if (errMsg)
        throw Error(errMsg);

    var message = Message.create(payload);
    const buffer = Message.encode(message).finish();

    fs.writeFileSync('./output.pb', buffer, 'binary');
});

// input.json
{
  "a": [1, 2.4, 3, 4],
  "b": [1, 2, 3, 4],
  "c": [1, 2, 3.2, 4],
  "d": 10.321,
  "e": [1, 2, 3.7, 4]
}

(My actual JSON is much bigger than that, but it follows the same format as this one.)


And finally:

$ du -h input.json output.pb
2,0M    input.json
2,5M    output.pb

Thanks for your help!


eik (best answer)

One reason could be that a float in Protocol Buffers is encoded as I32 (a fixed 32-bit value), so every number needs 4 bytes no matter how small it is. In JSON (UTF-8), a single-digit number takes 3 bytes (space, digit, and comma). You can also omit the space, making the JSON even more compact.
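
To make that arithmetic concrete, here is a rough size estimate for the sample payload from the question. It is only a sketch: it assumes the repeated float fields are written packed (the proto3 default), that every field number is below 16 so the tag fits in one byte, and that the singular float is actually written. The helper names (`varintSize`, `packedFloatFieldSize`, `estimateSizes`) are made up for illustration and are not part of protobufjs.

```javascript
// Bytes needed to encode a non-negative integer as a protobuf varint.
function varintSize(n) {
    let size = 1;
    while (n >= 128) {
        n = Math.floor(n / 128);
        size += 1;
    }
    return size;
}

// Assumed layout of a packed `repeated float` field (field number 1..15):
// 1 tag byte + varint length prefix + 4 bytes per element.
function packedFloatFieldSize(count) {
    if (count === 0) return 0; // an empty repeated field is not written
    const payloadBytes = 4 * count;
    return 1 + varintSize(payloadBytes) + payloadBytes;
}

// Assumed layout of a singular `float` field that is written:
// 1 tag byte + 4 bytes for the value.
function floatFieldSize() {
    return 1 + 4;
}

// Compare the estimated protobuf size with compact (no whitespace) JSON.
function estimateSizes(data) {
    let protobufBytes = 0;
    for (const value of Object.values(data)) {
        protobufBytes += Array.isArray(value)
            ? packedFloatFieldSize(value.length)
            : floatFieldSize();
    }
    const jsonBytes = Buffer.byteLength(JSON.stringify(data), 'utf8');
    return { protobufBytes, jsonBytes };
}

const sample = {
    a: [1, 2.4, 3, 4],
    b: [1, 2, 3, 4],
    c: [1, 2, 3.2, 4],
    d: 10.321,
    e: [1, 2, 3.7, 4],
};

console.log(estimateSizes(sample));
```

For this sample the estimate comes out at 77 bytes for protobuf against 74 bytes for compact JSON, so data made mostly of short numbers can indeed end up larger as 4-byte floats.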

Zorzi

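I ended up using Node.js Buffers to shrink my numbers to 2 bytes each. Each value is multiplied by 10 and stored as an unsigned 16-bit integer, so only one decimal place is kept. Here's my solution: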

syntax = "proto3";

message MyData {
    bytes a = 1;
    bytes b = 2;
    bytes c = 3;
    bytes d = 4;
    bytes e = 5;
}

import protobuf from 'protobufjs';
import fs from 'fs';

protobuf.load('./input.proto', (err, root) => {
    // Fail early if the .proto file could not be loaded or parsed
    if (err)
        throw err;

    const payload = JSON.parse(fs.readFileSync('./input.json', {encoding: 'utf8'}));

    var Message = root.lookupType("MyData");

    const formattedPayload = {};

    // Packs a list of numbers into a buffer, 2 bytes per element.
    const encodeBuffer = (values) => {
        const buff = Buffer.alloc(2 * values.length);
        values.forEach((num, idx) => {
            // Store each number as an unsigned 16-bit integer scaled by 10,
            // which keeps one decimal place; divide by 10 when decoding.
            // Rounding avoids silently truncating values such as 10.321 * 10.
            // Values must lie in [0, 6553.5], otherwise writeUInt16BE throws.
            buff.writeUInt16BE(Math.round(num * 10), idx * 2);
        });
        return buff;
    };

    for (const key of ['a', 'b', 'c', 'e'])
        formattedPayload[key] = encodeBuffer(payload[key]);

    // d is a single number, so it is packed as a one-element list.
    formattedPayload.d = encodeBuffer([payload.d]);

    var errMsg = Message.verify(formattedPayload);
    if (errMsg)
        throw Error(errMsg);

    const buffer = Message.encode(formattedPayload).finish();

    fs.writeFileSync('./output.pb', buffer, 'binary');
});
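
The decoding side is not shown above, so here is a minimal sketch of what the "divide by 10" step could look like. The names `SCALE`, `encodeScaled` and `decodeScaled` are hypothetical, and the encoder is repeated only so the round trip can be run on its own. It rounds before writing, which is an assumption added here rather than something the original code does.

```javascript
const SCALE = 10; // must match the factor used when encoding

// Pack numbers as big-endian unsigned 16-bit integers scaled by SCALE.
function encodeScaled(values) {
    const buf = Buffer.alloc(2 * values.length);
    values.forEach((num, idx) => {
        buf.writeUInt16BE(Math.round(num * SCALE), idx * 2);
    });
    return buf;
}

// Reverse of encodeScaled: read 2 bytes at a time and divide by SCALE.
function decodeScaled(bytes) {
    // A decoded bytes field may be a plain Uint8Array, so wrap it
    // in a Buffer view (no copy) before reading.
    const buf = Buffer.isBuffer(bytes)
        ? bytes
        : Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    const values = [];
    for (let offset = 0; offset + 2 <= buf.length; offset += 2) {
        values.push(buf.readUInt16BE(offset) / SCALE);
    }
    return values;
}

const roundTrip = decodeScaled(encodeScaled([1, 2.4, 3, 4]));
console.log(roundTrip);

// Anything beyond one decimal place is lost: 10.321 is stored as 103.
const lossy = decodeScaled(encodeScaled([10.321]));
console.log(lossy);
```

Keep in mind the trade-off: this halves the size compared with 4-byte floats, but it only works for values between 0 and 6553.5 and drops everything after the first decimal.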