How to decode a binary file containing an array of floats?


encoded_floats_hex = [ "35F1AC3D", "610BB63D", "8C25BF3D", "B73FC83D", "E259D13D", "0E74DA3D", "B5AB5C00", "D3972400", "6D9B2400", "B5A82400", "4A75E003", "ED20E403", "11D7F803", "0F7EEF03", "BAD41500", "56341200", "6B5C1200", "BE753200", "E94B3E00", "EA4D3600", "BE372E00", "6BDC1500", "56FC1200", "B9D41300", "E9CC7200", "D6347E00", "BB547600", "6E7C6E00", "5A541600", "B94B1200", "EECD1300", "DBD73200", "B7FD3D00", "6ADC3600", "59342E00", "B64C1600", "EA541200", "DF741200", "9281F000", "FC39F600", "5B0EFC00", "3254F600", "FB7B0B00", "EDEE0E00", "D71D0900", "3D2A1900", "2C261900", "573A1B00", "7D2A1700", "6B1E0B00", "55EA0E00", "3C660900", "2BE63900", "D56B3900", "FF1E3B00", "D52D3700", "D43A0B00", "3F260F00", "552A0900", "6B1A1900", "7D6E1900", "54FA1A00", "2C661700", "3DE60B00", "D71B0F00", "ED2E0900", "FB3D7900", "D42A7900", "2D267900", "3B1A7700", "55EA0B00", "6F7E0F00", "74EA0A00", "55661900", "2F261900", "352A1900", "DB3E1700", "EC2D0B00", "F51A0F00", "DBE70A00", "2D6A3900", "37FA3900", "5C6E3900", "6C1A3700", "77260B00", "5D260F00", "2B3A0B00", "352E1900", "DC1D1900", "EB6A1900", "F5E71600", "DF7A0B00", "35EA0F00", "341E0B00" ]

This array contains hex strings, each encoding a 4-byte little-endian float. The first 6 values decode as expected, but after those the decoded floats are unexpected:

import struct

# Convert the hex strings to bytes
encoded_bytes = [bytes.fromhex(x) for x in encoded_floats_hex]
# Convert each 4-byte sequence into a float (native byte order)
float_values = [struct.unpack('f', x)[0] for x in encoded_bytes]
float_values

[0.08444444090127945,
 0.08888889104127884,
 0.09333333373069763,
 0.09777777642011642,
 0.10222221910953522,
 0.1066666692495346,
 8.510462523131517e-39,
 3.360542129100596e-39,
 3.361834126284704e-39,
 3.366598541063408e-39,
 1.3192464977331103e-36,...]

It seems to be encoded by some custom delta encoding.

On the other hand, all the arrays are made of numbers between 0 and 1, which don't need exponential notation, increasing from 0.0 to 1.0 except for the last values.

Any idea of how to decode it?

I tried several decodings, but none of them gives the expected results.

There is 1 best solution below

M Ciel On

hex values for float 4 bytes on little endian

Let's focus on this part.

Reading the file itself is trivial, so we assume it has already been read into memory as a list of such hex strings.

Since your byte order is little-endian, one option is to reverse the bytes.fromhex result, which puts the bytes in big-endian order:

bytes.fromhex("35F1AC3D")[::-1] 

Then all we need to do is unpack the bytes into the target type (a float here, since each string holds exactly 4 bytes):

float_out = struct.unpack('>f', bytes.fromhex("35F1AC3D")[::-1])[0]

Alternatively, using the '<f' format (little-endian) without reversing the bytes also works:

float_out = struct.unpack('<f', bytes.fromhex("35F1AC3D"))[0]

Simply apply this operation to each element and voilà.
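
As a minimal sketch of applying this to every element, the snippet below decodes only the first three strings from the question. The names float_values, raw and float_values_bulk are illustrative, and the second variant (joining everything into one buffer and using struct.iter_unpack) is just one possible alternative, not something the question requires:

```python
import struct

# First three hex strings from the question; each holds one little-endian binary32 value.
encoded_floats_hex = ["35F1AC3D", "610BB63D", "8C25BF3D"]

# Decode element by element with the explicit little-endian format.
float_values = [struct.unpack('<f', bytes.fromhex(h))[0] for h in encoded_floats_hex]

# Equivalent: join the strings into one buffer and unpack it in 4-byte steps.
raw = bytes.fromhex("".join(encoded_floats_hex))
float_values_bulk = [v for (v,) in struct.iter_unpack('<f', raw)]

print(float_values)
```

Both lists hold the same values; the bulk form may be more convenient when the data comes straight from a binary file rather than from a list of strings.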

Per @Marek Piotrowski's suggestion, a bit more information from the Python struct documentation:

For the 'f', 'd' and 'e' conversion codes, the packed representation uses the IEEE 754 binary32, binary64 or binary16 format (for 'f', 'd' or 'e' respectively), regardless of the floating-point format used by the platform.
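
To illustrate the quoted passage, here is a small sketch that checks the packed sizes of the three codes and round-trips one value. It assumes Python 3.6 or later (where the 'e' code is available); the names sizes and packed are illustrative only:

```python
import struct

# With an explicit byte-order prefix, the packed sizes are fixed:
# binary16 for 'e', binary32 for 'f', binary64 for 'd'.
sizes = {code: struct.calcsize('<' + code) for code in ('e', 'f', 'd')}
print(sizes)  # {'e': 2, 'f': 4, 'd': 8}

# Round trip: 0.5 is 0x3F000000 in binary32, so its little-endian bytes are 00 00 00 3f.
packed = struct.pack('<f', 0.5)
print(packed.hex())  # 0000003f
print(struct.unpack('<f', packed)[0])  # 0.5
```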