How to find checksum (CRC16?) for messages


I have a list of messages from a CAN bus log. Some of the messages seem to have 2 bytes of what appears to be a checksum, then a sequence (counter) byte, and then 5 more bytes. Here is a sample list. The first two bytes are the message ID, followed by the direction (Rx), the length (8), and then the message contents, which consist of the 2-byte checksum (65 AB for the first message in the list), a 1-byte sequence number (00 for the first message), and then the remaining 5 bytes of content.

0185    Rx  8   65  AB  00  00  00  00  00  00
0185    Rx  8   B6  EC  01  00  00  00  00  00
0185    Rx  8   C3  24  02  00  00  00  00  00
0185    Rx  8   10  63  03  00  00  00  00  00
0185    Rx  8   08  A4  04  00  00  00  00  00
0185    Rx  8   DB  E3  05  00  00  00  00  00
0185    Rx  8   AE  2B  06  00  00  00  00  00
0185    Rx  8   7D  6C  07  00  00  00  00  00
0185    Rx  8   BF  B5  08  00  00  00  00  00
0185    Rx  8   6C  F2  09  00  00  00  00  00
0185    Rx  8   19  3A  0A  00  00  00  00  00
0185    Rx  8   CA  7D  0B  00  00  00  00  00
0185    Rx  8   D2  BA  0C  00  00  00  00  00
0185    Rx  8   01  FD  0D  00  00  00  00  00
0185    Rx  8   74  35  0E  00  00  00  00  00
0185    Rx  8   A7  72  0F  00  00  00  00  00
0185    Rx  8   D1  96  10  00  00  00  00  00
0185    Rx  8   02  D1  11  00  00  00  00  00
0185    Rx  8   77  19  12  00  00  00  00  00
0185    Rx  8   A4  5E  13  00  00  00  00  00
0185    Rx  8   BC  99  14  00  00  00  00  00
0185    Rx  8   6F  DE  15  00  00  00  00  00
0185    Rx  8   1A  16  16  00  00  00  00  00
0185    Rx  8   C9  51  17  00  00  00  00  00
0185    Rx  8   0B  88  18  00  00  00  00  00
0185    Rx  8   D8  CF  19  00  00  00  00  00
0185    Rx  8   AD  07  1A  00  00  00  00  00
0185    Rx  8   7E  40  1B  00  00  00  00  00
0185    Rx  8   66  87  1C  00  00  00  00  00
0185    Rx  8   B5  C0  1D  00  00  00  00  00
0185    Rx  8   C0  08  1E  00  00  00  00  00
0185    Rx  8   13  4F  1F  00  00  00  00  00
0185    Rx  8   0D  D0  20  00  00  00  00  00
0185    Rx  8   DE  97  21  00  00  00  00  00
                                        
0180    Rx  8   43  13  01  00  00  00  00  00
0180    Rx  8   36  DB  02  00  00  00  00  00
0180    Rx  8   E5  9C  03  00  00  00  00  00
0180    Rx  8   FD  5B  04  00  00  00  00  00
0180    Rx  8   2E  1C  05  00  00  00  00  00
0180    Rx  8   5B  D4  06  00  00  00  00  00
0180    Rx  8   88  93  07  00  00  00  00  00
0180    Rx  8   4A  4A  08  00  00  00  00  00
0180    Rx  8   99  0D  09  00  00  00  00  00
0180    Rx  8   EC  C5  0A  00  00  00  00  00
0180    Rx  8   3F  82  0B  00  00  00  00  00
0180    Rx  8   27  45  0C  00  00  00  00  00
0180    Rx  8   F4  02  0D  00  00  00  00  00
0180    Rx  8   81  CA  0E  00  00  00  00  00
0180    Rx  8   52  8D  0F  00  00  00  00  00
0180    Rx  8   24  69  10  00  00  00  00  00
0180    Rx  8   F7  2E  11  00  00  00  00  00
0180    Rx  8   82  E6  12  00  00  00  00  00
0180    Rx  8   51  A1  13  00  00  00  00  00
0180    Rx  8   49  66  14  00  00  00  00  00
0180    Rx  8   9A  21  15  00  00  00  00  00
0180    Rx  8   EF  E9  16  00  00  00  00  00
0180    Rx  8   3C  AE  17  00  00  00  00  00
0180    Rx  8   FE  77  18  00  00  00  00  00
0180    Rx  8   2D  30  19  00  00  00  00  00
0180    Rx  8   58  F8  1A  00  00  00  00  00
0180    Rx  8   8B  BF  1B  00  00  00  00  00
0180    Rx  8   93  78  1C  00  00  00  00  00
0180    Rx  8   40  3F  1D  00  00  00  00  00
0180    Rx  8   35  F7  1E  00  00  00  00  00
0180    Rx  8   E6  B0  1F  00  00  00  00  00
0180    Rx  8   F8  2F  20  00  00  00  00  00

I have included messages with IDs 0x180 and 0x185 that have the same content but different checksums, which leads me to believe that the message ID somehow affects the calculation of the checksum.

I have tried the reveng tool (https://reveng.sourceforge.io/) and can get "poly", "init" and other parameters. These are valid and work for 4 messages, but they do not work for the 5th message, and if I run reveng on the next 4 messages I get the same "poly" but a different "init" parameter. For example:

reveng -w16 -s 00000000000065AB 010000000000B6EC 020000000000C324 0300000000001063
width=16  poly=0x887b  init=0xb372  refin=true  refout=true  xorout=0x0000  check=0x81e8  residue=0x0000  name=(none)

reveng -w16 -s 04000000000008A4 050000000000DBE3 060000000000AE2B 0700000000007D6C
width=16  poly=0x887b  init=0x1ea4  refin=true  refout=true  xorout=0x0000  check=0x35c0  residue=0x0000  name=(none)

It seems unlikely that the init parameter changes for each message. So I am missing something.

I have also read and tried the method described in this paper (https://www.csse.canterbury.ac.nz/greg.ewing/essays/CRC-Reverse-Engineering.html). Although the checksums show the same XOR difference whenever the same bit is changed, that does not give me any further clue as to how to calculate these checksums.
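The XOR-difference observation can be checked mechanically. The following is a minimal sketch in C, assuming the two check bytes are read as one big-endian 16-bit value. The names sample_t, samples_0185 and bit_difference are made up for illustration, and only the first six 0x185 messages from the log are included.

```c
#include <stddef.h>
#include <stdint.h>

// One logged message: sequence byte and the two check bytes read big-endian.
typedef struct {
    uint8_t seq;
    uint16_t check;
} sample_t;

// First six messages with ID 0x185 from the log (remaining data bytes are 00).
static const sample_t samples_0185[] = {
    {0x00, 0x65AB}, {0x01, 0xB6EC}, {0x02, 0xC324},
    {0x03, 0x1063}, {0x04, 0x08A4}, {0x05, 0xDBE3},
};

// Look at every pair of samples whose sequence bytes differ only in `bit`.
// Returns 1 and stores the XOR difference of the check values in *diff if all
// such pairs agree, 0 if two pairs disagree, and -1 if no such pair exists.
int bit_difference(const sample_t *s, size_t n, uint8_t bit, uint16_t *diff) {
    int found = 0;
    for (size_t i = 0; i < n; i++) {
        if (s[i].seq & bit)
            continue;                       // s[i] must have the bit clear
        for (size_t j = 0; j < n; j++) {
            if (s[j].seq != (s[i].seq | bit))
                continue;                   // s[j] must be s[i] with the bit set
            uint16_t d = (uint16_t)(s[i].check ^ s[j].check);
            if (found && d != *diff)
                return 0;                   // inconsistent difference
            *diff = d;
            found = 1;
        }
    }
    return found ? 1 : -1;
}
```

With these six samples, bits 01, 02 and 04 give the consistent differences d347, a68f and 6d0f respectively. Bit 08 has no matching pair in this subset, so the function returns -1 for it.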

I also read a similar question here (Unknown CRC Calculation), and compiled the provided answer, but was not able to replicate the results in the question.

#include <stddef.h>

// Return a with the low 16 bits reversed and any bits above that zeroed.
static unsigned rev16(unsigned a) {
    a = (a & 0xff00) >> 8 | (a & 0x00ff) << 8;
    a = (a & 0xf0f0) >> 4 | (a & 0x0f0f) << 4;
    a = (a & 0xcccc) >> 2 | (a & 0x3333) << 2;
    a = (a & 0xaaaa) >> 1 | (a & 0x5555) << 1;
    return a;
}

// Implement the CRC specified in the BASECAM SimpleBGC32 2.6x serial protocol
// specification. Return crc updated with the length bytes at message. If
// message is NULL, then return the initial CRC value. This CRC is like
// CRC-16/ARC, but with the bits reversed.
//
// This is a simple bit-wise implementation. Byte-wise and word-wise algorithms
// using tables exist for higher speed if needed. Also this implementation
// chooses to reverse the CRC bits as opposed to the data bits, as done in the
// specification appendix. The CRC only needs to be reversed once at the start
// and once at the end, whereas the alternative is reversing every data byte of
// the message. Reversing the CRC twice is faster for messages with length
// greater than two bytes.
unsigned crc16_simplebgc(unsigned crc, void const *message, size_t length) {
    if (message == NULL)
        return 0;
    unsigned char const *data = message;
    crc = rev16(crc);
    for (size_t i = 0; i < length; i++) {
        crc ^= data[i];
        for (int k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ 0x887b : crc >> 1;
            //crc = crc & 1 ? (crc >> 1) ^ 0xa001 : crc >> 1;
    }
    return rev16(crc);
}

#include <stdio.h>

// Example usage of crc16_simplebgc(). A CRC can be computed all at once, or with
// portions of the data at a time.
int main(void) {
    unsigned crc = crc16_simplebgc(0, NULL, 0);         // set initial CRC
    //crc = crc16_simplebgc(crc, "\x01\x85", 2);      // message id
    //crc = crc16_simplebgc(crc, "\x08", 1);      // message length
    crc = crc16_simplebgc(crc, "\x00\x00\x00\x00\x00\x00", 6);      // message content
    printf("%04x\n", crc);                              // expecting 65AB
    return 0;
}

I also have some code in Python that tests the list of data, but it is useless without the correct parameters for the CRC function.

Ultimately I want to use this in a Node-RED function, so I would need it in JavaScript, but I think I can do that myself once I have working code in any language.

Edit: Some more messages that seem to follow the same pattern but with a length of 16 bytes:

020A    Rx  16  C5  69  0C  09  50  00  00  00  0A  44  00  00  00  00  00  00
020A    Rx  16  E1  C1  0D  09  50  00  00  00  0A  44  00  00  00  00  00  00
020A    Rx  16  39  C7  0C  08  50  00  00  00  0A  44  00  00  00  00  00  00
020A    Rx  16  1D  6F  0D  08  50  00  00  00  0A  44  00  00  00  00  00  00
020A    Rx  16  50  87  0E  08  50  00  00  00  0A  44  00  00  00  00  00  00

Mark Adler On

The two bytes do appear to be a linear function over GF(2) of the other bits. CRCs are in that class, but it is more general than CRCs. I did not find a sequencing of the bits that indicates that they are a CRC. All I could get was the linear function for the bits that changed.

Only one data byte varies in these messages, and only in its low six bits. For each of those six bits that is 1, exclusive-or together the corresponding 16-bit values from this table to get the check bytes, for address 0180:

20: 687b
10: b43d
08: da1e
04: 6d0f
02: a68f
01: d347

If the address is 0185, then also exclusive-or this value:

5: f5ff

Then exclusive-or the result with 9054.

This was determined simply by row-reduction.
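
The recipe above can be written as a short function. This is a sketch in C under the same limits as the tables: it only covers addresses 0180 and 0185, the six low bits of the sequence byte, and messages whose remaining five data bytes are zero. The names check_bytes and SEQ_BIT_TERMS are illustrative, and the result is the two check bytes read as one big-endian 16-bit value.

```c
#include <stdint.h>

// Contribution of each sequence-byte bit, from the table above,
// ordered from bit 0 (mask 01) up to bit 5 (mask 20).
static const uint16_t SEQ_BIT_TERMS[6] = {
    0xd347, 0xa68f, 0x6d0f, 0xda1e, 0xb43d, 0x687b,
};

// Compute the check bytes for a message with the given address and sequence
// byte. Assumption: only 0x0180 and 0x0185 are known; any address other than
// 0x0185 is treated like 0x0180. Sequence bits above 0x3f are ignored because
// the log does not show what they contribute.
uint16_t check_bytes(uint16_t id, uint8_t seq) {
    uint16_t check = 0x9054;                // constant term
    if (id == 0x0185)
        check ^= 0xf5ff;                    // term for address bits 5
    for (int bit = 0; bit < 6; bit++) {
        if (seq & (1u << bit))
            check ^= SEQ_BIT_TERMS[bit];    // one term per set sequence bit
    }
    return check;
}
```

For example, check_bytes(0x0185, 0x00) gives 0x65ab and check_bytes(0x0180, 0x01) gives 0x4313, matching the first logged message for each address.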