Adding two byte arrays which represent unsigned numbers together fails to add last carry in c

165 Views Asked by At

I have been writing a big numbers library for practice reasons in C and the addition operation is causing me some issues. It works normally but ignores the carry at the end and the result is incorrect.

I have been trying to write a big number library in c and here is the code so far for the addition operation. Code so far:

The BIG_INT type is a typedef for a unsigned char array of 8 bytes. 8 bytes is used for testing, it will be like 512 bytes later.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define BIG_INT_SIZE 8
typedef unsigned char BIG_INT[BIG_INT_SIZE];

int add_big_int(BIG_INT *left, size_t right)
{
    if (!right)
        return 0;

    char buffer[sizeof(size_t)];
    memcpy(buffer, &right, sizeof(size_t));

    unsigned int carry = 0;
    for (int64_t i = BIG_INT_SIZE - 1; i >= 0; --i) {
        unsigned int sum = (*left[i] & 0xff) + (buffer[i] & 0xff) + carry;
        *left[i] = sum;
        carry = sum >> 8;
    }

    printf("carry: %u\n", carry);
    return 0;
}

int main() 
{
    BIG_INT big = { 0, 0, 0, 0, 0, 0, 0, 1 };
    add_big_int(&big, 255);
    return 0;
}

When adding numbers like 256 + 1 = 257, the byte array is [0, 0, 0, 0, 0, 0, 1, 1] (reverse order), however when adding 255 + 1 = 256, the byte array is [0, 0, 0, 0, 0, 0, 0, 0] instead of the expected [0, 0, 0, 0, 0, 0, 1, 0] and the carry variable is set to 1.

2

There are 2 best solutions below

1
Eric Postpischil On BEST ANSWER

The code is disorganized, and some organization will help fix it.

Look at this loop: for (int64_t i = BIG_INT_SIZE - 1; i >= 0; --i). That iterates through BIG_INT_SIZE elements, using left[i] and buffer[i]. How many elements are in buffer?

buffer is declared char buffer[sizeof(size_t)];, so, if sizeof(size_t) is different from BIG_INT_SIZE, the loop is massively broken. You either need buffer and left to have the same number of elements or you need code in the loop to handle disparate sizes.

Let’s make them the same size, using char buffer[BIG_INT_SIZE];.

Next, compare these two declarations:

typedef unsigned char BIG_INT[BIG_INT_SIZE];
   char buffer[BIG_INT_SIZE];

What is different? One uses unsigned char, and the other uses char. The C standard does not say whether char is signed or unsigned. When you put 255 in signed char, it is likely to take on the value −1, although this is implementation-defined. You want unsigned char:

   unsigned char buffer[BIG_INT_SIZE];

Next, you have memcpy(buffer, &right, sizeof(size_t));, but we need to account for the change in how the size of buffer is set. Also, memcpy copies the bytes in whatever order they are in memory, but you want the low bytes at higher addresses, judging from your other code. So what you can do is:

   for (int i = 0; i < sizeof right; ++i)
   {
       buffer[BIG_INT_SIZE - 1 - i] = right & 0xff;
       right >>= 8;
   }
   for (int i = sizeof right; i < BIG_INT_SIZE; ++i)
       buffer[BIG_INT_SIZE - 1 - i] = 0;

The first loops puts the low-value byte of right into the high-address element of buffer and then shifts the bytes of right right. The second loop fills in any further elements of buffer with zeros.

A potential hazard here is that right is larger than buffer, which would make the first loop overrun the buffer. We can guard against that with:

    _Static_assert(sizeof right <= sizeof buffer, "right is too big for buffer");

I used & 0xff above, mimicking your existing code. Often, people would design for an eight-bit byte, in which case the & 0xff is unnecessary; the assignment could be just buffer[BIG_INT_SIZE - 1 - i] = right;, and the automatic conversion of right to unsigned char will effectively remove higher bits. If you make that change, you may want to insert a _Static_assert that CHAR_BIT is eight. CHAR_BIT is defined in <limits.h>.

Next, look at unsigned int sum = (*left[i] & 0xff) + (buffer[i] & 0xff) + carry;. Array subscripting has higher precedence than pointer dereferencing, so *left[i] is *(left[i]), not (*left)[i]. left is a pointer to BIG_INT, and just one BIG_INT is passed, so you cannot use left[i] with i being non-zero. You need (*left) to get the BIG_INT and then [i] to select an element from it, so you want unsigned int sum = ((*left)[i] & 0xff) + (buffer[i] & 0xff) + carry;.

Similarly, *left[i] = sum; should be (*left)[i] = sum;.

With those changes made, your code will get the correct answer for the case you show. However, you should consider other design aspects, including:

  • Do you want to define BIG_INT as an array? It may be safer as a structure containing an array. This would have caught the *left[i] error earlier?
  • Why is the type of right size_t? size_t has a connotation that it is used with object sizes, not arbitrary arithmetic values that might be used with big-integer types. Do you want to use a signed type for it?
  • Why does the loop use int64_t for i? That is a fixed-width type. You might want a type like ptrdiff_t, which is suitable for indexing arrays (it is signed, so it can be used with an i >= 0 test, and it is a type designed to adjust its width to the target platform.)
  • What will you do when the final carry is non-zero?
  • Will add_big_int always return zero, or do you have some future plan for it?
0
Vlad from Moscow On

For starters the first parameter of the function is a pointer

int add_big_int(BIG_INT* left, size_t right)
                ^^^^^^^^

If BIG_INT is defined like

typedef unsigned char BIG_INT[8];

then it means that the parameter has the type char ( * )[8].

There is no great sense to pass the array through a pointer to it. It is enough to declare the function like

int add_big_int(BIG_INT left, size_t right)

and call it like

add_big_int(big, 255);

The function always returns 0 that is not informative for the user of the function. Either the function should return meaningful values or its return type should be void.

Using the type size_t as for example

char buffer[sizeof(size_t)];

only confuses readers of the code.

Insetad you could use expression sizeof( BIG_INT ) provided that the array BIG_INT is large enough to store values of the type size_t.

The used expression *left[i] invokes undefined behavior. Instead you have to use expression ( *left )[i]. But in any case using expressions with the bitwise AND operator like

(buffer[i] & 0xff)

has no effect because you have arrays with the element type unsigned char.

In general you should use a dynamically allocated array enclosed in a structure due to using the carry flag because there can be an overflow.