Function to Split a String into Letters and Digits in C

298 Views Asked by At

I'm pretty new to C, and I'm trying to write a function that takes a user input RAM size in B, kB, mB, or gB, and determines the address length. My test program is as follows:

int bitLength(char input[6]) {

  char nums[4];
  char letters[2];

  for(int i = 0; i < (strlen(input)-1); i++){
    if(isdigit(input[i])){
      memmove(&nums[i], &input[i], 1);
    } else {
      //memmove(&letters[i], &input[i], 1);
    }
  }
  int numsInt = atoi(nums);
  int numExponent = log10(numsInt)/log10(2);
  printf("%s\n", nums);
  printf("%s\n", letters);
  printf("%d", numExponent);
  return numExponent;
}

This works correctly as it is, but only because I have that one line commented out. When I try to alter the 'letters' character array with that line, it changes the 'nums' character array to '5m2'

My string input is '512mB' I need the letters to be able to tell if the user input is in B, kB, mB, or gB. I am confused as to why the commented out line alters the 'nums' array.

Thank you.

2

There are 2 best solutions below

0
On

In your input 512mB, "mB" is not digit and is supposed to handled in commented code. When handling those characters, i is 3 and 4. But because length of letters is only 2, when you execute memmove(&letters[i], &input[i], 1);, letters[i] access out of bounds of array so it does undefined behaviour - in this case, writing to memory of nums array.

To fix it, you have to keep unique index for letters. Or better, for both nums and letters since i is index of input.

0
On

There are several problems in your code. @MarkSolus have already pointed out that you access letters out-of-bounds because you are using i as index and i can be more than 1 when you do the memmove.

In this answer I'll address some of the other poroblems.

string size and termination

Strings in C needs a zero-termination. Therefore arrays must be 1 larger than the string you expect to store in the array. So

char nums[4];     // Can only hold a 3 char string
char letters[2];  // Can only hold a 1 char string

Most likely you want to increase both arrays by 1.

Further, your code never adds the zero-termination. So your strings are invalid.

You need code like:

nums[some_index] = '\0';  // Add zero-termination

Alternatively you can start by initializing the whole array to zero. Like:

char nums[5] = {0};
char letters[3] = {0};

Missing bounds checks

Your loop is a for-loop using strlen as stop-condition. Now what would happen if I gave the input "123456789BBBBBBBB" ? Well, the loop would go on and i would increment to values ..., 5, 6, 7, ... Then you would index the arrays with a value bigger than the array size, i.e. out-of-bounds access (which is real bad).

You need to make sure you never access the array out-of-bounds.

No format check

Now what if I gave an input without any digits, e.g. "HelloWorld" ? In this case nothin would be written to nums so it will be uninitialized when used in atoi(nums). Again - real bad.

Further, there should be a check to make sure that the non-digit input is one of B, kB, mB, or gB.

Performance

This is not that important but... using memmove for copy of a single character is slow. Just assign directly.

memmove(&nums[i], &input[i], 1); ---> nums[i] = input[i];

How to fix

There are many, many different ways to fix the code. Below is a simple solution. It's not the best way but it's done like this to keep the code simple:

#define DIGIT_LEN 4
#define FORMAT_LEN 2

int bitLength(char *input)
{
    char nums[DIGIT_LEN + 1] = {0};    // Max allowed number is 9999
    char letters[FORMAT_LEN  + 1] = {0}; // Allow at max two non-digit chars

    if (input == NULL) exit(1);  // error - illegal input

    if (!isdigit(input[0])) exit(1); // error - input must start with a digit

    // parse digits (at max 4 digits)
    int i = 0;
    while(i < DIGITS && isdigit(input[i]))
    {
        nums[i] = input[i];
        ++i;
    }

    // parse memory format, i.e. rest of strin must be of of B, kB, mB, gB
    if ((strcmp(&input[i], "B") != 0) &&
        (strcmp(&input[i], "kB") != 0) &&
        (strcmp(&input[i], "mB") != 0) &&
        (strcmp(&input[i], "gB") != 0))
    {
          // error - illegal input
          exit(1);
    }
    strcpy(letters, &input[i]);

    // Now nums and letter are ready for further processing
    ...
    ...
  }

}