How to compare char* to string literal in C?


I need to compare some char * (which I know the length of) with some string literals. Right now I am doing it like this:

void do_something(char * str, int len) {
  if (len == 2 && str[0] == 'O' && str[1] == 'K' && str[2] == '\0') {
    // do something...
  }
}

The problem is that I have many comparisons like this to make, and it's quite tedious to break each one apart and type it out. Code written this way is also hard to maintain and makes it easy to introduce bugs.

My question is whether there is a shorthand for this (maybe a macro).

I know there is strncmp and I have seen that GCC optimizes it. So, if the shorthand is to use strncmp, like this:

void do_something(char * str, int len) {
  if (len == 2 && strncmp(str, "OK", len) == 0) {
    // do something...
  }
}

Then I would like to know if the second example has the same (or better) performance as the first one.


There are 2 best solutions below

3
Mad Physicist On BEST ANSWER

Your example implies that your strings are always NUL terminated. In that case, don't bother getting their length ahead of time, since that involves searching for the NUL. Instead, you can do

memcmp(str, "OK", 3);

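This way, the NULs get compared too. If str starts with "OK" but is longer, the result will be > 0; if str is a shorter prefix of it ("O" or the empty string), the result will be < 0. Note that memcmp may read all 3 bytes, so str must point to at least 3 readable bytes; strncmp(str, "OK", 3) has no such requirement because it stops at the first NUL.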

This is a single function call, and memcmp is virtually guaranteed to be better optimized than your hand-written code. At the same time, don't bother optimizing unless you find this code to be a bottleneck. Keep in mind also that any benchmark I run on my machine will not necessarily apply to yours.

The only real reason to make this change is for readability.
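
The byte count does not have to be typed by hand. Here is a minimal sketch, assuming a hypothetical macro name STR_IS that is not part of any library: sizeof applied to a string literal yields its length including the terminating NUL, so the count stays in sync with the literal. As with the memcmp call above, it assumes str points to at least sizeof(lit) readable bytes.

```c
#include <string.h>

/* Hypothetical helper, not a standard macro: compares s against the string
 * literal lit, including the literal's terminating NUL.
 * The "" lit concatenation only compiles when lit is a string literal, which
 * guards against passing a pointer (where sizeof would give the pointer
 * size rather than the string size).
 * Assumption: s points to at least sizeof(lit) readable bytes. */
#define STR_IS(s, lit) (memcmp((s), "" lit, sizeof(lit)) == 0)

/* Example use: returns 1 when str is exactly "OK", 0 otherwise. */
int is_ok(const char *str) {
    return STR_IS(str, "OK");
}
```

If the buffer size is not known to be large enough, swapping memcmp for strncmp in the macro keeps the same shape while stopping at the first NUL.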

12
Schwern On

Yes it will. However, your code is not comparing a char * to a string literal. It is comparing two string literals. The compiler is smart enough to spot this and optimize all the code away. Only the code inside the if block remains.

We can see this by looking at the assembly code generated by the compiler:

cc -S -std=c11 -pedantic -O3 test.c

First with your original code...

#include <stdio.h>
#include <string.h>

int main() {
    unsigned int len = 2;
    char * str = "OK";
    if (len == 2 && strncmp(str, "OK", len) == 0) {
      puts("Match");
    }
}

Then with just the puts.

#include <stdio.h>
#include <string.h>

int main() {
    //unsigned int len = 2;
    //char * str = "OK";
    //if (len == 2 && strncmp(str, "OK", len) == 0) {
      puts("Match");
    //}
}

The two assembly files are practically the same. No trace of the strings remains, only the puts.

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 10, 14    sdk_version 10, 14
    .globl  _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## %bb.0:
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    leaq    L_.str(%rip), %rdi
    callq   _puts
    xorl    %eax, %eax
    popq    %rbp
    retq
    .cfi_endproc
                                        ## -- End function
    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz  "Match"


.subsections_via_symbols
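
To see what the compiler does with a string it cannot know in advance, the input has to come from outside the function. A sketch, assuming a hypothetical function name matches_ok compiled with the same cc -S command: because the function has external linkage and str is a parameter, the compiler should have to emit code for the comparison instead of folding it away. What that code looks like will vary by compiler and target.

```c
#include <string.h>

/* Hypothetical test function: str and len arrive as parameters, so their
 * values are unknown at compile time and the check cannot be reduced to
 * a constant. */
int matches_ok(const char *str, unsigned int len) {
    return len == 2 && strncmp(str, "OK", len) == 0;
}
```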

This is a poor place to focus on optimization. String comparison against small strings is very unlikely to be a performance problem.

Furthermore, your proposed optimization is likely slower. You need to get the length of the input string, and that requires walking the full length of the input string. Maybe you need the length for other reasons, but that is increasingly an edge case.

strncmp, by contrast, can stop as soon as it sees unequal characters, and it never has to read past the end of the shorter string.
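
One detail worth spelling out: strncmp(str, "OK", 2) on its own also accepts "OKAY", because it only looks at the first two characters. Passing sizeof "OK" (which is 3) as the count brings the terminator into the comparison, so no separate length is needed. A sketch, with the hypothetical function names equals_ok and starts_with_ok:

```c
#include <string.h>

/* Hypothetical helper: exact match against "OK" without calling strlen.
 * The count sizeof "OK" is 3, so the terminating NUL takes part in the
 * comparison; strncmp stops at the first difference or at a NUL. */
int equals_ok(const char *str) {
    return strncmp(str, "OK", sizeof "OK") == 0;
}

/* Prefix-only check, for contrast: this one also accepts "OKAY". */
int starts_with_ok(const char *str) {
    return strncmp(str, "OK", 2) == 0;
}
```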