What is the meaning of "Yes, Virginia, it had better be unsigned"?


In the Linux source code, version 3.18 (and earlier), the function strncasecmp in lib/string.c begins with:

/* Yes, Virginia, it had better be unsigned */
unsigned char c1, c2;

As can be seen here: http://lxr.free-electrons.com/source/lib/string.c

What is the meaning of this?

Best answer, by the gods from engineering:

string.c:strncasecmp() calls __tolower from include/linux/ctype.h, which expects an unsigned char.

EDITed to add: In general you always want to pass unsigned char to ctype.h functions because of C Standard §7.4, which says the behavior is undefined if the argument is neither representable as unsigned char nor equal to EOF. On platforms where plain char is signed, any byte above 0x7F becomes a negative int and falls outside that range. So that probably explains the "Yes, Virginia" bit.
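To make the §7.4 rule concrete, here is a minimal sketch using the standard library's tolower rather than the kernel's. The helper name lower_byte is hypothetical and only for illustration; the point is that the cast to unsigned char happens before the value reaches tolower.

```c
#include <ctype.h>

/* Hypothetical helper: lower-case one byte taken from a char string. */
static int lower_byte(char c)
{
    /*
     * Where plain char is signed, a byte such as 0xE9 is negative here.
     * Passing it straight to tolower() would be undefined behavior, so
     * convert it to unsigned char first (0..UCHAR_MAX is always valid).
     */
    return tolower((unsigned char)c);
}
```

For ASCII letters the cast changes nothing; it only matters for bytes with the high bit set.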

What is a bit more mysterious is that include/linux/ctype.h actually appears idiot-proof in this respect, because it does its own safety-minded cast in #define __ismask(x) (_ctype[(int)(unsigned char)(x)]). I'm not sure when the "Yes, Virginia" comment was added relative to this other line, but with the current version of include/linux/ctype.h it appears that string.c:strncasecmp() would work fine even with char for c1 and c2. I haven't really tried to change & test it that way though...
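The effect of that cast can be sketched outside the kernel as follows. This is not the kernel's code: my_ctype, MY_UPPER, my_ctype_init and the my_* functions are made-up stand-ins for illustration, and the table setup assumes an ASCII-compatible character set. Only the shape of the my_ismask macro mirrors the __ismask line quoted above.

```c
#define MY_UPPER 0x01                 /* hypothetical "upper-case" flag */

static unsigned char my_ctype[256];   /* hypothetical classification table */

/* Mark 'A'..'Z' as upper case (assumes ASCII, where they are contiguous). */
static void my_ctype_init(void)
{
    for (int ch = 'A'; ch <= 'Z'; ch++)
        my_ctype[ch] |= MY_UPPER;
}

/* Same shape as the kernel macro: the cast keeps the index within 0..255. */
#define my_ismask(x) (my_ctype[(int)(unsigned char)(x)])

static int my_isupper(int c)
{
    return (my_ismask(c) & MY_UPPER) != 0;
}

static int my_tolower(int c)
{
    return my_isupper(c) ? c - 'A' + 'a' : c;
}
```

With the cast inside the macro, a negative char argument indexes the table at its unsigned value instead of reading before the start of the array, which is why the caller's choice of char or unsigned char no longer affects the lookup.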

Also, if you go back to Linux 2.0.40's ctype.h, the safety-minded cast ((int)(unsigned char)) is not there yet. There's no "Virginia" comment in 2.0.40's string.c either, but then it doesn't have a strncasecmp at all. It looks like both changes were made somewhere between Linux 2.0 and 2.2, but I can't tell you right now which came first.