I am not sure whether I have to cast a character to unsigned char before comparing it to the return value of a getc-family function.
By getc family I mean getc, fgetc and getchar.
I am only talking about single-byte characters.
Here is an example without the cast:
    #include <stdio.h>

    int main(void) {
        int c;
        while ((c = getchar()) != '\n' && c != EOF) // loop until newline or EOF
            putchar(c);
        return 0;
    }
Here is an example with the cast:
    #include <stdio.h>

    int main(void) {
        int c;
        while ((c = getchar()) != (unsigned char)'\n' && c != EOF) // loop until newline or EOF
            putchar(c);
        return 0;
    }
On the implementation I use, both work.
Is the cast required for portable programs?
I believe so, because of C11/N1570 7.21.7.1p2 (emphasis mine):
If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).
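To make the quoted wording concrete, here is a minimal sketch. The helper names read_back_byte, matches_plain_char and matches_unsigned_char are hypothetical and only for illustration. It writes a single byte to a temporary file, reads it back with fgetc, and compares the result against a char with and without the conversion to unsigned char.

```c
#include <stdio.h>

/* Hypothetical helper: writes one byte to a temporary file and reads it
   back with fgetc. Returns the fgetc result, or EOF on any failure. */
int read_back_byte(unsigned char byte) {
    FILE *fp = tmpfile();
    if (fp == NULL)
        return EOF;

    int result = EOF;
    if (fputc(byte, fp) != EOF) {
        rewind(fp);          /* required between writing and reading */
        result = fgetc(fp);  /* the byte as unsigned char converted to int */
    }
    fclose(fp);
    return result;
}

/* Compares the fgetc result with a plain char. If char is signed and c
   holds a negative value, c converts to a negative int and cannot equal
   the nonnegative value returned by fgetc. */
int matches_plain_char(int fgetc_result, char c) {
    return fgetc_result == c;
}

/* Same comparison, with the char converted to unsigned char first. */
int matches_unsigned_char(int fgetc_result, char c) {
    return fgetc_result == (unsigned char)c;
}
```

For '\n' both comparisons should agree. For a byte such as 0xE9 on an implementation where char is signed, the plain comparison would typically fail while the unsigned char comparison succeeds.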
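The C standard guarantees that character constants for these characters have nonnegative values (footnote 1):

- A to Z and a to z,
- 0 to 9,
- ! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ _ { | } and ~,
- space and the control characters for horizontal tab, vertical tab, form feed, alert, backspace, carriage return, and new line.

This follows from several sections of the C standard:

- C11 5.2.1p3 lists these characters as members of the basic execution character set.
- C11 6.2.5p3 says the value of a member of the basic execution character set stored in a char is nonnegative.
- C11 6.4.4.4p10 says the value of a character constant such as 'x', containing a single character (including a single character resulting from an escape sequence, like '\n'), is its value as a char converted to int.

The nonnegative char values are always a subset of the unsigned char values, so each character constant of one of these characters will have the same value as the value returned by getc when reading the same character.

If you need to handle other characters and cannot ensure those characters have nonnegative values in your target platforms, then you should convert the character constants to unsigned char.

Footnote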
1 There is one pedantic exception to this, which does not occur in practice. In a C implementation in which char and int are the same width and char is unsigned, char may have values not representable in int. In this case, the conversion is implementation-defined, so it may produce negative values. This conversion would be the same for converting the unsigned char value to int for the character constant and for converting the unsigned char getc return value to int, so they would produce the same value for the same characters. Conceivably, the conversion might be defined to clamp instead of wrap, which would make multiple characters map to the same value and be impossible to distinguish. This would be a defect in the C implementation, and there would not be a way to work around it using only the features fully specified by the C standard.
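Returning to the advice about characters outside the guaranteed set, here is a minimal sketch of a delimiter loop in the style of the question. The function name copy_until is hypothetical. It converts the delimiter to unsigned char once, so the comparison works even if the delimiter has a negative value as a plain char. The static assertion is an optional, assumed guard that rejects the pedantic implementations described in the footnote.

```c
#include <limits.h>
#include <stdio.h>

/* Optional guard (an assumption of this sketch): refuse to compile where
   some unsigned char values do not fit in int. */
_Static_assert(UCHAR_MAX <= INT_MAX, "unsigned char values must fit in int");

/* Hypothetical helper: copies characters from in to out until delim or
   end of file is reached. Returns the number of characters copied. */
size_t copy_until(FILE *in, FILE *out, char delim) {
    /* getc returns the character as unsigned char converted to int,
       so convert the delimiter the same way before comparing. */
    const int stop = (unsigned char)delim;
    size_t copied = 0;
    int c;

    while ((c = getc(in)) != EOF && c != stop) {
        putc(c, out);
        copied++;
    }
    return copied;
}
```

With a delimiter from the basic execution character set, such as '\n', the conversion changes nothing. It only matters for delimiters whose char value may be negative on the target platform.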