Here's some code to demonstrate my problem:
#include <unicode/ucnv.h>
#include <stdio.h>

UConverter * converter;

void test_char(char inchar){
    const char * inbuf=&inchar;
    UErrorCode err=U_ZERO_ERROR;
    UChar32 c = ucnv_getNextUChar(
        converter,
        &inbuf,
        inbuf+1,
        &err
    );
    printf("%x %s\n", c, u_errorName(err));
}

int main(){
    UErrorCode err=U_ZERO_ERROR;
    converter = ucnv_open("cp932", &err);
    test_char(0x41); /*A*/
    test_char(0xB1); /*ア*/
    test_char(0xE1); /*Should be U_TRUNCATED_CHAR_FOUND*/
    test_char(0xF1); /*Should be U_INVALID_CHAR_FOUND*/
    return 0;
}
This code prints:
41 U_ZERO_ERROR
ff71 U_ZERO_ERROR
1a U_ZERO_ERROR
1a U_ZERO_ERROR
Why do the invalid characters always return U_ZERO_ERROR when there is clearly an error? Why does the converter return the Substitute control code (0x1A) instead? Isn't Substitute a valid Shift-JIS character? How do I distinguish between a valid Substitute and an invalid Shift-JIS string?
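I found the answer in the fine print of the ucnv.h library documentation: by default, a converter's to-Unicode error callback is UCNV_TO_U_CALLBACK_SUBSTITUTE, which replaces invalid input with a substitution character instead of reporting an error. Therefore, if you want actual error codes, you need to change your converter callback.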