I have created a text file with the following characters to test UTF-8 encoding:
%gÁüijȐʨΘЋЮѦҗԘՔהڳضणணษ༒Ⴃᎃᡧᬐ⁜₪≸☺⛜⺟むヸ㒦㢒
I have also written this C program to open the file and read it:
    #pragma warning(disable:4996)
    #include <stdio.h>
    #include <stdlib.h>

    int main() {
        FILE *ptr;
        ptr = fopen("inputtest.txt", "r, ccs=UTF-8");
        char input[50];
        if (ptr == NULL)
            perror("Error opening file");
        else {
            if (fgets(input, 50, ptr) != NULL) {
                puts(input);
            }
            printf(input);
            fclose(ptr);
        }
    }
If I don't use ccs=UTF-8, I get unreadable characters. With it, the program crashes with exit code -1073740791. After switching to wchar_t and fgetws, the program's output was just %.
Note: I am using Windows 11 and Visual Studio 2022, and I need to input multi-language characters.
Consider using fgetws(3) instead, and call setlocale(3) before that. With one-byte characters, you are limited to ASCII or, at most, a single-byte character set. And of course, use wchar_t characters instead of char.
But if you use UTF-8 encoding, all bytes are read as bytes and can be printed as bytes. You can read and write them without interpreting them (unless, of course, you want to interpret them):
should work:
A better approach (to asciify its utf8 input) is shown below:
that, on the given input, will output: