I'm trying to save text files using the Chinese character encoding GB2312. According to this document, GB2312 supports Cyrillic characters. Unfortunately, Java can't save my Cyrillic text in GB2312 encoding. I used the code below.
Question: Does Java's encoder not fully support all the characters in GB2312? How can I see all the characters that a specific encoder supports?
Files.write(Path.of("output_gb2312.txt"), List.of("АБВГДЕЁЖЗИЙКЛМНОӨПРСТУҮФХЦЧШЩЪЫЬЭЮЯ"), Charset.forName("GB2312"));
Output:
Exception in thread "main" java.nio.charset.UnmappableCharacterException: Input length = 1
at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:275)
at java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:307)
at java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
at java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:132)
at java.base/java.io.OutputStreamWriter.write(OutputStreamWriter.java:205)
at java.base/java.io.BufferedWriter.flushBuffer(BufferedWriter.java:120)
at java.base/java.io.BufferedWriter.close(BufferedWriter.java:268)
at java.base/java.nio.file.Files.write(Files.java:3587)
The characters Ө (U+04E8 CYRILLIC CAPITAL LETTER BARRED O) and Ү (U+04AE CYRILLIC CAPITAL LETTER STRAIGHT U) aren't part of the GB 2312 character set. Remove them from your string and your code will work.
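To see which characters a given charset rejects, one option is to ask a `CharsetEncoder` about each code point with `canEncode`. The following is a minimal sketch; the class and method names (`UnmappableFinder`, `findUnmappable`) are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

public class UnmappableFinder {
    // Returns every code point of `text` that the named charset cannot encode,
    // formatted as "U+XXXX". The helper name is hypothetical.
    static List<String> findUnmappable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        List<String> unmappable = new ArrayList<>();
        text.codePoints().forEach(cp -> {
            // Test each code point as a string so supplementary characters work too.
            String single = new String(Character.toChars(cp));
            if (!encoder.canEncode(single)) {
                unmappable.add(String.format("U+%04X", cp));
            }
        });
        return unmappable;
    }

    public static void main(String[] args) {
        String text = "АБВГДЕЁЖЗИЙКЛМНОӨПРСТУҮФХЦЧШЩЪЫЬЭЮЯ";
        System.out.println(findUnmappable(text, "GB2312"));
    }
}
```

For the string in the question, this should report only U+04E8 and U+04AE. The same loop over a range of code points would list everything an encoder accepts.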
Alternatively, GB 2312's replacement, GB 18030, will handle those two characters (and the rest of Unicode) if your Java installation supports it.
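A rough sketch of the GB 18030 route, assuming the runtime provides that charset (checked with `Charset.isSupported`). The class name, the `canEncodeAll` helper, and the output file name are illustrative only.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class Gb18030Writer {
    // True if every character of `text` can be encoded in the named charset.
    // Hypothetical helper, used here as a check before writing.
    static boolean canEncodeAll(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) throws IOException {
        String text = "АБВГДЕЁЖЗИЙКЛМНОӨПРСТУҮФХЦЧШЩЪЫЬЭЮЯ";
        if (!Charset.isSupported("GB18030")) {
            System.err.println("This Java runtime does not provide GB18030");
            return;
        }
        if (canEncodeAll(text, "GB18030")) {
            // Same call as in the question, only the charset name differs.
            Files.write(Path.of("output_gb18030.txt"), List.of(text),
                    Charset.forName("GB18030"));
        }
    }
}
```

Note that whatever reads the file must also decode it as GB 18030; a reader that assumes GB 2312 will not understand the bytes used for Ө and Ү.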
Or you can set things up to replace unmappable characters instead of throwing an exception, though it's more cumbersome than using Files.write(). See the CodingErrorAction documentation for the available options for handling an unmappable character.
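One way to set that up, as a sketch rather than the only approach: configure a `CharsetEncoder` with `CodingErrorAction.REPLACE` and hand it to an `OutputStreamWriter`. The class and method names (`ReplacingWriter`, `writeReplacing`) are invented for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReplacingWriter {
    // Writes `text` to `out` in the named charset, substituting the encoder's
    // replacement bytes for unmappable characters instead of throwing.
    // Closes `out` when done.
    static void writeReplacing(OutputStream out, String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        try (Writer writer = new OutputStreamWriter(out, encoder)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "АБВГДЕЁЖЗИЙКЛМНОӨПРСТУҮФХЦЧШЩЪЫЬЭЮЯ";
        writeReplacing(Files.newOutputStream(Path.of("output_gb2312.txt")),
                text, "GB2312");
    }
}
```

With the default replacement, Ө and Ү should come out as `?` in the file. `CodingErrorAction.IGNORE` would drop them instead, and `CodingErrorAction.REPORT` gives the exception from the question.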