java, StandardCharsets utf-16 issue

81 Views Asked by newman At 29 September 2023 at 14:26

I'm trying to understand why the results are different when I try to write a test string using different encodings.

For StandardCharsets.UTF_16LE, the result is "test" (seems correct), while for StandardCharsets.UTF_16BE the result is " t e s t" (seems wrong).

Can someone please explain why, with UTF_16BE, the result has extra spaces between the letters?

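import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

String filename = "C:\\Users\\name\\Downloads\\debugging.txt";
String str = "test";

// try-with-resources closes (and therefore flushes) the writer;
// without it, the buffered text may never reach the file.
//try (BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(filename), StandardCharsets.UTF_16LE))) { // seems to work
try (BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(filename), StandardCharsets.UTF_16BE))) { // does not seem to work
    bw.write(str);
} catch (IOException ignored) {
    // some actions
}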


There is 1 best solution below

Pino On 29 September 2023 at 14:37 BEST ANSWER

The Javadoc for java.nio.charset.Charset says:

When decoding, the UTF-16BE and UTF-16LE charsets interpret the initial byte-order marks as a ZERO-WIDTH NON-BREAKING SPACE; when encoding, they do not write byte-order marks.
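The quoted behaviour can be checked in memory, without a file or an editor. The following is only a minimal sketch: the class name Utf16BomDemo and the helper hex are made up for illustration, while the charset constants and String methods are the standard ones. It prints the raw bytes produced by both charsets (neither starts with a BOM) and shows that UTF_16BE keeps a leading BOM as a character when decoding, whereas UTF_16 consumes it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; only the JDK charset and String APIs are real.
class Utf16BomDemo {
    // Formats bytes as space-separated hex pairs, e.g. "00 74".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String str = "test";

        // Encoding: neither charset writes a byte-order mark.
        System.out.println("UTF_16LE: " + hex(str.getBytes(StandardCharsets.UTF_16LE)));
        System.out.println("UTF_16BE: " + hex(str.getBytes(StandardCharsets.UTF_16BE)));

        // Decoding: FE FF (a big-endian BOM) followed by the letter "t".
        byte[] withBom = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x74};
        String asBe = new String(withBom, StandardCharsets.UTF_16BE);
        String asUtf16 = new String(withBom, StandardCharsets.UTF_16);

        // UTF_16BE keeps the BOM as the character U+FEFF; UTF_16 strips it.
        System.out.println("UTF_16BE decoded length: " + asBe.length());
        System.out.println("UTF_16 decoded length:   " + asUtf16.length());
    }
}
```

The 00 bytes in front of each letter in the UTF_16BE output are what a viewer sees if it reads the file one byte at a time.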

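Without a BOM, an editor has to guess the encoding, and it may guess wrong; some editors do not support every encoding at all. UTF-16BE encodes "test" as the bytes 00 74 00 65 00 73 00 74, so an editor that does not recognize the file as UTF-16BE and reads it one byte at a time will likely show each 00 byte as a blank, which matches the " t e s t" you see. So the result depends on the editor you use to read the file.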
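If the goal is a file that editors can identify without guessing, one option is to make sure a BOM is written. The sketch below shows two ways this might be done: using StandardCharsets.UTF_16, which writes a big-endian BOM when encoding, or writing the character U+FEFF yourself before the text when using UTF_16BE or UTF_16LE. The class BomWriterSketch and the method encode are hypothetical names, and a ByteArrayOutputStream stands in for the FileOutputStream from the question so the example runs without touching the disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch class; the stream, writer and charset APIs are standard JDK.
class BomWriterSketch {
    // Encodes text through an OutputStreamWriter and returns the raw bytes.
    // When explicitBom is true, U+FEFF is written before the text.
    static byte[] encode(String text, Charset charset, boolean explicitBom) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            if (explicitBom) {
                writer.write('\uFEFF');
            }
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Option 1: UTF_16 emits a big-endian BOM (FE FF) on its own.
        byte[] auto = encode("test", StandardCharsets.UTF_16, false);
        // Option 2: UTF_16BE with U+FEFF written by hand.
        byte[] manual = encode("test", StandardCharsets.UTF_16BE, true);

        System.out.printf("UTF_16 starts with:   %02X %02X%n", auto[0] & 0xFF, auto[1] & 0xFF);
        System.out.printf("UTF_16BE starts with: %02X %02X%n", manual[0] & 0xFF, manual[1] & 0xFF);
    }
}
```

Whether a given editor honours the BOM is still up to that editor, so this improves the odds of correct detection rather than guaranteeing it.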
