How do I read UTF-8 input from System.in in Java under Windows CMD?


Notice

The main users of this program are in China.

Since my program involves English, numbers, Chinese, Japanese, Korean, and even other characters, I need to use UTF-8 encoding.

How do I get the following program to run properly in a Windows terminal (e.g. CMD, PowerShell)?

import java.util.Scanner;

public class Main {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in);
        System.out.println("Please copy the contents of the following line and paste it into the console and press enter to end it.");
        System.out.println("aaa你好123桜色舞うころbbb보여줄게456");
        System.out.print("Please input: ");
        String line = scanner.nextLine();
        System.out.println("You have entered: " + line);
        System.out.println("Whether what you have entered is consistent with what is expected: " + line.equals("aaa你好123桜色舞うころbbb보여줄게456"));
    }
}

It works fine in IntelliJ IDEA, and the result is:

Please copy the contents of the following line and paste it into the console and press enter to end it.
aaa你好123桜色舞うころbbb보여줄게456
Please input: aaa你好123桜色舞うころbbb보여줄게456
You have entered: aaa你好123桜色舞うころbbb보여줄게456
Whether what you have entered is consistent with what is expected: true

Running java Main.java in CMD gives the following results, depending on the active code page:

  • Code page: 936 (GBK)

    Please copy the contents of the following line and paste it into the console and press enter to end it.
    aaa你好123桜色舞うころbbb????456
    Please input: aaa你好123桜色舞うころbbb????456
    You have entered: aaa???123?@??褦????bbb????456
    Whether what you have entered is consistent with what is expected: false
    

    The Korean text on the second line is not displayed correctly, and the Chinese and Japanese text read back from the input is also wrong.

  • Code page: 65001 (UTF-8)

    Please copy the contents of the following line and paste it into the console and press enter to end it.
    aaa你好123桜色舞うころbbb보여줄게456
    Please input: aaa你好123桜色舞うころbbb보여줄게456
    You have entered: aaa&&123&&&&&&bbb&&&&456
    Whether what you have entered is consistent with what is expected: false
    

    The second line is displayed correctly, but the Chinese, Japanese, and Korean text read back from the input is wrong.
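The ???? in the code page 936 output is consistent with GBK having no Hangul in its repertoire. A minimal sketch to check this, assuming the JDK ships the GBK charset (the class and method names here are made up for illustration, and the strings are written as Unicode escapes so the source compiles under any source encoding):

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class GbkCoverageCheck {
    /** Returns true if every character of text can be encoded in the named charset. */
    public static boolean canEncode(String charsetName, String text) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        String chinese = "\u4f60\u597d";              // first two Chinese characters of the test string
        String japanese = "\u685c\u8272";             // first two Japanese characters of the test string
        String korean = "\ubcf4\uc5ec\uc904\uac8c";   // the four Hangul syllables of the test string

        for (String name : new String[] {"GBK", "UTF-8"}) {
            System.out.println(name + " chinese:  " + canEncode(name, chinese));
            System.out.println(name + " japanese: " + canEncode(name, japanese));
            System.out.println(name + " korean:   " + canEncode(name, korean));
        }
    }
}
```

If GBK reports false for the Korean string, no Scanner charset can make Korean work under code page 936, because the characters are already lost before they reach the program.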

I tried changing the code as follows:

import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class Main {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in, StandardCharsets.UTF_8);
        System.out.println("Please copy the contents of the following line and paste it into the console and press enter to end it.");
        System.out.println("aaa你好123桜色舞うころbbb보여줄게456");
        System.out.print("Please input: ");
        String line = scanner.nextLine();
        System.out.println("You have entered: " + line);
        System.out.println("Whether what you have entered is consistent with what is expected: " + line.equals("aaa你好123桜色舞うころbbb보여줄게456"));
    }
}

With this change, the output in IntelliJ IDEA and in CMD is the same as above, so the result in CMD is still incorrect.

I then tried changing the code as follows:

import java.util.Scanner;

public class Main {
    public static void main(String[] args) {
        Scanner scanner = new Scanner(System.in, "GBK");
        System.out.println("Please copy the contents of the following line and paste it into the console and press enter to end it.");
        System.out.println("aaa你好123桜色舞うころbbb보여줄게456");
        System.out.print("Please input: ");
        String line = scanner.nextLine();
        System.out.println("You have entered: " + line);
        System.out.println("Whether what you have entered is consistent with what is expected: " + line.equals("aaa你好123桜色舞うころbbb보여줄게456"));
    }
}

The result in IDEA becomes:

Please copy the contents of the following line and paste it into the console and press enter to end it.
aaa你好123桜色舞うころbbb보여줄게456
Please input: aaa你好123桜色舞うころbbb보여줄게456
You have entered: aaa浣犲ソ123妗滆壊鑸炪亞銇撱倣bbb氤挫棳欷勱矊456
Whether what you have entered is consistent with what is expected: false
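The garbled line above looks like UTF-8 bytes decoded as GBK, which would suggest that the IDEA console hands UTF-8 to System.in. A small sketch that reproduces this kind of mismatch (the class and method names are hypothetical, and the GBK charset is assumed to be available in the JDK):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    /** Encodes text as UTF-8, then decodes those bytes with a different charset. */
    public static String misdecode(String text, String wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, Charset.forName(wrongCharset));
    }

    public static void main(String[] args) {
        String original = "\u4f60\u597d"; // first two Chinese characters of the test string
        String garbled = misdecode(original, "GBK");

        // Six UTF-8 bytes are read as three two-byte GBK characters.
        System.out.println("garbled text:   " + garbled);
        System.out.println("garbled length: " + garbled.length());
        System.out.println("equals original: " + garbled.equals(original));
    }
}
```

If the printed text matches the three characters that follow aaa in the IDEA output, the input bytes were UTF-8 and only the charset passed to Scanner was wrong.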

Running java Main.java in CMD gives the following results, depending on the active code page:

  • Code page: 936 (GBK)

    Please copy the contents of the following line and paste it into the console and press enter to end it.
    aaa你好123桜色舞うころbbb????456
    Please input: aaa你好123桜色舞うころbbb????456
    You have entered: aaa你好123桜色舞うころbbb????456
    Whether what you have entered is consistent with what is expected: false
    

    Although the second and fourth lines look identical, the Korean text on the second line is not displayed correctly, so the comparison with the expected string still fails.

    If the pasted input contains the actual Korean text (rather than the ???? shown on the second line), the result is as follows:

    Please copy the contents of the following line and paste it into the console and press enter to end it.
    aaa你好123桜色舞うころbbb????456
    Please input: aaa你好123桜色舞うころbbb보여줄게456
    You have entered: aaa你好123桜色舞うころbbb????456
    Whether what you have entered is consistent with what is expected: false
    

    The result is still incorrect.

  • Code page: 65001 (UTF-8)

    Please copy the contents of the following line and paste it into the console and press enter to end it.
    aaa你好123桜色舞うころbbb보여줄게456
    Please input: aaa你好123桜色舞うころbbb보여줄게456
    You have entered: aaa&&123&&&&&&bbb&&&&456
    Whether what you have entered is consistent with what is expected: false
    

    The result is still incorrect here as well.
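Since changing the Scanner charset makes no difference under code page 65001, one way to narrow the problem down might be to look at the raw bytes that arrive on System.in before any decoding takes place. A rough diagnostic sketch (all names here are hypothetical):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class StdinHexDump {
    /** Formats bytes as space-separated two-digit uppercase hex values. */
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Reads raw bytes up to the first line feed or end of stream, skipping carriage returns. */
    public static byte[] readLineBytes(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') {
                buffer.write(b);
            }
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.print("Please input: ");
        System.out.println(toHex(readLineBytes(System.in)));
    }
}
```

For the character 你, UTF-8 input would show up as E4 BD A0 and GBK input as C4 E3. If something else arrives for the non-ASCII characters, the bytes are presumably already damaged by the console before Scanner sees them.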

Other attempts

I also tried passing JVM options on the command line, e.g.:

java -Dfile.encoding=UTF-8 Main.java

or

java -Dfile.encoding=UTF-8 -Dsun.stdout.encoding=UTF-8 -Dsun.stderr.encoding=UTF-8 Main.java

Neither of these produced correct results.
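To see whether these options actually take effect, it may help to print the encoding-related settings the JVM ends up with. A small sketch follows; the class name is hypothetical. Which properties are defined depends on the JDK version (as far as I know, native.encoding appeared in JDK 17 and stdout.encoding in JDK 19), and undefined ones print as null.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingReport {
    /** Collects the default charset and several encoding-related system properties. */
    public static Map<String, String> report() {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("defaultCharset", Charset.defaultCharset().name());
        String[] keys = {
            "file.encoding", "native.encoding", "sun.jnu.encoding",
            "stdout.encoding", "sun.stdout.encoding"
        };
        for (String key : keys) {
            // String.valueOf turns an undefined property into the text "null".
            values.put(key, String.valueOf(System.getProperty(key)));
        }
        return values;
    }

    public static void main(String[] args) {
        report().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running it as java -Dfile.encoding=UTF-8 EncodingReport.java under each code page shows which values the option changes. None of these settings describes the bytes the console itself delivers to System.in.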
