Java replace Japanese characters with \\p{Katakana} regular expression


I followed that link, where the user "slevithan" suggests using \p{Katakana}:

public static void main(String[] args) {
    String str = "マイポケット (1).csv";
    str = str.replaceAll("[\\p{Katakana}]", "_"); // .replaceAll("\\p{Z}", "_");
    System.out.println(str);
}

But I get an error:

Exception in thread "main" java.util.regex.PatternSyntaxException: Unknown character property name {Katakana} near index 12
[\p{Katakana}]

I am working with Java 8. What is the correct syntax for matching Japanese characters with String.replaceAll?


There are 2 best solutions below

2Big2BeSmall On BEST ANSWER

The best solution was this regular expression, which uses a negative lookahead:

str = str.replaceAll("(?![-,.\\p{IsHan}\\p{IsHiragana}\\p{IsKatakana}\\p{IsAlphabetic}\\p{IsDigit}])[\\p{Punct}\\s]", "_");
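
For the original error itself: Java does not accept a bare script name such as \p{Katakana}. Scripts are written with an Is prefix (\p{IsKatakana}) or as \p{script=Katakana}, and Unicode blocks with an In prefix (\p{InKatakana}). Below is a minimal sketch of the question's replacement using the script form; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class KatakanaReplaceSketch {
    // \p{IsKatakana} matches any code point whose Unicode script is Katakana.
    // A bare \p{Katakana} throws PatternSyntaxException in Java.
    private static final Pattern KATAKANA = Pattern.compile("\\p{IsKatakana}");

    static String replaceKatakana(String input) {
        // Replace each Katakana character with a single underscore.
        return KATAKANA.matcher(input).replaceAll("_");
    }

    public static void main(String[] args) {
        // prints "______ (1).csv"
        System.out.println(replaceKatakana("マイポケット (1).csv"));
    }
}
```

The same Is prefix works for \p{IsHan} and \p{IsHiragana}.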
2Big2BeSmall On

I needed to support both English and Japanese letters.

This regular expression did the trick:

str.replaceAll(  "[/p{Han}/p{Hiragana}/p{Katakana}&&[^\\.^\\p{IsAlphabetic}^\\p{IsDigit}^-]]", "_");
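
Note that inside a Java character class, /p{Han} is read as the literal characters '/', 'p', '{', 'H', 'a', 'n', '}' rather than as a Unicode property; the property form needs a backslash and the Is prefix. One way to express the same goal (keep English and Japanese letters, digits, '.' and '-', replace everything else) is a single negated character class. This is only a sketch under that assumption about the goal, and the class and method names are hypothetical.

```java
// Hypothetical helper class illustrating a whitelist-style replacement.
class FileNameSanitizerSketch {
    // Keep letters, digits, '.' and '-'; replace every other character.
    // The explicit Han/Hiragana/Katakana scripts are listed for readability;
    // most of their letters are already covered by \p{IsAlphabetic}.
    private static final String NOT_ALLOWED =
        "[^\\p{IsHan}\\p{IsHiragana}\\p{IsKatakana}\\p{IsAlphabetic}\\p{IsDigit}.-]";

    static String sanitize(String name) {
        return name.replaceAll(NOT_ALLOWED, "_");
    }

    public static void main(String[] args) {
        // prints "マイポケット__1_.csv"
        System.out.println(sanitize("マイポケット (1).csv"));
    }
}
```

Because '.' and a trailing '-' are literal inside a character class, they need no escaping here.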