Grapheme search in Java

I am working on a project that involves searching for a word in text written in different languages. I can easily get the Locale for the language, but I don't know how to search for the word when the text is in another language. For example, the text can be in Chinese while the word to search for is in English. PHP has grapheme_stripos for this, and I am looking for similar functionality in Java, but I haven't found anything that does a grapheme search. One way might be to break the string down into a byte array and search through that, but isn't there something better, like PHP's grapheme_stripos?

9000 On

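PHP strings are byte sequences, usually UTF-8, where a single character can span several bytes, so searching for a grapheme is not trivial there. Java strings are UTF-16 (not plain UCS-2): every BMP character is exactly one char wide, which covers most text. Some CJK characters lie outside the BMP, though, and take two chars (a surrogate pair).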
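To make the char versus code point difference concrete, here is a minimal sketch. The class name Utf16Demo and its helper methods are made up for illustration; only String.length, String.codePointCount and Character.toChars are standard API.

```java
final class Utf16Demo {
    // Number of UTF-16 code units; this is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String bmp = "\u4E2D"; // U+4E2D, a CJK ideograph inside the BMP
        String supplementary = new String(Character.toChars(0x20000)); // U+20000, outside the BMP

        System.out.println(codeUnits(bmp) + " / " + codePoints(bmp));
        System.out.println(codeUnits(supplementary) + " / " + codePoints(supplementary));
    }
}
```

For the BMP character both counts are 1; for U+20000 the string holds two chars but only one code point.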

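Look at the code-point-related methods of java.lang.String (codePointAt, codePointCount, offsetByCodePoints). Most of the time, indexOf and regionMatches do the right thing; regionMatches also has an ignoreCase overload for case-insensitive matching.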
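As a rough sketch of how these pieces could be combined into something resembling grapheme_stripos: the class GraphemeSearch and its method names are hypothetical, and using java.text.BreakIterator to count user-perceived characters is my assumption about what "grapheme position" should mean here, not something the JDK offers as a single call.

```java
import java.text.BreakIterator;

final class GraphemeSearch {
    // Case-insensitive search; returns the char (UTF-16) index of the first match, or -1.
    static int indexOfIgnoreCase(String haystack, String needle) {
        int max = haystack.length() - needle.length();
        for (int i = 0; i <= max; i++) {
            if (haystack.regionMatches(true, i, needle, 0, needle.length())) {
                return i;
            }
        }
        return -1;
    }

    // Counts the grapheme boundaries that end at or before charIndex,
    // i.e. how many user-perceived characters precede that position.
    static int graphemeIndex(String text, int charIndex) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(text);
        int count = 0;
        for (int b = it.next(); b != BreakIterator.DONE && b <= charIndex; b = it.next()) {
            count++;
        }
        return count;
    }

    // Hypothetical analogue of grapheme_stripos: position in graphemes, or -1 if absent.
    static int graphemeIndexOfIgnoreCase(String haystack, String needle) {
        int i = indexOfIgnoreCase(haystack, needle);
        return i < 0 ? -1 : graphemeIndex(haystack, i);
    }

    public static void main(String[] args) {
        String text = "\u4E2D\u6587Java\u641C\u7D22"; // Chinese text with "Java" in the middle
        System.out.println(graphemeIndexOfIgnoreCase(text, "java"));
    }
}
```

One limitation of this sketch: regionMatches compares char by char, so a match may begin in the middle of a grapheme (for example, the needle "e" matches the base letter of an "e" followed by a combining accent). If that matters, check that the match start and end fall on BreakIterator boundaries before accepting it.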

Also, consider a dedicated full-text search solution.