I am using Notepad++ with the encoding set to UTF-8. The file contains two lines of text:
Thành
Thành
But when I use the Find dialog to search for "Thành", it finds only one match. When I change the Notepad++ encoding to ANSI, it shows:
Thà nh
Thành
Why are they different in ANSI? What should I do to make them the same?
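Your strings differ in their Unicode normalization form: the first uses a precomposed character, the second a base letter followed by a combining mark (code points listed character by character):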
The former string is
T (U+0054, Latin Capital Letter T) h (U+0068, Latin Small Letter H) à (U+00E0, Latin Small Letter A With Grave) n (U+006E, Latin Small Letter N) h (U+0068, Latin Small Letter H)
while the latter one is
T (U+0054, Latin Capital Letter T) h (U+0068, Latin Small Letter H) a (U+0061, Latin Small Letter A) ̀ (U+0300, Combining Grave Accent) n (U+006E, Latin Small Letter N) h (U+0068, Latin Small Letter H)
Switching the view to ANSI reinterprets the UTF-8 bytes in a single-byte code page, which produces mojibake (example in Python for its universal intelligibility):
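A minimal sketch of both effects, assuming "ANSI" on your machine means Windows-1252 (the helper name `as_ansi` and the variable names are illustrative only, not part of Notepad++ or any library). It re-reads the UTF-8 bytes of each string as cp1252, then shows that normalizing to one form makes the strings compare equal:

```python
import unicodedata

# Both strings are written with escapes so the two forms cannot be confused.
composed = "Th\u00e0nh"      # à as one precomposed code point (NFC form)
decomposed = "Tha\u0300nh"   # a followed by U+0300 Combining Grave Accent (NFD form)

# They look identical on screen but are different code point sequences.
print(composed == decomposed)  # False


def as_ansi(text, codepage="cp1252"):
    """Hypothetical helper: view the UTF-8 bytes of `text` as a single-byte code page.

    Assumption: the ANSI code page is Windows-1252; other locales use other pages.
    """
    return text.encode("utf-8").decode(codepage)


# à is encoded as C3 A0, which cp1252 shows as Ã plus a no-break space.
print(as_ansi(composed))
# U+0300 is encoded as CC 80, which cp1252 shows as Ì€ after the plain a.
print(as_ansi(decomposed))

# Normalizing both strings to the same form (NFC here) makes them equal.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

To make the two lines match in a search, one option is to normalize the file's text to a single form (NFC or NFD) before saving it; whether Notepad++ can do this directly or needs a plugin or an external script is not something this sketch covers.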