What causes the ГѓВ pattern in this Mojibake?


Google ГѓВ (UTF-8: D0 93 D1 93 D0 92) and you'll see a few examples of what seems to be Mojibake. A specific example is ö becoming ГѓВ¶.

What kind of encodings did the original ö go through to become ГѓВ¶? How would I figure this out?

Aly On BEST ANSWER

When searching for ГѓВ¶, you can hit a website about home decor with a post whose title is garbled in the same way. Throwing that title into an online mojibake decoder/fixer gives us the string äèçàéí èíòåðüåðîâ, which at first looks like garbage, but the mojibake decoder also gives us a list of steps:

[Image: Mojibake Decoder's list of steps]

We can follow these steps backwards with ö to see if we get the original ГѓВ¶:

  1. Encode "ö" into UTF-8: C3 B6
  2. Decode C3 B6 as Latin-1: "Ã¶"
  3. Encode "Ã¶" into UTF-8: C3 83 C2 B6
  4. Decode C3 83 C2 B6 as Windows-1251: "ГѓВ¶"
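
The four steps can also be reproduced with a short Python sketch. The function name `garble` is made up for illustration, and `latin-1` and `cp1251` are the Python codec names assumed here for Latin-1 and Windows-1251:

```python
def garble(text: str) -> str:
    """Apply the four steps: UTF-8 -> Latin-1 -> UTF-8 -> Windows-1251."""
    step1 = text.encode("utf-8")      # "ö"  -> C3 B6
    step2 = step1.decode("latin-1")   # C3 B6 -> "Ã¶"
    step3 = step2.encode("utf-8")     # "Ã¶" -> C3 83 C2 B6
    return step3.decode("cp1251")     # C3 83 C2 B6 -> "ГѓВ¶"


print(garble("ö"))  # ГѓВ¶
```

Strict decoding is used here, so input that produces a byte the code page leaves undefined would raise an error rather than return a string.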

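So, the ГѓВ (C3 83 C2) pattern is caused specifically by characters in the C3 80-C3 BF range of UTF-8, or Unicode codepoints 00C0-00FF. The lead byte C3 read as Latin-1 is "Ã" (U+00C3), which re-encodes to C3 83; the second byte (80-BF) becomes a codepoint in 0080-00BF, whose UTF-8 encoding always starts with C2. Read as Windows-1251, C3 83 C2 is ГѓВ.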
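
As a rough check of that range claim, the sketch below runs codepoints through the same conversion and tests for the prefix. The helper name `double_garble` and the scan limit of U+024F are arbitrary choices for illustration, and `errors="replace"` is used only so that bytes undefined in the code page do not abort the loop:

```python
PREFIX = "\u0413\u0453\u0412"  # the three characters "ГѓВ"


def double_garble(ch: str) -> str:
    """UTF-8 -> Latin-1 -> UTF-8 -> Windows-1251, tolerating undefined bytes."""
    raw = ch.encode("utf-8").decode("latin-1").encode("utf-8")
    return raw.decode("cp1251", errors="replace")


# Every codepoint in U+00C0..U+00FF should produce the prefix.
in_range = all(
    double_garble(chr(cp)).startswith(PREFIX) for cp in range(0xC0, 0x100)
)

# Nearby codepoints outside that range should not.
outside = [
    chr(cp)
    for cp in range(0x80, 0x250)
    if not 0xC0 <= cp <= 0xFF and double_garble(chr(cp)).startswith(PREFIX)
]

print(in_range, outside)  # True []
```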

Here is a CyberChef for the forwards conversion and another for the backwards conversion.
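
The backwards conversion can be sketched in Python as well, as an alternative to the CyberChef recipe rather than a copy of it. The function name `repair` is hypothetical, and the sketch assumes the text went through exactly the four steps listed above:

```python
def repair(garbled: str) -> str:
    """Undo UTF-8 -> Latin-1 -> UTF-8 -> Windows-1251 by reversing each step."""
    raw = garbled.encode("cp1251")    # "ГѓВ¶" -> C3 83 C2 B6
    once = raw.decode("utf-8")        # C3 83 C2 B6 -> "Ã¶"
    inner = once.encode("latin-1")    # "Ã¶" -> C3 B6
    return inner.decode("utf-8")      # C3 B6 -> "ö"


mojibake = "\u0413\u0453\u0412\u00b6"  # "ГѓВ¶", written with escapes to avoid copy errors
print(repair(mojibake))  # ö
```

Text that was garbled by a different chain of encodings would either raise an error or come out wrong, so the result is worth checking by eye.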