What transliterates € -> EU and ™ -> (tm)?

On the streaming service Kanopy, I have noticed that some descriptions contain strange-looking text.

[Image: screenshot from Kanopy showing mojibake]

Screenshot shows:

BogieaEU(tm)s

and

HepburnaEU(tm)s

After a little digging, my theory is this:

The text started as

Bogie’s

using ’ (U+2019) as the apostrophe.

This was saved as UTF-8, giving the bytes [0xe2 0x80 0x99] for the apostrophe.

That sequence of bytes was then decoded as Windows-1252 or some similar single-byte encoding, so that the output was â€™
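The steps above can be reproduced in a few lines of Python. This is a minimal sketch that assumes the wrong decoding really was Windows-1252 (codec name `cp1252` in Python); the variable names are only illustrative.

```python
# Sketch of the suspected mojibake step, assuming the wrong codec was Windows-1252.
original = "Bogie\u2019s"                # U+2019 as the apostrophe

utf8_bytes = original.encode("utf-8")    # the apostrophe becomes 0xe2 0x80 0x99
mojibake = utf8_bytes.decode("cp1252")   # each byte is read as one Windows-1252 character

# Show the three apostrophe bytes and the three characters they turn into.
print([hex(b) for b in utf8_bytes[5:8]])
print([f"U+{ord(c):04X}" for c in mojibake[5:8]])
```

The three code points printed at the end are â (U+00E2), € (U+20AC) and ™ (U+2122).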

So far, this is standard mojibake and I am not asking about that.

Some process then converted â€™ into aEU(tm).

That is:

â -> a
€ -> EU
™ -> (tm)

Looks like some kind of transliteration, converting Unicode into an ASCII approximation.
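To make the suspected last step concrete, here is a minimal sketch using a made-up three-entry table. `SUSPECTED_TABLE` and `asciify` are hypothetical names for illustration only; the table contains just the three mappings observed above and is not taken from any real library.

```python
# Hypothetical table: only the three mappings observed in the screenshot.
SUSPECTED_TABLE = str.maketrans({
    "\u00e2": "a",     # â
    "\u20ac": "EU",    # €
    "\u2122": "(tm)",  # ™
})

def asciify(text: str) -> str:
    """Apply the hypothetical table; all other characters pass through unchanged."""
    return text.translate(SUSPECTED_TABLE)

# Recreate the mojibake (assuming Windows-1252), then transliterate it.
mojibake = "Bogie\u2019s".encode("utf-8").decode("cp1252")
print(asciify(mojibake))  # BogieaEU(tm)s
```

Feeding the same three characters to a candidate library and comparing the result with this output is one way to rule that library in or out.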

I am wondering about the precise piece of software that is doing this conversion. I can't find it!

For example, iconv is a very popular library for doing transliteration.

But the example on the PHP manual page, https://www.php.net/manual/en/function.iconv.php, shows that iconv converts € into EUR, not EU.

So iconv is not doing this transliteration.

unidecode is another popular library.

But unidecode also converts € into EUR:

>>> from unidecode import unidecode
>>> unidecode('\u00e2\u20ac\u2122')
'aEUR(tm)'

Is it possible to find the precise piece of software that transliterates € -> EU and ™ -> (tm)?
