I have a string that came from an old database of unknown character encoding. I am having trouble encoding/filtering the string to show the correct text.
What the data looks like in the database: MarronniÃ¨re Ã  quatre pans
What we need the string to show up as: Marronnière à quatre pans
Specifically, I am having trouble parsing the string so I can display the character à (Ã )
This is an ASP.NET 2.0 site written in VB using a SQL Server 2005 database. Not sure if it matters, but the data comes from a column with this collation: SQL_Latin1_General_CP1_CI_AS
I've tried encoding the string to various encodings in the code to no avail. I've also passed the string (encoded different ways) into a byte array to find a unique byte pattern for the bad characters without success.
Any ideas or leads would be greatly appreciated, thanks.
It sounds like the collation in the SQL Server database doesn't match the character encoding that was actually used :( It's a fairly common mistake for careless developers.
That's why the SQL Server administration tools are showing weird characters rather than the strings that you are expecting.
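To make the mismatch concrete, here is a minimal Python sketch of one way such garbling can arise and be reversed. It assumes (this is a guess, not something confirmed about this database) that the text was originally written as UTF-8 bytes and is being read back as Windows-1252. The helper names `garble` and `repair` are made up for illustration.

```python
# Assumption: the stored bytes are UTF-8 but are being decoded as Windows-1252.
SAMPLE = "Marronnière à quatre pans"


def garble(text: str) -> str:
    """Simulate the bug: write text as UTF-8, then misread the bytes as cp1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo the bug: recover the raw bytes via cp1252, then decode them as UTF-8.

    Raises UnicodeEncodeError or UnicodeDecodeError if the input was not
    produced by this particular mix-up, which is itself a useful signal.
    """
    return text.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    broken = garble(SAMPLE)
    print(repr(broken))    # repr makes the non-breaking space after 'Ã' visible
    print(repair(broken))  # round-trips back to SAMPLE
```

If the round trip reproduces the expected text, the same idea should carry over to VB.NET: get the bytes with `Encoding.GetEncoding(1252).GetBytes(...)` and decode them with `Encoding.UTF8.GetString(...)`. If `repair` raises an error on real data instead, the UTF-8 guess is probably wrong.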
Possibly it is UTF-8? In UTF-8, è is represented by the bytes 0xC3 0xA8, which would be interpreted under the Windows Latin-1 code page (1252) as Ã¨. I know nothing about SQL Server collations, but it seems likely that SQL_Latin1_General_CP1_CI_AS is similar to Windows "Latin-1".
You either need to