Here is what I did:
- I dumped a SQLite database with UTF-8 data (
sqlite3 example.db .dump > dump.sql), but since this was in powershell, I assume the piping converted it to windows-1252 - I loaded that dumped data into a new database, again using powershell (
Get-Content dump.sql | sqlite3 example2.db) - I dumped that new database and am left with a new
.sqlfile (this time it was not through powershell - so I assume it was unmodified)
This new sql file's UTF-8 characters are seriously mangled, and I was wondering if there was a way to convert it back into correct UTF-8.
As a few examples, here are what some sequences are in the new file, and what they should be (all are viewed as UTF-8):
ÒüéÒü¬ÒüƒÒü½should beあなたに´╝üshould be a full width exclamation markÒé¡Òé╗Òé¡should beキセキ
Does anyone have any idea as to how I might undo this mangling? Any method would be very helpful!
This is in powershell 7.0.1
Edit:
On further inspection, you can duplicate my predicament by redirecting any such data to a file in powershell (note that the data cannot itself be entered in powershell). Hence, setting up a script like this gives the same outcome:
test.sh
#!/bin/bash
echo "キ"
And then running wsl ./test.sh > test.txt will give an output of Òé¡, not キ
Edit 2:
It seems as if the codepage the UTF-8 text was converted to is almost 437: some characters are restored using this assumption (e.g. 木), but others are not. If it's close to 437, but isn't, what could it be?
It turns out, since I am in the UK, the codepage I wanted was 850. Saving the file as 850 and then reloading it as UTF-8 fixed my issue!