Situation
I have a large existing Win32 C++ code-base, and I want to make it portable so that it compiles and runs on both Windows (MSVC) and Linux (GCC).
For a new project I would try to go UTF-8 Everywhere, but this existing code-base already stores and processes its text in std::wstring as UTF-16.
So I expect to cause less upheaval, and have less risk of breaking existing behavior on Windows, if I keep it that way and try to work with it.
Plan
So this is what text handling would look like once the code-base is cross-platform:
- Use std::wstring for storing text in memory, and operate on it using standard library functionality that accepts std::wstring/wchar_t.
  - On Windows, this means UTF-16 (with 2 bytes per code unit).
  - On Linux, this means UTF-32 (with 4 bytes per code unit).
- At the program's input/output boundaries, where text must be converted to/from other encodings, have #ifdefs to do the correct thing on each platform (see the conversion sketch after this list).
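To make the last point concrete, here is a minimal sketch of what such a boundary helper might look like, assuming UTF-8 as the external encoding. The names Utf8ToWide/WideToUtf8 are made up for illustration, and error handling is omitted.

    #include <string>

    #ifdef _WIN32
      #include <windows.h>

      std::wstring Utf8ToWide(const std::string& utf8)
      {
          if (utf8.empty()) return std::wstring();
          int len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), (int)utf8.size(), nullptr, 0);
          std::wstring wide(len, L'\0');
          MultiByteToWideChar(CP_UTF8, 0, utf8.data(), (int)utf8.size(), &wide[0], len);
          return wide;   // UTF-16 in a 2-byte wchar_t
      }

      std::string WideToUtf8(const std::wstring& wide)
      {
          if (wide.empty()) return std::string();
          int len = WideCharToMultiByte(CP_UTF8, 0, wide.data(), (int)wide.size(), nullptr, 0, nullptr, nullptr);
          std::string utf8(len, '\0');
          WideCharToMultiByte(CP_UTF8, 0, wide.data(), (int)wide.size(), &utf8[0], len, nullptr, nullptr);
          return utf8;
      }
    #else
      #include <codecvt>   // deprecated since C++17, but still shipped by libstdc++
      #include <locale>

      std::wstring Utf8ToWide(const std::string& utf8)
      {
          std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
          return conv.from_bytes(utf8);   // UTF-32 in a 4-byte wchar_t
      }

      std::string WideToUtf8(const std::wstring& wide)
      {
          std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
          return conv.to_bytes(wide);
      }
    #endif

The intent is that all platform-specific conversion lives behind this one pair of functions, so the rest of the code only ever sees std::wstring.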
Question
What are the downsides/problems of this approach?
Already considered
Problems I already considered:
- Higher memory usage compared to UTF-8.
- Per-code-unit processing like std::tolower will behave differently on the two platforms if there are Unicode code points outside the Basic Multilingual Plane (e.g., U+1F600 is a single wchar_t on Linux but a surrogate pair of two wchar_t units on Windows).
- Some std::wstring-accepting overloads used by the current code-base, such as the std::ifstream(std::wstring, ...) constructor, are actually Microsoft-specific extensions and not available on Linux/GCC, so extra platform-specific #ifdefs will be necessary in those places (see the sketch after this list).
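For the ifstream case, I expect the #ifdef to look roughly like this (a hypothetical OpenForReading wrapper, reusing the assumed WideToUtf8 helper from the earlier sketch):

    #include <fstream>
    #include <string>

    std::string WideToUtf8(const std::wstring& wide);  // from the earlier sketch

    // Hypothetical wrapper around the MSVC-only wide-path constructor.
    std::ifstream OpenForReading(const std::wstring& path)
    {
    #ifdef _WIN32
        // MSVC extension: the wide-string overload keeps non-ASCII paths intact.
        return std::ifstream(path, std::ios::binary);
    #else
        // On Linux, file APIs take narrow (byte) paths; convert at the boundary.
        return std::ifstream(WideToUtf8(path), std::ios::binary);
    #endif
    }

If both toolchains are recent enough for C++17, std::filesystem::path might avoid some of these #ifdefs, since it can be constructed from std::wstring and std::ifstream accepts a path on both platforms.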
But aside from that?