Deprecated header <codecvt> replacement


A bit of background: my task required converting a UTF-8 XML file to UTF-16 (with a proper header, of course). So I searched for the usual ways of converting UTF-8 to UTF-16, and found out that one should use the templates from <codecvt>.

But now that it is deprecated, I wonder what the new common way of doing the same task is.

(I don't mind using Boost at all, but other than that I prefer to stay as close to the standard library as possible.)

5

There are 5 best solutions below

7
eerorika On BEST ANSWER

The std::codecvt template from <locale> itself isn't deprecated. For UTF-8 to UTF-16, there is still the std::codecvt<char16_t, char, std::mbstate_t> specialization.

However, since std::wstring_convert and std::wbuffer_convert are deprecated along with the standard conversion facets, there isn't any easy way to convert strings using the facets.

So, as Bolas already answered: implement it yourself (or use a third-party library, as always), or keep using the deprecated API.
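For illustration, here is a minimal sketch of driving the facet directly, without std::wstring_convert. The function name utf8_to_utf16_facet is made up for this example, and the error handling is deliberately crude. It assumes the facet is fetched from the classic locale, which is required to provide this specialization. Be aware that C++20 in turn deprecates this char-based specialization in favor of a char8_t one, so this is a stopgap rather than a long-term answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a standard function): UTF-8 -> UTF-16 via the codecvt facet.
std::u16string utf8_to_utf16_facet(const std::string& in)
{
    using facet_type = std::codecvt<char16_t, char, std::mbstate_t>;

    // The facet has a protected destructor, so borrow the instance owned by a locale.
    const facet_type& facet = std::use_facet<facet_type>(std::locale::classic());

    // UTF-16 never needs more code units than the UTF-8 input has bytes.
    std::u16string out(in.size(), u'\0');
    if (in.empty()) {
        return out;
    }

    std::mbstate_t state{};
    const char* from_next = nullptr;
    char16_t* to_next = nullptr;

    const std::codecvt_base::result res = facet.in(
        state,
        in.data(), in.data() + in.size(), from_next,
        &out[0], &out[0] + out.size(), to_next);

    if (res != std::codecvt_base::ok) {
        throw std::runtime_error("invalid or truncated UTF-8 input");
    }
    assert(from_next == in.data() + in.size());  // ok means all input was consumed

    // Trim the buffer to the number of code units actually written.
    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```

Calling utf8_to_utf16_facet("a\xC3\xA9") should give the two code units 'a' and U+00E9. The amount of bookkeeping is exactly why this doesn't count as an easy way.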

6
Nicol Bolas On

The new way is... you write it yourself. Or just rely on deprecated functionality. Hopefully, the standards committee won't actually remove codecvt until there is a functioning replacement.

But at present, there isn't one.
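To give an idea of what "write it yourself" involves, here is a rough sketch of a hand-rolled converter. Both function names are invented for this example. It assumes you want strict behavior (throwing on malformed input rather than substituting U+FFFD) and little-endian output with a byte order mark, which is one reading of the "proper header" in the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 and re-encodes the code points as UTF-16.
std::u16string utf8_to_utf16(const std::string& in)
{
    std::u16string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (len > in.size() - i)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong forms, surrogate code points, and values past U+10FFFF.
        static const std::uint32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid UTF-8 code point");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}

// Hypothetical helper: serializes UTF-16 code units as little-endian bytes,
// preceded by the FF FE byte order mark.
std::string utf16le_with_bom(const std::u16string& text)
{
    std::string bytes("\xFF\xFE");
    for (char16_t unit : text) {
        bytes.push_back(static_cast<char>(unit & 0xFF));
        bytes.push_back(static_cast<char>(unit >> 8));
    }
    return bytes;
}
```

Writing utf16le_with_bom(utf8_to_utf16(xml)) to a file opened in binary mode would produce a UTF-16LE document. For XML, the encoding named in the declaration would still have to be updated separately.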

7
xmllmx On

Don't worry about that.

According to the same information source:

"this library component should be retired to Annex D, along side <codecvt>, until a suitable replacement is standardized."

So, you can still use it until a new standardized, more-secure version is done.

2
BullyWiiPlaza On

Since nobody has really answered the question with usable replacement code, here is one option, though it works only on Windows:

#include <string>
#include <stdexcept>
#include <Windows.h>

std::wstring string_to_wide_string(const std::string& string)
{
    if (string.empty())
    {
        return L"";
    }

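    // First call: ask how many UTF-16 code units the converted text needs.
    const auto size_needed = MultiByteToWideChar(CP_UTF8, 0, string.data(), static_cast<int>(string.size()), nullptr, 0);
    if (size_needed <= 0)
    {
        throw std::runtime_error("MultiByteToWideChar() failed: " + std::to_string(GetLastError()));
    }

    // Second call: convert into a buffer of exactly that size.
    std::wstring result(size_needed, 0);
    if (MultiByteToWideChar(CP_UTF8, 0, string.data(), static_cast<int>(string.size()), result.data(), size_needed) <= 0)
    {
        throw std::runtime_error("MultiByteToWideChar() failed: " + std::to_string(GetLastError()));
    }
    return result;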
}

std::string wide_string_to_string(const std::wstring& wide_string)
{
    if (wide_string.empty())
    {
        return "";
    }

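    // First call: ask how many UTF-8 bytes the converted text needs.
    const auto size_needed = WideCharToMultiByte(CP_UTF8, 0, wide_string.data(), static_cast<int>(wide_string.size()), nullptr, 0, nullptr, nullptr);
    if (size_needed <= 0)
    {
        throw std::runtime_error("WideCharToMultiByte() failed: " + std::to_string(GetLastError()));
    }

    // Second call: convert into a buffer of exactly that size.
    std::string result(size_needed, 0);
    if (WideCharToMultiByte(CP_UTF8, 0, wide_string.data(), static_cast<int>(wide_string.size()), result.data(), size_needed, nullptr, nullptr) <= 0)
    {
        throw std::runtime_error("WideCharToMultiByte() failed: " + std::to_string(GetLastError()));
    }
    return result;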
}
1
Richard Day On

Well, having options is what you need, so yes, you CAN still use it. I wrap it in an #ifdef. In some places it's OK, and in others you're not allowed to use it. You still need to convert though, right?

#ifdef _CODECVT_  // <codecvt> was included; the header is deprecated in C++17
_STL_DISABLE_DEPRECATED_WARNING
_NODISCARD std::optional<std::wstring> StrA2W(const std::string& Data)
{
    assert(Data.length() > 0);
    if (Data.length() == 0) {
        return std::nullopt;
    }
    using convert_typeX = std::codecvt_utf8<wchar_t>;
    std::wstring_convert<convert_typeX, wchar_t> converterX;
    return converterX.from_bytes(Data);
}

_NODISCARD std::optional<std::string>  StrW2A(const std::wstring& Data)
{
    assert(Data.length() > 0);
    if (Data.length() == 0) {
        return std::nullopt;
    }
    using convert_typeX = std::codecvt_utf8<wchar_t>;
    std::wstring_convert<convert_typeX, wchar_t> converterX;
    return converterX.to_bytes(Data);
}
_STL_RESTORE_DEPRECATED_WARNING

#else // !_CODECVT_: <codecvt> was not included, fall back to the Win32 API

_NODISCARD std::optional<std::string> StrW2A(const std::wstring& Data) { // convert from LPWSTR to LPSTR
    assert(Data.length() > 0);
    if (Data.length() == 0) {
        return std::nullopt;
    }
    // CP_UTF8 matches the codecvt_utf8 branch above; GetACP() would select the ANSI code page instead.
    const int srclen = static_cast<int>(Data.size());
    const int buflen = WideCharToMultiByte(CP_UTF8, 0, Data.data(), srclen, nullptr, 0, nullptr, nullptr);
    if (buflen <= 0) {
        return std::nullopt; // conversion failed
    }
    std::string dest(buflen, '\0');
    const int written = WideCharToMultiByte(CP_UTF8, 0, Data.data(), srclen, dest.data(), buflen, nullptr, nullptr);
    if (written != buflen) {
        return std::nullopt;
    }
    return dest;
}

_NODISCARD std::optional<std::wstring> StrA2W(const std::string& Data) { // convert from LPSTR to LPWSTR
    assert(Data.length() > 0);
    if (Data.length() == 0) {
        return std::nullopt;
    }
    // CP_UTF8 matches the codecvt_utf8 branch above; GetACP() would select the ANSI code page instead.
    const int srclen = static_cast<int>(Data.size());
    const int buflen = MultiByteToWideChar(CP_UTF8, 0, Data.data(), srclen, nullptr, 0);
    if (buflen <= 0) {
        return std::nullopt; // conversion failed
    }
    std::wstring dest(buflen, L'\0');
    const int written = MultiByteToWideChar(CP_UTF8, 0, Data.data(), srclen, dest.data(), buflen);
    if (written != buflen) {
        return std::nullopt;
    }
    return dest;
}

#endif // _CODECVT_

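If you include the <codecvt> header before this code, the codecvt-based functions are used; otherwise the Win32-based functions are used instead. The switch is transparent and no other change is required. Note that _CODECVT_ appears to be the include guard of MSVC's <codecvt>, so the check is specific to that standard library implementation.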