I have some code that reads in a Unicode codepoint (as escaped in a string, e.g. 0xF00). Since I'm using Boost, I'm wondering whether the following is the best (and correct) approach:
```cpp
unsigned int codepoint{0xF00};
boost::locale::conv::utf_to_utf<char>(&codepoint, &codepoint + 1);
```
?
As mentioned, a codepoint in this form is (conveniently) UTF-32, so what you're looking for is a transcoding.
For a solution that does not rely on functions deprecated since C++17, isn't really ugly, and doesn't require a hefty third-party library, you can use the very lightweight UTF8-CPP (four small headers!) and its function `utf8::utf32to8`. It's going to look something like this:
(There's also a `utf8::unchecked::utf32to8`, if you're allergic to exceptions.)

(And consider reading into a `std::vector<char8_t>` or `std::u8string`, since C++20.)

(Finally, note that I've specifically used `uint32_t` to ensure the input has the proper width.)

I tend to use this library in projects until I need something a little heavier for other purposes (at which point I'll typically switch to ICU).