Converting string to wstring: encoding problems
I've read Stroustrup's Appendix D (paying particular attention to Locales and Codecvt). Stroustrup does not give a good codecvt and widen example (IMHO). I've been trying to adapt code from the internet, with no joy. I've also tried imbuing stringstreams, without success.
Would anyone be able to show (and explain) the code to go from a UTF-8 to a UTF-16 (or UTF-32) encoding? NOTE: I do not know the size of the input/output string in advance, so I expect the solution should use reserve and a back_inserter. Please don't use out.resize(in.length()*2).
When finished, it would be great if the code actually worked (it's amazing how much broken code is out there). Please make sure the following round-trips. The bytes below are the Han character for 'bone' in UTF-8 and UTF-{16|32}.
const std::string n("\xe9\xaa\xa8");
const std::wstring w = L"\u9aa8";
My apologies for a basic question. On Windows, I use the Win32 API and don't have these problems moving between encodings.
Comments (2)
Just use UTF8-CPP:
Caveat: this will only work where wchar_t is 2 bytes long (Windows).
For a portable solution you could do:
But then you're losing the string support. Hopefully, we'll get char16_t soon enough.
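char16_t and std::u16string have been part of the standard since C++11, so the portable variant no longer has to give up string support. What follows is a minimal, library-free sketch of the conversion the question asks for, using reserve and a back_inserter. It is not UTF8-CPP's code: the names decode_utf8, utf8_to_utf16 and utf16_to_utf8 are hypothetical, and validation is deliberately minimal (overlong forms and encoded surrogates are not rejected).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode one code point from [it, end), advancing `it`.
// Assumption: minimal validation only.
inline char32_t decode_utf8(std::string::const_iterator& it,
                            std::string::const_iterator end) {
    const unsigned char lead = static_cast<unsigned char>(*it++);
    char32_t cp;
    int extra;  // number of continuation bytes that must follow
    if (lead < 0x80)                { cp = lead;        extra = 0; }
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
    else throw std::runtime_error("invalid UTF-8 lead byte");
    for (; extra > 0; --extra) {
        if (it == end) throw std::runtime_error("truncated UTF-8 sequence");
        const unsigned char c = static_cast<unsigned char>(*it++);
        if ((c & 0xC0) != 0x80) throw std::runtime_error("invalid continuation byte");
        cp = (cp << 6) | (c & 0x3F);
    }
    if (cp > 0x10FFFF) throw std::runtime_error("code point out of range");
    return cp;
}

inline std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    // UTF-16 never needs more code units than the UTF-8 input has bytes.
    out.reserve(in.size());
    auto sink = std::back_inserter(out);
    for (auto it = in.begin(); it != in.end();) {
        char32_t cp = decode_utf8(it, in.end());
        if (cp < 0x10000) {
            *sink++ = static_cast<char16_t>(cp);
        } else {  // outside the BMP: emit a surrogate pair
            cp -= 0x10000;
            *sink++ = static_cast<char16_t>(0xD800 + (cp >> 10));
            *sink++ = static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}

inline std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    out.reserve(in.size() * 3);  // at most 3 bytes per UTF-16 code unit
    auto sink = std::back_inserter(out);
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {  // high surrogate: needs a low one
            if (i + 1 >= in.size() || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                throw std::runtime_error("unpaired surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        if (cp < 0x80) {
            *sink++ = static_cast<char>(cp);
        } else if (cp < 0x800) {
            *sink++ = static_cast<char>(0xC0 | (cp >> 6));
            *sink++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            *sink++ = static_cast<char>(0xE0 | (cp >> 12));
            *sink++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *sink++ = static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            *sink++ = static_cast<char>(0xF0 | (cp >> 18));
            *sink++ = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            *sink++ = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            *sink++ = static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With these helpers, utf8_to_utf16("\xe9\xaa\xa8") should compare equal to u"\u9aa8", and utf16_to_utf8 should map it back to the original three bytes. Writing through an output iterator keeps the size of the result out of the caller's hands; reserve is only a hint that avoids repeated reallocation.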
It seems pretty obvious that he was smoking weed. As for the codepage conversions, look no further than iconv!
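For the iconv route, here is a rough sketch, assuming a POSIX system whose iconv takes a char** input pointer (as glibc does) and which knows the encoding names "UTF-8" and "UTF-16LE". The wrapper name iconv_convert is hypothetical. The output is collected in fixed-size chunks, so the final size does not need to be known in advance.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert `in` from encoding `from` to encoding `to`
// and return the converted bytes. Assumption: stateless encodings such as
// UTF-8 and UTF-16LE; stateful encodings would also need a final flush call.
inline std::string iconv_convert(const std::string& in,
                                 const char* from, const char* to) {
    iconv_t cd = iconv_open(to, from);  // note the order: target first
    if (cd == (iconv_t)(-1)) throw std::runtime_error("iconv_open failed");

    std::string out;
    char buf[256];
    // iconv does not modify the input bytes, but its signature is non-const.
    char* src = const_cast<char*>(in.data());
    std::size_t src_left = in.size();

    while (src_left > 0) {
        char* dst = buf;
        std::size_t dst_left = sizeof buf;
        const std::size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
        const int err = errno;  // save before any call that may change it
        out.append(buf, sizeof buf - dst_left);
        // E2BIG only means the chunk buffer filled up; loop again.
        if (rc == static_cast<std::size_t>(-1) && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv: invalid or incomplete input");
        }
    }
    iconv_close(cd);
    return out;
}
```

Under those assumptions, converting "\xe9\xaa\xa8" from "UTF-8" to "UTF-16LE" should yield the two bytes A8 9A, and converting those back should reproduce the original string.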