Chinese conversion with MultiByteToWideChar

Posted on 2025-02-10 02:58:20

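I am trying to display Chinese text in MessageBoxW, but I cannot convert it correctly from UTF-8 to wchar_t. A raw wchar_t Chinese string, by contrast, displays correctly. I have tried different MultiByteToWideChar flags, but the result is the same. What is causing the incorrect conversion?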

安静被遗忘 2025-02-17 02:58:20

char text[] = "文本" is UTF-8 only if the source file is encoded in UTF-8. Since your title string displays correctly, your source file is in the default legacy Chinese encoding on Windows. The text string therefore contains bytes in that encoding, not UTF-8, so MultiByteToWideChar cannot convert them as CP_UTF8. You can confirm this by setting the flag that checks for invalid characters: the function then returns zero (and GetLastError reports ERROR_NO_UNICODE_TRANSLATION) when the input is not really UTF-8:

int ret = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, text, -1, wtext, 1000);
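
To see why the check rejects the string, it can help to look at the raw bytes. The following is a minimal portable sketch of a UTF-8 well-formedness test, not the Windows implementation; is_valid_utf8 is a hypothetical helper name introduced here for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a Windows API): returns true if buf[0..len)
   is well-formed UTF-8. Rejects stray or missing continuation bytes,
   overlong forms, surrogates and values above U+10FFFF. */
bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    /* Smallest code point that legitimately needs 1, 2, 3 or 4 bytes. */
    static const unsigned long min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    size_t i = 0;

    while (i < len) {
        unsigned char b = buf[i];
        size_t extra;       /* number of continuation bytes expected */
        unsigned long cp;   /* code point being decoded */

        if (b < 0x80)                { extra = 0; cp = b; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; }
        else return false;  /* continuation byte or invalid lead byte */

        if (len - i - 1 < extra)
            return false;   /* sequence is cut off */

        for (size_t k = 1; k <= extra; ++k) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80)
                return false;   /* not of the form 10xxxxxx */
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min_cp[extra])           return false;  /* overlong */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  /* surrogate */
        if (cp > 0x10FFFF)                return false;  /* out of range */

        i += extra + 1;
    }
    return true;
}
```

For example, the UTF-8 bytes E6 96 87 E6 9C AC pass this check, while CE C4 B1 BE (the same two characters in code page 936, assuming that is the encoding of the source file) fail at the second byte, because C4 is not a continuation byte.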

The Microsoft compiler has options to specify the source and execution character sets, plus a /utf-8 option (recommended) that sets both:

/source-charset:<iana-name>|.nnnn      set source character set  
/execution-charset:<iana-name>|.nnnn   set execution character set  
/utf-8                                 set source and execution character set to UTF-8

There are several ways to fix this. #2 and #3 assume the Microsoft compiler; other compilers may vary.

1. Use char text[] = u8"文本"; this works because your existing default encoding supports Chinese. The source characters are interpreted in that encoding and then re-encoded as UTF-8 by the u8 prefix. If the source is sent to someone with a different OS default encoding, it will not work for them. (In C++20, u8 literals have type const char8_t[], so the declaration would need adjusting there.)
2. Re-save the source as UTF-8 with BOM. The MS compiler will detect the BOM (byte order mark, used as a UTF-8 signature) and decode the source as UTF-8, so the title will display correctly. Whether text ends up holding UTF-8 bytes also depends on the execution character set, which defaults to the system code page unless /utf-8 or /execution-charset:utf-8 is given, so compiling with /utf-8 as well is the safe choice.
3. Re-save as UTF-8 (no BOM) and compile with the /utf-8 switch, which tells the compiler to decode the source as UTF-8 instead of the default encoding.
4. Use ASCII-only source and escape codes to specify the Chinese characters explicitly.
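
Whichever option is used, one way to verify the result is to compare what the compiler actually stored for a narrow literal against the known UTF-8 bytes. This is a minimal sketch, assuming the file is saved in an encoding the compiler decodes correctly; narrow_literal_is_utf8 is a hypothetical helper name, not part of any API.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: reports whether the compiler encoded a narrow
   string literal as UTF-8 under the current build settings. */
bool narrow_literal_is_utf8(void)
{
    /* Encoded according to the execution character set in effect. */
    static const char literal[] = "文本";
    /* The same two characters as explicit UTF-8 bytes; hex escapes are
       not affected by source or execution character set. */
    static const char expected[] = "\xe6\x96\x87\xe6\x9c\xac";

    return sizeof literal == sizeof expected &&
           memcmp(literal, expected, sizeof expected) == 0;
}
```

If this returns false under your build settings, text holds legacy-encoded bytes and CP_UTF8 is the wrong code page to pass to MultiByteToWideChar.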

Example of #4 that will compile correctly no matter the OS default encoding:

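#include <windows.h>

int main() {
    /* UTF-8 bytes for U+6587 U+672C, spelled out as hex escapes */
    char text[] = "\xe6\x96\x87\xe6\x9c\xac";
    wchar_t wtext[1000];
    /* Returns 0 on failure, e.g. if text were not valid UTF-8 */
    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, text, -1, wtext, 1000) == 0)
        return 1;
    /* The title uses universal character names, so it is ASCII-only too */
    MessageBoxW(NULL, wtext, L"\u6a19\u984c", MB_OK);
    return 0;
}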
