Wide characters on Windows

Posted 2024-12-19 08:36:51


Windows defines the wchar_t symbol to be 16 bits long. However, the UTF-16 encoding used tells us that some symbols may actually be encoded with 4 bytes (32 bits).

Does this mean that if I'm developing an application for Windows, the following statement:

wchar_t symbol = ... // Whatever

might only represent a part of the actual symbol?

And what will happen if I do the same under *nix, where wchar_t is 32 bits long?


粉红×色少女 2024-12-26 08:36:51

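Yes, it means that symbol may hold only one half of a surrogate pair on Windows. On *nix systems, where wchar_t is 32 bits long, it can hold any Unicode code point. Note, however, that a Unicode code point is not the same thing as a character: some characters are encoded by more than one code point (for example, a base letter followed by a combining accent), so counting characters this way doesn't make sense at all. In particular, this means it doesn't make sense to use anything other than UTF-8 encoded narrow-character strings anywhere outside Unicode libraries, even on Windows.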
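To make the surrogate-pair point concrete, here is a minimal sketch in C++. It uses char16_t rather than wchar_t so that it behaves the same on every platform; the function names (encode_utf16, is_surrogate, wchar_holds_any_code_point) are made up for this illustration and do not come from any library. It assumes the input is a valid code point (at most 0x10FFFF and not itself a surrogate).

```cpp
#include <limits>
#include <string>

// Encode one Unicode code point as UTF-16 code units.
// Assumption: cp is a valid scalar value (<= 0x10FFFF, not a surrogate).
std::u16string encode_utf16(char32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: fits in a single 16-bit unit.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary planes: split the remaining 20 bits into two halves.
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));  // low surrogate
    }
    return out;
}

// True if a single 16-bit unit is only one half of a surrogate pair.
bool is_surrogate(char16_t unit) {
    return unit >= 0xD800 && unit <= 0xDFFF;
}

// True if one wchar_t on this platform can store every code point.
// Expected to be false where wchar_t is 16 bits and true where it is 32 bits.
bool wchar_holds_any_code_point() {
    return static_cast<unsigned long>(std::numeric_limits<wchar_t>::max()) >= 0x10FFFFul;
}
```

With this sketch, encode_utf16(U'A') yields one unit, while encode_utf16(0x1F600) yields the two units 0xD83D and 0xDE00. A single 16-bit wchar_t can hold only one of those two, which is the "part of the actual symbol" case from the question.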

Read this old thread for details.
