What does the size of a C++ char16_t depend on?

Posted on 2024-11-16 13:17:40

This question also applies to char32_t and the intXX_t types. The specification states:

2.14.3.2:

The value of a char16_t literal
containing a single c-char is equal to
its ISO 10646 code point value,
provided that the code point is
representable with a single 16-bit
code unit.

5.3.3.1:

[..] in particular [..]
sizeof(char16_t), sizeof(char32_t),
and sizeof(wchar_t) are
implementation-defined

I cannot see anything about the intXX_t types, apart from the comment that they are "optional" (18.4.1).

If a char16_t isn't guaranteed to be 2 bytes, is it guaranteed to be 16 bits (even on architectures where 1 byte != 8 bits)?

Comments (3)

老子叫无熙 2024-11-23 13:17:40

3.9.1 Fundamental types [basic.fundamental]

Types char16_t and char32_t denote distinct types with the same size, signedness, and alignment as uint_least16_t and uint_least32_t, respectively, in <stdint.h>, called the underlying types.

This means char16_t is at least 16 bits wide (but may be wider).

But I also believe:

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

provides the same guarantee, though less explicitly: you have to know that ISO 10646 is the UCS (note that the UCS is compatible with, but not exactly the same as, Unicode).
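The relationship quoted from 3.9.1 can be checked at compile time. The following is a minimal sketch, assuming a C++11 or later compiler; the helper names `char16_value_bits` and `char32_value_bits` are made up for illustration and do not come from the standard.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// char16_t is a distinct type, not a typedef of its underlying type.
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t is a distinct type");

// Same size and alignment as the underlying types.
static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t), "size of char16_t");
static_assert(alignof(char16_t) == alignof(std::uint_least16_t), "alignment of char16_t");
static_assert(sizeof(char32_t) == sizeof(std::uint_least32_t), "size of char32_t");
static_assert(alignof(char32_t) == alignof(std::uint_least32_t), "alignment of char32_t");

// Same signedness: the uint_leastN_t types are unsigned.
static_assert(std::is_unsigned<char16_t>::value, "char16_t is unsigned");
static_assert(std::is_unsigned<char32_t>::value, "char32_t is unsigned");

// "At least" 16 and 32 value bits; more is allowed.
static_assert(std::numeric_limits<char16_t>::digits >= 16, "at least 16 bits");
static_assert(std::numeric_limits<char32_t>::digits >= 32, "at least 32 bits");

// Hypothetical helpers that report the actual number of value bits.
constexpr int char16_value_bits() { return std::numeric_limits<char16_t>::digits; }
constexpr int char32_value_bits() { return std::numeric_limits<char32_t>::digits; }
```

If the file compiles, every property holds on that implementation. The two helpers return the real widths, which on a common desktop platform would be expected to be exactly 16 and 32, although the standard only requires the lower bounds.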

国产ˉ祖宗 2024-11-23 13:17:40

The value of a char16_t literal containing a single c-char is equal to its ISO 10646 code point value, provided that the code point is representable with a single 16-bit code unit.

This is impossible to satisfy if char16_t isn't at least 16 bits wide, so by contradiction, it's guaranteed to be at least that wide.
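The argument can be illustrated with a few compile-time checks. This is only a sketch, assuming a C++11 or later compiler; `code_point_of` is a hypothetical helper, and the sample characters are arbitrary choices.

```cpp
#include <limits>

// A char16_t literal holding a single c-char has the value of its
// ISO 10646 code point.
static_assert(u'A' == 0x0041, "LATIN CAPITAL LETTER A");
static_assert(u'\u00E9' == 0x00E9, "LATIN SMALL LETTER E WITH ACUTE");
static_assert(u'\uFFFD' == 0xFFFD, "REPLACEMENT CHARACTER");

// For every such literal to keep its value, char16_t must be able to
// hold the largest value of a 16-bit code unit.
static_assert(std::numeric_limits<char16_t>::max() >= 0xFFFF,
              "char16_t holds any 16-bit code unit");

// Hypothetical helper: widen a char16_t to a plain integer for inspection.
constexpr unsigned long code_point_of(char16_t c) {
    return static_cast<unsigned long>(c);
}
```

A type narrower than 16 bits could not represent 0xFFFD, so the third assertion alone already rules out such an implementation.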

韶华倾负 2024-11-23 13:17:40

It can't be guaranteed to be exactly 16 bits, since there are platforms which don't support types that small (for example, DSPs often can't address anything smaller than their word size, which may be 24, 32 or 64 bits). Your first quote guarantees that it will be at least 16 bits.
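To see what a given implementation actually does, the storage width can be derived from `sizeof` and `CHAR_BIT`. This is a rough sketch, assuming a C++11 or later compiler; `char16_object_bits` and `has_exact_uint16` are illustrative names, and the `UINT16_MAX` test relies on that macro being defined only when the optional `uint16_t` type exists.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Storage width of char16_t in bits: sizeof counts bytes, and a byte
// is CHAR_BIT bits wide, which need not be 8.
constexpr std::size_t char16_object_bits() {
    return sizeof(char16_t) * CHAR_BIT;
}

// Reports whether the optional exact-width type uint16_t is provided.
constexpr bool has_exact_uint16() {
#ifdef UINT16_MAX
    return true;
#else
    return false;
#endif
}

static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(char16_object_bits() >= 16, "char16_t occupies at least 16 bits");
```

On a word-addressed DSP with a 32-bit byte, `sizeof(char16_t)` could be 1 while `char16_object_bits()` returns 32, and `has_exact_uint16()` would be expected to return false.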
