Why do multicharacter literals exist in C and C++?

Posted on 2024-09-27 19:18:48

I didn't know that C and C++ allow multicharacter literals: not 'c' (of type int in C and char in C++), but 'tralivali' (of type int!)

enum
{
    ActionLeft = 'left',
    ActionRight = 'right',
    ActionForward = 'forward',
    ActionBackward = 'backward'
};

Standard says:

C99 6.4.4.4p10: "The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined."

I found that they are widely used in the C4 engine, but I suppose they are not safe when it comes to platform-independent serialization. They can also be confusing because they look like strings. So what is the scope of usage of multicharacter literals? Are they useful for something? Are they in C++ just for compatibility with C code? Are they considered a bad feature, like the goto statement, or not?

病毒体 2024-10-04 19:18:48

It makes it easier to pick out values in a memory dump.

Example:

enum state { waiting, running, stopped };

vs.

enum state { waiting = 'wait', running = 'run.', stopped = 'stop' };

A memory dump after the following statement:

s = stopped;

might look like:

00 00 00 02 . . . .

in the first case, vs:

73 74 6F 70 s t o p

using multicharacter literals. (Of course, whether it says 'stop' or 'pots' depends on byte ordering.)
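
To see the effect without a debugger, here is a minimal sketch that prints the bytes of an enum value in memory order, the way a dump would show them. The helpers int_bytes_as_text and print_dump_line are hypothetical names made up for this illustration, and the sketch assumes a 4-byte int and a compiler that packs 'stop' the way GCC documents it.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* The enum from the example above; each value is an implementation-defined int. */
enum state { waiting = 'wait', running = 'run.', stopped = 'stop' };

/* Hypothetical helper: copy the raw bytes of an int, in memory order,
   into a NUL-terminated buffer so they can be read as text. */
void int_bytes_as_text(int value, char text[sizeof(int) + 1])
{
    memcpy(text, &value, sizeof value);
    text[sizeof value] = '\0';
}

/* Hypothetical helper: print one dump-style line, hex bytes then characters.
   Non-printable bytes are shown as '.', as most hex dumps do. */
void print_dump_line(int value)
{
    unsigned char bytes[sizeof value];
    size_t i;

    memcpy(bytes, &value, sizeof value);
    for (i = 0; i < sizeof value; ++i)
        printf("%02X ", (unsigned)bytes[i]);
    for (i = 0; i < sizeof value; ++i)
        printf("%c ", isprint(bytes[i]) ? bytes[i] : '.');
    putchar('\n');
}
```

Calling print_dump_line(stopped) should show the letters s, t, o, p in one order or the other. With GCC on a little-endian machine such as x86 I would expect the reversed form ('pots'), which is the byte-ordering caveat mentioned above.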

抹茶夏天i‖ 2024-10-04 19:18:48

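I don't know how extensively this is used, but "implementation-defined" is a big red flag to me. As far as I know, it could mean that the implementation is free to ignore your character designations and just assign normal incrementing values if it wanted to. It may do something "nicer", but you can't rely on that behavior across compilers (or even compiler versions). At least "goto" has predictable (if undesirable) behavior...

That's my 2c, anyway.

Edit: regarding "implementation-defined":

From Bjarne Stroustrup's C++ Glossary:

implementation defined - an aspect of C++'s semantics that is defined for each implementation rather than specified in the standard for every implementation. An example is the size of an int (which must be at least 16 bits but can be longer). Avoid implementation defined behavior whenever possible. See also: undefined. TC++PL C.2.

also...

undefined - an aspect of C++'s semantics for which no reasonable behavior is required. An example is dereferencing a pointer with a value zero. Avoid undefined behavior. See also: implementation defined. TC++PL C.2.

I believe this means the comment is correct: it should at least compile, although nothing beyond that is specified. Also note the advice in the definition.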

坏尐絯 2024-10-04 19:18:48

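Four-character literals I have seen and used. They map to 4 bytes = one 32-bit word. As said above, this is very useful for debugging purposes. They can be used in a switch/case statement with ints, which is nice.

This (4 chars) is pretty standard (i.e. supported by GCC and VC++ at least), although the results (the actual values compiled) may vary from one implementation to another.

But more than 4 chars? I wouldn't use that.

UPDATE: From the C4 page: "For our simple actions, we'll just provide an enumeration of some values, which is done in C4 by specifying four-character constants". So they are using 4-char literals, as was my case.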
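
As a rough illustration of the switch/case use, here is a small sketch. The action names are shortened to four characters each, and describe_action is a hypothetical function invented for this example; nothing here is taken from the C4 sources.

```c
/* Hypothetical action codes in the style of the question's enum,
   restricted to four characters so that each fits a 32-bit int. */
enum action {
    ActionLeft     = 'left',
    ActionRight    = 'rght',
    ActionForward  = 'frwd',
    ActionBackward = 'back'
};

/* Map an action code to a description. The case labels are ordinary
   int constants, so no string comparison is involved. */
const char *describe_action(int action)
{
    switch (action) {
    case ActionLeft:     return "turn left";
    case ActionRight:    return "turn right";
    case ActionForward:  return "move forward";
    case ActionBackward: return "move backward";
    default:             return "unknown action";
    }
}
```

Within one build, describe_action('left') and describe_action(ActionLeft) take the same branch, because the same compiler evaluates both literals. The numeric values behind them are still implementation-defined, so I would not write them to a file that another toolchain has to read.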

分開簡單 2024-10-04 19:18:48

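Multicharacter literals allow one to specify int values via the equivalent representation in characters. They are useful for enums, FourCC codes and tags, and non-type template parameters. With a multicharacter literal, a FourCC code can be typed directly into the source, which is handy.

The implementation in gcc is described at https://gcc.gnu.org/onlinedocs/cpp/Implementation-defined-behavior.html . Note that the value is truncated to the size of the type int, so 'efgh' == 'abcdefgh' if your ints are 4 chars wide, although gcc will issue a warning on the literal that overflows.

Unfortunately, gcc will issue a warning on all multicharacter literals if -pedantic is passed, as their behavior is implementation-defined. As you can see above, the equality of two multicharacter literals may change if you switch implementations.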
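
To make the truncation concrete, here is a small sketch that emulates the rule as I understand it from the gcc page: the characters are processed one at a time, the previous value is shifted left by the width of a char, and the new character is OR-ed in. The function multichar_value is a hypothetical helper written for this illustration, not anything gcc provides, and it assumes an 8-bit char and a 32-bit int.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper emulating gcc's documented rule for a target with
   8-bit char and 32-bit int. Bits shifted past bit 31 are discarded,
   which models the truncation to the size of int. */
int32_t multichar_value(const char *chars)
{
    uint32_t value = 0;
    size_t i;

    for (i = 0; chars[i] != '\0'; ++i)
        value = (value << 8) | (unsigned char)chars[i];

    /* The final bit pattern is given type int, so it is signed. */
    return (int32_t)value;
}
```

Under these assumptions multichar_value("ab") equals 'a' * 256 + 'b', and multichar_value("abcdefgh") equals multichar_value("efgh"), because the first four characters are shifted out entirely. Another compiler may use a different rule, which is exactly why the equality is not portable.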