Why do multicharacter literals exist in C and C++?
I didn't know that C and C++ allow multicharacter literals: not 'c' (of type int in C and char in C++), but 'tralivali' (of type int!)
enum
{
ActionLeft = 'left',
ActionRight = 'right',
ActionForward = 'forward',
ActionBackward = 'backward'
};
The standard says:
C99 6.4.4.4p10: "The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined."
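The type difference mentioned above can be checked at compile time. This is a minimal sketch for C++ only; the helper names `single_char_size` and `multi_char_size` are hypothetical and exist just to make the sizes observable. In C, by contrast, `sizeof('c')` equals `sizeof(int)`.

```cpp
#include <cstddef>
#include <type_traits>

// In C++, an ordinary single-character literal has type char ...
static_assert(std::is_same<decltype('c'), char>::value,
              "single-character literal is char in C++");

// ... while a multicharacter literal has type int. Its value is
// implementation-defined, and GCC warns about it by default (-Wmultichar).
static_assert(std::is_same<decltype('ab'), int>::value,
              "multicharacter literal is int in C++");

// Hypothetical helpers that expose the sizes at run time.
constexpr std::size_t single_char_size() { return sizeof('c'); }
constexpr std::size_t multi_char_size()  { return sizeof('ab'); }
```

Note that 'forward' and 'backward' in the enum above have more characters than fit in a 4-byte int, so a compiler such as GCC will typically warn that the constant is too long for its type.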
I found that they are widely used in the C4 engine. But I suppose they are not safe when it comes to platform-independent serialization. They can also be confusing because they look like strings. So what is the scope of usage for multicharacter literals? Are they useful for something? Are they in C++ just for compatibility with C code? Are they considered a bad feature, like the goto statement, or not?
It makes it easier to pick out values in a memory dump.
Example:
vs.
a memory dump after the following statement:
might look like:
in the first case, vs:
using multicharacter literals. (Of course, whether it says 'stop' or 'pots' depends on byte ordering.)
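A minimal sketch of this idea, assuming a 4-byte int and a compiler that packs the characters GCC-style. The enum names and the literals 'wait' and 'run.' are illustrative assumptions, and `dump_bytes` is a hypothetical helper that mimics one row of a hex dump.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Ordinary incrementing values: Stopped is simply 2.
enum PlainState { Waiting, Running, Stopped };

// Multicharacter values (implementation-defined; GCC warns with -Wmultichar).
enum TaggedState { TaggedWaiting = 'wait', TaggedRunning = 'run.', TaggedStopped = 'stop' };

// Renders the object bytes of an int like a hex dump row:
// hex bytes first, then the printable characters ('.' for the rest).
std::string dump_bytes(int value) {
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);

    std::string out;
    char buf[4];  // two hex digits, one space, terminating NUL
    for (std::size_t i = 0; i < sizeof value; ++i) {
        std::snprintf(buf, sizeof buf, "%02X ", static_cast<unsigned>(bytes[i]));
        out += buf;
    }
    for (std::size_t i = 0; i < sizeof value; ++i) {
        const bool printable = bytes[i] >= 0x20 && bytes[i] < 0x7F;
        out += printable ? static_cast<char>(bytes[i]) : '.';
    }
    return out;
}
```

Calling `dump_bytes(Stopped)` yields mostly zero bytes and dots, while `dump_bytes(TaggedStopped)` should end in readable text, in forward or reversed order depending on the byte ordering of the machine.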
I don't know how extensively this is used, but "implementation-defined" is a big red flag to me. As far as I know, this could mean that the implementation could choose to ignore your character designations and just assign normal incrementing values if it wanted. It may do something "nicer", but you can't rely on that behavior across compilers (or even compiler versions). At least "goto" has predictable (if undesirable) behavior...
That's my 2c, anyway.
Edit: on "implementation-defined":
From Bjarne Stroustrup's C++ Glossary:
also...
I believe this means the comment is correct: it should at least compile, although anything beyond that is not specified. Note the advice in the definition, also.
Four-character literals, I've seen and used. They map to 4 bytes = one 32-bit word. They are very useful for debugging purposes, as said above. They can be used in a switch/case statement with ints, which is nice.
This (4 chars) is pretty standard (i.e., supported by GCC and VC++ at least), although the results (the actual values compiled) may vary from one implementation to another.
But over 4 chars? I wouldn't use them.
UPDATE: From the C4 page: "For our simple actions, we'll just provide an enumeration of some values, which is done in C4 by specifying four-character constants". So they are using four-character literals, as was my case.
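As a rough illustration of the switch/case use, here is a sketch with four-character constants as case labels. The function name `describe_action` and the tags 'rght', 'fwrd', and 'back' are made up for this example and are not taken from C4.

```cpp
#include <string>

// Maps a four-character action tag to a description.
// The numeric value of each literal is implementation-defined, but within one
// build the same literal always yields the same int, so the labels match.
std::string describe_action(int action) {
    switch (action) {
        case 'left': return "turn left";
        case 'rght': return "turn right";
        case 'fwrd': return "move forward";
        case 'back': return "move backward";
        default:     return "unknown";
    }
}
```

If two labels ever mapped to the same value on some implementation, the compiler would reject the duplicate case, so a collision would at least show up at build time.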
Multicharacter literals allow one to specify int values via the equivalent representation in characters. They are useful for enums, FourCC codes and tags, and non-type template parameters. With a multicharacter literal, a FourCC code can be typed directly into the source, which is handy. The implementation in gcc is described at https://gcc.gnu.org/onlinedocs/cpp/Implementation-defined-behavior.html . Note that the value is truncated to the size of the type int, so 'efgh' == 'abcdefgh' if your ints are 4 chars wide, although gcc will issue a warning on the literal that overflows. Unfortunately, gcc will issue a warning on all multi-character literals if -pedantic is passed, as their behavior is implementation-defined. As you can see above, it is perhaps possible for equality of two multi-character literals to change if you switch implementations.
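The truncation can be illustrated without relying on any particular compiler by emulating the rule the gcc documentation describes: shift the running value left by one character width, then OR in the next character. The sketch below assumes 8-bit chars, a 32-bit int, and an ASCII execution character set; `gcc_style_multichar` is a hypothetical name, not a gcc function.

```cpp
#include <cstdint>

// Emulates the documented gcc rule for multicharacter constants, assuming
// 8-bit chars and a 32-bit int. Bits shifted past 32 are discarded, which is
// why only the last four characters survive.
constexpr std::uint32_t gcc_style_multichar(const char* s) {
    std::uint32_t value = 0;
    for (; *s != '\0'; ++s) {
        value = (value << 8) | static_cast<unsigned char>(*s);
    }
    return value;
}

// Under these assumptions the leading "abcd" is shifted out entirely.
static_assert(gcc_style_multichar("abcdefgh") == gcc_style_multichar("efgh"),
              "only the last four characters contribute");
```

Another implementation is free to combine the characters differently, in which case the two strings above need not produce equal values.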
In C++14 specification draft N4527 section 2.13.3, entry 2:
Previous answers to your question pertained mostly to real machines that did support multicharacter literals. Specifically, on platforms where int is 4 bytes, a four-byte multicharacter literal is fine and can be used for convenience, as per Ferrucio's memory dump example. But, as there is no guarantee that this will ever work or work the same way on other platforms, multicharacter literals should be avoided in portable programs.
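For portable code, one possible alternative (not something the answers above propose) is to build the tag from individual characters, so the packing order is chosen by the program rather than by the implementation. This is a sketch; `make_tag` and the `Action` enum are hypothetical names, and the numeric values still assume an ASCII execution character set.

```cpp
#include <cstdint>

// Packs four characters into a 32-bit value with the first character in the
// most significant byte. The packing is spelled out explicitly, so it does
// not depend on how a compiler treats multicharacter literals.
constexpr std::uint32_t make_tag(char a, char b, char c, char d) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8)  |
            static_cast<std::uint32_t>(static_cast<unsigned char>(d));
}

// The tags remain usable as enumerators, case labels, and template arguments.
enum class Action : std::uint32_t {
    Left  = make_tag('l', 'e', 'f', 't'),
    Stop  = make_tag('s', 't', 'o', 'p')
};
```

The bytes of such a value still appear reversed in a memory dump on a little-endian machine, so for serialization the value should be written out byte by byte in a fixed order.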