Is this undefined C behavior?
Our class was asked this question by the C programming prof:
You are given the code:
int x=1;
printf("%d",++x,x+1);
What output will it always produce?
Most students said undefined behavior. Can anyone help me understand why it is so?
Thanks for the edit and the answers but I'm still confused.
The output is likely to be 2 in every reasonable case. In reality, though, what you have is undefined behavior.
Specifically, the standard (C99 §6.5/2) says:
"Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored."
There is a sequence point before evaluating the arguments to a function, and a sequence point after all the arguments have been evaluated (but the function not yet called). Between those two (i.e., while the arguments are being evaluated) there is no sequence point (unless an argument is an expression that includes one internally, such as one using the &&, || or comma operator).
That means the call to printf is reading the prior value both to determine the value being stored (i.e., the ++x) and to determine the value of the second argument (i.e., the x+1). This clearly violates the requirement quoted above, resulting in undefined behavior.
The fact that you've provided an extra argument for which no conversion specifier is given does not result in undefined behavior. If you supply fewer arguments than conversion specifiers, or if the (promoted) type of an argument disagrees with that of its conversion specifier, you get undefined behavior, but passing an extra argument does not.
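One way to stay within the rule, sketched here under the assumption that the intent is simply to print the incremented value, is to put the increment in its own statement so that a sequence point separates the write from the later reads. The function name print_after_increment is hypothetical and not part of the question.

```c
#include <stdio.h>

/* Hypothetical rewrite of: printf("%d", ++x, x+1);
   Each access to x sits in its own full expression, so a sequence
   point separates the write from the later reads. */
int print_after_increment(int *x)
{
    ++*x;                 /* modify x; the ';' ends the full expression */
    int second = *x + 1;  /* read x only after the increment is complete */
    /* The extra argument is evaluated and then ignored by printf. */
    return printf("%d", *x, second);
}
```

Called with x equal to 1, this leaves x at 2 and passes 2 for the %d; the return value is whatever printf reports.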
Any time the behavior of a program is undefined, anything can happen (the classical phrase is that "demons may fly out of your nose"), although most implementations don't go that far.
The arguments of a function are conceptually evaluated in parallel (the technical term is that there is no sequence point between their evaluations). That means the expressions ++x and x+1 may be evaluated in this order, in the opposite order, or in some interleaved way. When you modify a variable and try to access its value in parallel, the behavior is undefined.
With many implementations, the arguments are evaluated in sequence (though not always from left to right). So you're unlikely to see anything but 2 in the real world.
However, a compiler could generate code like this:
1. Load x into r1.
2. Calculate x+1 by adding 1 to r1.
3. Calculate ++x by adding 1 to r1. That's OK because x has been loaded into r1. Given how the compiler was designed, step 2 cannot have modified r1, because that could only happen if x was read as well as written between two sequence points, which is forbidden by the C standard.
4. Store r1 into x.
And on this (hypothetical, but correct) compiler, the program would print 3.
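The register steps above can be sketched as a small simulation in ordinary, well-defined C. This is only an illustration of the hypothetical code generation, not the output of any real compiler; the function simulate_call and its parameters are made-up names.

```c
#include <stdio.h>

/* Simulates hypothetical generated code for: printf("%d", ++x, x+1);
   The imagined compiler assumes x is never both read and written
   between two sequence points, so it reuses a single register. */
int simulate_call(int *x, int *second_arg)
{
    int r1 = *x;        /* step 1: load x into r1 */
    r1 += 1;            /* step 2: "x+1", computed in place in r1 */
    *second_arg = r1;   /* value passed as the ignored second argument */
    r1 += 1;            /* step 3: "++x", assuming r1 still holds x */
    *x = r1;            /* step 4: store r1 back into x */
    return r1;          /* value passed for the %d specifier */
}
```

Starting from x equal to 1, the value handed to %d is 3, which matches the outcome described for this imagined compiler.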
(EDIT: passing an extra argument to printf is correct (§7.19.6.1-2 in N1256); thanks to Prasoon Saurav for pointing this out. Also: added an example.)
。未定义的行为不限于任何事物。它可以格式化您的硬盘驱动器而不是打印某些内容。感受差异。The correct answer is: the code produces undefined behavior.
The reason the behavior is undefined is that the two expressions ++x and x + 1 modify x and read x for a reason unrelated to the modification, and these two actions are not separated by a sequence point. This results in undefined behavior in C (and C++). The requirement is given in 6.5/2 of the C language standard.
Note that the undefined behavior in this case has absolutely nothing to do with the fact that the printf function is given only one format specifier and two actual arguments. Giving printf more arguments than there are format specifiers in the format string is perfectly legal in C. Again, the problem is rooted in the violation of the expression evaluation requirements of the C language.
Also note that some participants in this discussion fail to grasp the concept of undefined behavior and insist on mixing it with the concept of unspecified behavior. To better illustrate the difference, let's consider the following simple example:
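A minimal sketch of such an example, assuming two hypothetical helper functions inc_x and x_plus_1 that wrap the operations on x (the names and the exact signatures are illustrative assumptions, and the printf call is placed in a wrapper function so it can be called as a unit):

```c
#include <stdio.h>

/* Hypothetical helpers: every access to x happens inside a function,
   so sequence points surround each access. */
int inc_x(int *x)          { return ++*x; }
int x_plus_1(const int *x) { return *x + 1; }

int demo_one_specifier(void)
{
    int x = 1;
    /* The two calls may run in either order, but they do not interleave. */
    return printf("%d", inc_x(&x), x_plus_1(&x));
}
```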
The above code is "equivalent" to the original one, except that the operations that involve our x are wrapped into functions. What is going to happen in this latest example?
There's no undefined behavior in this code. But since the order of evaluation of printf arguments is unspecified, this code produces unspecified behavior, i.e. it is possible that printf will be called as printf("%d", 2, 2) or as printf("%d", 2, 3). In both cases the output will indeed be 2. However, the important difference of this variant is that all accesses to x are wrapped into sequence points present at the beginning and at the end of each function, so this variant does not produce undefined behavior.
This is exactly the reasoning some other posters are trying to force onto the original example. But it cannot be done. The original example produces undefined behavior, which is a completely different beast. They are apparently trying to insist that in practice undefined behavior is always equivalent to unspecified behavior. This is a totally bogus claim that only indicates a lack of expertise in those who make it. The original code produces undefined behavior, period.
To continue with the example, let's modify the previous code sample so that printf has two format specifiers, one for each argument.
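A minimal sketch of that modification, assuming the same hypothetical helpers inc_x and x_plus_1 and a "%d %d" format string (inferred from the two outputs discussed in this answer). It formats into a caller-supplied buffer with snprintf instead of printing, only so that the result can be inspected:

```c
#include <stdio.h>

/* Hypothetical helpers, as in the single-specifier variant. */
int inc_x(int *x)          { return ++*x; }
int x_plus_1(const int *x) { return *x + 1; }

/* Writes both values into buf; returns what snprintf returns. */
int format_both(char *buf, size_t n)
{
    int x = 1;
    /* Unspecified order: the second value is 2 or 3 depending on
       which call the implementation evaluates first. */
    return snprintf(buf, n, "%d %d", inc_x(&x), x_plus_1(&x));
}
```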
The output of the code then becomes generally unpredictable. It can print 2 2 or it can print 2 3. Note, however, that even though the behavior is unpredictable, it still is not undefined behavior. The behavior is unspecified, but not undefined. Unspecified behavior is restricted to two possibilities: either 2 2 or 2 3. Undefined behavior is not restricted to anything. It can format your hard drive instead of printing something. Feel the difference.
Because the order in which function arguments are evaluated is not specified.
It will produce 2 in all environments I can think of. A strict interpretation of the C99 standard, however, renders the behaviour undefined, because the accesses to x do not meet the requirements that exist between sequence points.
I will now address the second question, which I understand as "Why do most of the students in my class say that the shown code constitutes undefined behaviour?" and which I think no other poster has answered so far. One part of the students will have remembered examples of expressions with undefined value, like
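A classic instance of such an expression, offered here as an assumption rather than as the snippet the poster had in mind, is i = i++. The sketch below keeps the undefined forms in a comment only and shows a well-defined equivalent; store_then_advance is a hypothetical name.

```c
/* Expressions of this kind are undefined in C (kept as comments only):
 *     i = i++;       i is modified twice without a sequence point
 *     a[i] = i++;    i is read for a purpose other than computing its new value
 */

/* Well-defined equivalent of the second form: use the old index, then advance. */
int store_then_advance(int *a, int *i)
{
    a[*i] = *i;   /* store using the current index */
    ++*i;         /* increment in a separate full expression */
    return *i;
}
```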
The code you give fits this pattern, but the students erroneously think that the behaviour is defined anyway because printf ignores the last parameter. This nuance confuses many students. Another part of the students will be as well versed in the standard as David Thornley and say "undefined behaviour" for the correct reasons explained above.
The points made about undefined behavior are correct, but there is one additional wrinkle: printf may fail. It's doing file IO; there are any number of reasons it could fail, and it's impossible to eliminate them without knowing the complete program and the context in which it will be executed.
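As a rough sketch of what taking that into account might look like, the return values of printf and fflush can be checked. The helper name print_int_checked and its 0 / -1 convention are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: returns 0 on success, -1 if the output failed. */
int print_int_checked(int value)
{
    if (printf("%d", value) < 0)   /* printf returns a negative value on error */
        return -1;
    if (fflush(stdout) == EOF)     /* buffered write errors may surface only here */
        return -1;
    return 0;
}
```

Whether such a check is worthwhile depends on the program; it only reports the failure and does nothing to prevent it.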
Echoing codaddict, the answer is 2.
printf will be called with the argument 2 and will print it.
If this code is put in a context like:
Then the behaviour of that function is completely and unambiguously defined. I'm not, of course, arguing that this is good or correct, or that the value of x is determinable afterwards.
The output will always be 2 (for 99.98% of the most important standard-compliant compilers and systems).
According to the standard, this seems to be, by definition, "undefined behaviour", a definition/answer that is self-justifying and that says nothing about what can actually happen, and especially why.
The utility splint (which is not a standard compliance checking tool), and so splint's programmers, consider this "unspecified behaviour". This means, basically, that the evaluation of (x+1) can give 1+1 or 2+1, depending on when the update of x is actually done. Since, however, the expression is discarded (the printf format reads one argument), the output is unaffected, and we can still say it is 2.
As said before, the unspecified behaviour affects just the evaluation of (x+1), not the whole statement or its other expressions. So in the case of "unspecified behaviour" we can say that the output is 2, and nobody could object.
But this is not unspecified behaviour; it seems to be "undefined behaviour". And the "undefined behaviour" seems to have to be something that affects the whole statement instead of a single expression. This is due to the mystery around where the "undefined behaviour" actually occurs (i.e. what exactly it affects).
If there were reasons to attach the "undefined behaviour" just to the (x+1) expression, as in the "unspecified behaviour" case, then we could still say that the output is always (100%) 2. Attaching the "undefined behaviour" just to (x+1) means that we are not able to say whether it is 1+1 or 2+1; it is just "anything". But again, that "anything" is dropped because of the printf, and this means that the answer would be "always (100%) 2".
Instead, because of mysterious asymmetries, the "undefined behaviour" can't be attached just to the x+1; it must affect at least the ++x (which, by the way, is what is responsible for the undefined behaviour), if not the whole statement. If it infects just the ++x expression, the output is an "undefined value", i.e. any integer, e.g. -5847834 or 9032. If it infects the whole statement, then you could see garbage in your console output, and you might have to stop the program with Ctrl-C, possibly before it starts to choke your CPU.
According to an urban legend, the "undefined behaviour" infects not only the whole program, but also your computer and the laws of physics, so that mysterious creatures can be created by your program and fly away or eat you.
No answers explain anything competently about the topic. They are just "oh, see, the standard says this" (and it is just an interpretation, as usual!). So at least you have learned that "standards exist", and that they make educational questions arid (since, of course, don't forget that your code is wrong, regardless of undefined/unspecified behaviourism and other standard facts), logical arguments useless, and deep investigation and understanding aimless.