Why are these constructs using pre- and post-increment undefined behavior?

Posted 2025-02-11 01:57:31 | 633 words | 3 views | 0 comments | Original

#include <stdio.h>

int main(void)
{
   int i = 0;
   i = i++ + ++i;
   printf("%d\n", i); // 3

   i = 1;
   i = (i++);
   printf("%d\n", i); // 2 Should be 1, no ?

   volatile int u = 0;
   u = u++ + ++u;
   printf("%d\n", u); // 1

   u = 1;
   u = (u++);
   printf("%d\n", u); // 2 Should also be one, no ?

   register int v = 0;
   v = v++ + ++v;
   printf("%d\n", v); // 3 (Should be the same as u ?)

   int w = 0;
   printf("%d %d\n", ++w, w); // shouldn't this print 1 1

   int x[2] = { 5, 8 }, y = 0;
   x[y] = y ++;
   printf("%d %d\n", x[0], x[1]); // shouldn't this print 0 8? or 5 0?
}


Comments (15)

潇烟暮雨 2025-02-18 01:57:31


C has the concept of undefined behavior, i.e. some language constructs are syntactically valid but you can't predict the behavior when the code is run.

As far as I know, the standard doesn't explicitly say why the concept of undefined behavior exists. In my mind, it's simply because the language designers wanted there to be some leeway in the semantics. Instead of, for example, requiring that all implementations handle signed integer overflow in exactly the same way, which would very likely impose serious performance costs, they left the behavior undefined, so that if you write code that causes signed integer overflow, anything can happen.

So, with that in mind, why are these "issues"? The language clearly says that certain things lead to undefined behavior. There is no problem, there is no "should" involved. If the undefined behavior changes when one of the involved variables is declared volatile, that doesn't prove or change anything. It is undefined; you cannot reason about the behavior.

Your most interesting-looking example, the one with

u = (u++);

is a text-book example of undefined behavior (see Wikipedia's entry on sequence points).

送君千里 2025-02-18 01:57:31


Most of the answers here quote the C standard to emphasize that the behaviour of these constructs is undefined. To understand why the behaviour of these constructs is undefined, let's first look at these terms in the light of the C11 standard:

Sequenced: (5.1.2.3)

Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B.

Unsequenced:

If A is not sequenced before or after B, then A and B are unsequenced.

Evaluations can be one of two things:

  • value computations, which work out the result of an expression; and
  • side effects, which are modifications of objects.

Sequence Point:

The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.

Now coming to the question, for the expressions like

int i = 1;
i = i++;

standard says that:

6.5 Expressions:

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behaviour is undefined. [...]

Therefore, the above expression invokes UB because two side effects on the same object i are unsequenced relative to each other. That is, nothing determines whether the side effect of the assignment to i happens before or after the side effect of ++.
Depending on whether the assignment occurs before or after the increment, different results will be produced, and that is one of the cases of undefined behaviour.

Let's rename the i on the left of the assignment to il and the i on the right of the assignment (in the expression i++) to ir; the expression then becomes

il = ir++     // Note that suffix l and r are used for the sake of clarity.
              // Both il and ir represent the same object.

An important point regarding Postfix ++ operator is that:

just because the ++ comes after the variable does not mean that the increment happens late. The increment can happen as early as the compiler likes as long as the compiler ensures that the original value is used.

It means the expression il = ir++ could be evaluated either as

temp = ir;      // i = 1
ir = ir + 1;    // i = 2   side effect by ++ before assignment
il = temp;      // i = 1   result is 1  

or

temp = ir;      // i = 1
il = temp;      // i = 1   side effect by assignment before ++
ir = ir + 1;    // i = 2   result is 2  

This gives two different results, 1 and 2, depending on the order of the side effects of the assignment and of ++, which is why the expression invokes undefined behaviour.

无可置疑 2025-02-18 01:57:31


I think the relevant parts of the C99 standard are 6.5 Expressions, §2:

Between the previous and next sequence point an object shall have its stored value
modified at most once by the evaluation of an expression. Furthermore, the prior value
shall be read only to determine the value to be stored.

and 6.5.16 Assignment operators, §4:

The order of evaluation of the operands is unspecified. If an attempt is made to modify
the result of an assignment operator or to access it after the next sequence point, the
behavior is undefined.

银河中√捞星星 2025-02-18 01:57:31


Just compile and disassemble your line of code, if you are so inclined to know exactly how you get the result you are getting.

This is what I get on my machine, together with what I think is going on:

$ cat evil.c
void evil(){
  int i = 0;
  i+= i++ + ++i;
}
$ gcc evil.c -c -o evil.bin
$ gdb evil.bin
(gdb) disassemble evil
Dump of assembler code for function evil:
   0x00000000 <+0>:   push   %ebp
   0x00000001 <+1>:   mov    %esp,%ebp
   0x00000003 <+3>:   sub    $0x10,%esp
   0x00000006 <+6>:   movl   $0x0,-0x4(%ebp)  // i = 0   i = 0
   0x0000000d <+13>:  addl   $0x1,-0x4(%ebp)  // i++     i = 1
   0x00000011 <+17>:  mov    -0x4(%ebp),%eax  // j = i   i = 1  j = 1
   0x00000014 <+20>:  add    %eax,%eax        // j += j  i = 1  j = 2
   0x00000016 <+22>:  add    %eax,-0x4(%ebp)  // i += j  i = 3
   0x00000019 <+25>:  addl   $0x1,-0x4(%ebp)  // i++     i = 4
   0x0000001d <+29>:  leave  
   0x0000001e <+30>:  ret
End of assembler dump.

(I... suppose that the 0x00000014 instruction was some kind of compiler optimization?)

橪书 2025-02-18 01:57:31


The behavior can't really be explained because it invokes both unspecified behavior and undefined behavior, so we cannot make any general predictions about this code. If you read Olve Maudal's work, such as Deep C and Unspecified and Undefined, you can sometimes make good guesses in very specific cases with a specific compiler and environment, but please don't do that anywhere near production.

So, moving on to unspecified behavior, the draft C99 standard, section 6.5 paragraph 3, says (emphasis mine):

The grouping of operators and operands is indicated by the syntax.74) Except as specified
later (for the function-call (), &&, ||, ?:, and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.

So when we have a line like this:

i = i++ + ++i;

we do not know whether i++ or ++i will be evaluated first. This is mainly to give the compiler better options for optimization.

We also have undefined behavior here, since the program is modifying variables (i, u, etc.) more than once between sequence points. From draft standard section 6.5 paragraph 2 (emphasis mine):

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

It cites the following code examples as being undefined:

i = ++i + 1;
a[i++] = i; 

In all these examples the code is attempting to modify an object more than once between the same pair of sequence points; the later of the two is at the ; in each one of these cases:

i = i++ + ++i;
^   ^       ^

i = (i++);
^    ^

u = u++ + ++u;
^   ^       ^

u = (u++);
^    ^

v = v++ + ++v;
^   ^       ^

Unspecified behavior is defined in the draft c99 standard in section 3.4.4 as:

use of an unspecified value, or other behavior where this International Standard provides
two or more possibilities and imposes no further requirements on which is chosen in any
instance

and undefined behavior is defined in section 3.4.3 as:

behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements

and notes that:

Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

轮廓§ 2025-02-18 01:57:31


Another way of answering this, rather than getting bogged down in arcane details of sequence points and undefined behavior, is simply to ask, what are they supposed to mean? What was the programmer trying to do?

The first fragment asked about, i = i++ + ++i, is pretty clearly insane in my book. No one would ever write it in a real program, it's not obvious what it does, there's no conceivable algorithm someone could have been trying to code that would have resulted in this particular contrived sequence of operations. And since it's not obvious to you and me what it's supposed to do, it's fine in my book if the compiler can't figure out what it's supposed to do, either.

The second fragment, i = i++, is a little easier to understand. It looks like someone is trying to increment i, and assign the result back to i. But there are a couple ways of doing this in C. The most basic way to take i's value, add 1, and assign the result back to i, is the same in almost any programming language:

i = i + 1

C, of course, has a handy shortcut:

i++

This also means, "take i's value, add 1, and assign the result back to i". So if we construct a hodgepodge of the two, by writing

i = i++

what we're really saying is "take i's value, add 1, assign the result back to i, and assign the result back to i". We're confused, so it doesn't bother me too much if the compiler gets confused, too.

Realistically, the only time these crazy expressions get written is when people are using them as artificial examples of how ++ is supposed to work. And of course it is important to understand how ++ works. But one practical rule for using ++ is, "If it's not obvious what an expression using ++ means, don't write it."

We used to spend countless hours on comp.lang.c discussing expressions like these and why they're undefined. Two of my longer answers, that try to really explain why, are archived on the web:

See also question 3.8 and the rest of the questions in section 3 of the C FAQ list.

蓝海 2025-02-18 01:57:31


Often this question is linked as a duplicate of questions related to code like

printf("%d %d\n", i, i++);

or

printf("%d %d\n", ++i, i++);

or similar variants.

While this is also undefined behaviour as stated already, there are subtle differences when printf() is involved when comparing to a statement such as:

x = i++ + i++;

In the following statement:

printf("%d %d\n", ++i, i++);

the order of evaluation of arguments in printf() is unspecified. That means, expressions i++ and ++i could be evaluated in any order. C11 standard has some relevant descriptions on this:

Annex J, unspecified behaviours

The order in which the function designator, arguments, and
subexpressions within the arguments are evaluated in a function call
(6.5.2.2).

3.4.4, unspecified behavior

Use of an unspecified value, or other behavior where this
International Standard provides two or more possibilities and imposes
no further requirements on which is chosen in any instance.

EXAMPLE An example of unspecified behavior is the order in which the
arguments to a function are evaluated.

The unspecified behaviour itself is NOT an issue. Consider this example:

printf("%d %d\n", ++x, y++);

This too has unspecified behaviour because the order of evaluation of ++x and y++ is unspecified. But it's a perfectly legal and valid statement. There's no undefined behaviour in it, because the modifications (++x and y++) are done to distinct objects.

What renders the following statement

printf("%d %d\n", ++i, i++);

as undefined behaviour is the fact that these two expressions modify the same object i without an intervening sequence point.


Another detail is that the comma involved in the printf() call is a separator, not the comma operator.

This is an important distinction because the comma operator does introduce a sequence point between the evaluations of its operands, which makes the following legal:

int i = 5;
int j;

j = (++i, i++);  // No undefined behaviour here because the comma operator 
                 // introduces a sequence point between '++i' and 'i++'

printf("i=%d j=%d\n",i, j); // prints: i=7 j=6

The comma operator evaluates its operands left-to-right and yields only the value of the last operand. So in j = (++i, i++);, ++i increments i to 6 and i++ yields the old value of i (6), which is assigned to j. Then i becomes 7 due to the post-increment.

So if the comma in the function call were a comma operator, then

printf("%d %d\n", ++i, i++);

would not be a problem. But it invokes undefined behaviour because the comma here is a separator.


Those who are new to undefined behaviour would benefit from reading What Every C Programmer Should Know About Undefined Behavior to understand the concept and many other variants of undefined behaviour in C.

This post: Undefined, unspecified and implementation-defined behavior is also relevant.

薯片软お妹 2025-02-18 01:57:31


While it is unlikely that any compilers and processors would actually do so, it would be legal, under the C standard, for the compiler to implement "i++" with the sequence:

In a single operation, read `i` and lock it to prevent access until further notice
Compute (1+read_value)
In a single operation, unlock `i` and store the computed value

While I don't think any processors support the hardware to allow such a thing to be done efficiently, one can easily imagine situations where such behavior would make multi-threaded code easier (e.g. it would guarantee that if two threads try to perform the above sequence simultaneously, i would get incremented by two) and it's not totally inconceivable that some future processor might provide a feature something like that.

If the compiler were to write i++ as indicated above (legal under the standard) and were to intersperse the above instructions throughout the evaluation of the overall expression (also legal), and if it didn't happen to notice that one of the other instructions happened to access i, it would be possible (and legal) for the compiler to generate a sequence of instructions that would deadlock. To be sure, a compiler would almost certainly detect the problem in the case where the same variable i is used in both places, but if a routine accepts references to two pointers p and q, and uses (*p) and (*q) in the above expression (rather than using i twice) the compiler would not be required to recognize or avoid the deadlock that would occur if the same object's address were passed for both p and q.

无风消散 2025-02-18 01:57:31


While the syntax of expressions like a = a++ or a++ + a++ is legal, the behaviour of these constructs is undefined because a shall in the C standard is not obeyed. C99 6.5p2:

  1. Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. [72] Furthermore, the prior value shall be read only to determine the value to be stored [73]

With footnote 73 further clarifying that

  1. This paragraph renders undefined statement expressions such as

    i = ++i + 1;
    a[i++] = i;
    

    while allowing

    i = i + 1;
    a[i] = i;
    

The various sequence points are listed in Annex C of C11 (and C99):

  1. The following are the sequence points described in 5.1.2.3:

    • Between the evaluations of the function designator and actual arguments in a function call and the actual call. (6.5.2.2).
    • Between the evaluations of the first and second operands of the following operators: logical AND && (6.5.13); logical OR || (6.5.14); comma , (6.5.17).
    • Between the evaluations of the first operand of the conditional ? : operator and whichever of the second and third operands is evaluated (6.5.15).
    • The end of a full declarator: declarators (6.7.6);
    • Between the evaluation of a full expression and the next full expression to be evaluated. The following are full expressions: an initializer that is not part of a compound literal (6.7.9); the expression in an expression statement (6.8.3); the controlling expression of a selection statement (if or switch) (6.8.4); the controlling expression of a while or do statement (6.8.5); each of the (optional) expressions of a for statement (6.8.5.3); the (optional) expression in a return statement (6.8.6.4).
    • Immediately before a library function returns (7.1.4).
    • After the actions associated with each formatted input/output function conversion specifier (7.21.6, 7.29.2).
    • Immediately before and immediately after each call to a comparison function, and also between any call to a comparison function and any movement of the objects passed as arguments to that call (7.22.5).

The wording of the same paragraph in C11 is:

  1. If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.84)

You can detect such errors in a program by for example using a recent version of GCC with -Wall and -Werror, and then GCC will outright refuse to compile your program. The following is the output of gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005:

% gcc plusplus.c -Wall -Werror -pedantic
plusplus.c: In function ‘main’:
plusplus.c:6:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
    i = i++ + ++i;
    ~~^~~~~~~~~~~
plusplus.c:6:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
plusplus.c:10:6: error: operation on ‘i’ may be undefined [-Werror=sequence-point]
    i = (i++);
    ~~^~~~~~~
plusplus.c:14:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
    u = u++ + ++u;
    ~~^~~~~~~~~~~
plusplus.c:14:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
plusplus.c:18:6: error: operation on ‘u’ may be undefined [-Werror=sequence-point]
    u = (u++);
    ~~^~~~~~~
plusplus.c:22:6: error: operation on ‘v’ may be undefined [-Werror=sequence-point]
    v = v++ + ++v;
    ~~^~~~~~~~~~~
plusplus.c:22:6: error: operation on ‘v’ may be undefined [-Werror=sequence-point]
cc1: all warnings being treated as errors

The important part is to know what a sequence point is, and what is and isn't one. For example, the comma operator is a sequence point, so

j = (i ++, ++ i);

is well-defined: it increments i by one, yielding the old value, and discards that value; then, at the comma operator, the side effects are settled; then it increments i by one again, and the resulting value becomes the value of the expression. In other words, this is just a contrived way to write j = (i += 2), which is yet again a "clever" way to write

i += 2;
j = i;

However, the , in function argument lists is not a comma operator, and there is no sequence point between evaluations of distinct arguments; instead their evaluations are unsequenced with regard to each other; so the function call

int i = 0;
printf("%d %d\n", i++, ++i, i);

has undefined behaviour because there is no sequence point between the evaluations of i++ and ++i in function arguments, and the value of i is therefore modified twice, by both i++ and ++i, between the previous and the next sequence point.

月野兔 2025-02-18 01:57:31

Your question was probably not, "Why are these constructs undefined behavior in C?". Your question was probably, "Why did this code (using ++) not give me the value I expected?", and someone marked your question as a duplicate, and sent you here.

This answer tries to answer that question: why did your code not give you the answer you expected, and how can you learn to recognize (and avoid) expressions that will not work as expected.

I assume you've heard the basic definition of C's ++ and -- operators by now, and how the prefix form ++x differs from the postfix form x++. But these operators are hard to think about, so to make sure you understood, perhaps you wrote a tiny little test program involving something like

int x = 5;
printf("%d %d %d\n", x, ++x, x++);

But, to your surprise, this program did not help you understand: it printed some strange, inexplicable output, suggesting that maybe ++ does something completely different, not at all what you thought it did.

Or, perhaps you're looking at a hard-to-understand expression like

int x = 5;
x = x++ + ++x;
printf("%d\n", x);

Perhaps someone gave you that code as a puzzle. This code also makes no sense, especially if you run it, and if you compile and run it under two different compilers, you're likely to get two different answers! What's up with that? Which answer is correct? (And the answer is that both of them are, or neither of them are.)

As you've heard by now, these expressions are undefined, which means that the C language makes no guarantee about what they'll do. This is a strange and unsettling result, because you probably thought that any program you could write, as long as it compiled and ran, would generate a unique, well-defined output. But in the case of undefined behavior, that's not so.

What makes an expression undefined? Are expressions involving ++ and -- always undefined? Of course not: these are useful operators, and if you use them properly, they're perfectly well-defined.

For the expressions we're talking about, what makes them undefined is when there's too much going on at once, when we can't tell what order things will happen in, but when the order matters to the result we'll get.

Let's go back to the two examples I've used in this answer. When I wrote

printf("%d %d %d\n", x, ++x, x++);

the question is, before actually calling printf, does the compiler compute the value of x first, or x++, or maybe ++x? But it turns out we don't know. There's no rule in C which says that the arguments to a function get evaluated left-to-right, or right-to-left, or in some other order. So we can't say whether the compiler will do x first, then ++x, then x++, or x++ then ++x then x, or some other order. But the order clearly matters, because depending on which order the compiler uses, we'll clearly get a different series of numbers printed out.

What about that crazy expression?

x = x++ + ++x;

The problem with this expression is that it contains three different attempts to modify the value of x: (1) the x++ part tries to take x's value, add 1, store the new value in x, and return the old value; (2) the ++x part tries to take x's value, add 1, store the new value in x, and return the new value; and (3) the x = part tries to assign the sum of the other two back to x. Which of those three attempted assignments will "win"? Which of the three values will actually determine the final value of x? Again, and perhaps surprisingly, there's no rule in C to tell us.

You might imagine that precedence or associativity or left-to-right evaluation tells you what order things happen in, but they do not. You may not believe me, but please take my word for it, and I'll say it again: precedence and associativity do not determine every aspect of the evaluation order of an expression. In particular, if within one expression there are several different spots where we try to assign a new value to something like x, precedence and associativity do not tell us which of those attempts happens first, or last, or anything.

So if you want to make sure that all your programs are well-defined, which expressions can you write, and which ones can you not write?

These expressions are all fine:

y = x++;
z = x++ + y++;
x = x + 1;
x = a[i++];
x = a[i++] + b[j++];
x[i++] = a[j++] + b[k++];
x = *p++;
x = *p++ + *q++;

These expressions are all undefined:

x = x++;
x = x++ + ++x;
y = x + x++;
a[i] = i++;
a[i++] = i;
printf("%d %d %d\n", x, ++x, x++);

And the last question is, how can you tell which expressions are well-defined, and which expressions are undefined?

As I said earlier, the undefined expressions are the ones where there's too much going on at once, where you can't be sure what order things happen in, and where the order matters:

  1. If there's one variable that's getting modified (assigned to) in two or more different places, how do you know which modification happens first?
  2. If there's a variable that's getting modified in one place, and having its value used in another place, how do you know whether it uses the old value or the new value?

As an example of #1, in the expression

x = x++ + ++x;

there are three attempts to modify x.

As an example of #2, in the expression

y = x + x++;

we both use the value of x, and modify it.

So that's the answer: make sure that in any expression you write, each variable is modified at most once, and if a variable is modified, you don't also attempt to use the value of that variable somewhere else (other than to compute the new value, as in x = x + 1).

One more thing. You might be wondering how to "fix" the undefined expressions I started this answer by presenting.

In the case of printf("%d %d %d\n", x, ++x, x++);, it's easy: just write it as three separate printf calls:

printf("%d ", x);
printf("%d ", ++x);
printf("%d\n", x++);

Now the behavior is perfectly well defined, and you'll get sensible results.

In the case of x = x++ + ++x, on the other hand, there's no way to fix it. There's no way to write it so that it has guaranteed behavior matching your expectations, but that's okay, because you would never write an expression like x = x++ + ++x in a real program anyway.

What about this crazy expression?

x = x++ + ++x;

The problem with this expression is that it contains three different attempts to modify the value of x: (1) the x++ part tries to take x's value, add 1, store the new value in x, and return the old value; (2) the ++x part tries to take x's value, add 1, store the new value in x, and return the new value; and (3) the x = part tries to assign the sum of the other two back to x. Which of those three attempted assignments will "win"? Which of the three values will actually determine the final value of x? Again, and perhaps surprisingly, there's no rule in C to tell us.

You might imagine that precedence or associativity or left-to-right evaluation tells you what order things happen in, but they do not. You may not believe me, but please take my word for it, and I'll say it again: precedence and associativity do not determine every aspect of the evaluation order of an expression in C. In particular, if within one expression there are multiple different spots where we try to assign a new value to something like x, precedence and associativity do not tell us which of those attempts happens first, or last, or anything.
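A minimal sketch of that point using well-defined code. The helper `traced()` and the names `call_log`, `precedence_demo`, and `print_call_order` are made up for illustration. Each operand is a function call that records when it ran, so nothing here is undefined, yet the order of the three calls is still left to the compiler:

```c
#include <stdio.h>

static int call_log[3];   /* ids of the operands, in the order they ran */
static int call_count = 0;

/* Hypothetical helper: records that operand `id` was evaluated,
   then yields `value`. */
static int traced(int id, int value)
{
    if (call_count < 3)
        call_log[call_count] = id;
    call_count++;
    return value;
}

/* Precedence groups this as 2 + (3 * 4), so the result is always 14.
   It does not say whether traced(1, 2) runs before or after the two
   operands of the multiplication. */
int precedence_demo(void)
{
    call_count = 0;
    return traced(1, 2) + traced(2, 3) * traced(3, 4);
}

/* Prints whatever order this particular compiler happened to pick. */
void print_call_order(void)
{
    int n;
    for (n = 0; n < call_count && n < 3; n++)
        printf("%d ", call_log[n]);
    printf("\n");
}
```

Calling `precedence_demo()` and then `print_call_order()` shows the order your compiler chose. The sum is the same every time, but the printed order is allowed to differ between compilers.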


So with all that background and introduction out of the way, if you want to make sure that all your programs are well-defined, which expressions can you write, and which ones can you not write?

These expressions are all fine:

y = x++;
z = x++ + y++;
x = x + 1;
x = a[i++];
x = a[i++] + b[j++];
x[i++] = a[j++] + b[k++];
x = *p++;
x = *p++ + *q++;

These expressions are all undefined:

x = x++;
x = x++ + ++x;
y = x + x++;
a[i] = i++;
a[i++] = i;
printf("%d %d %d\n", x, ++x, x++);

And the last question is, how can you tell which expressions are well-defined, and which expressions are undefined?

As I said earlier, the undefined expressions are the ones where there's too much going on at once, where you can't be sure what order things happen in, and where the order matters:

  1. If there's one variable that's getting modified (assigned to) in two or more different places, how do you know which modification happens first?
  2. If there's a variable that's getting modified in one place, and having its value used in another place, how do you know whether it uses the old value or the new value?

As an example of #1, in the expression

x = x++ + ++x;

there are three attempts to modify x.

As an example of #2, in the expression

y = x + x++;

we both use the value of x, and modify it.

So that's the answer: make sure that in any expression you write, each variable is modified at most once, and if a variable is modified, you don't also attempt to use the value of that variable somewhere else.
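As a rough sketch of applying that rule, here are two of the undefined expressions from the list split into separate statements. The function names are hypothetical, and the meaning chosen for each original expression is only one guess at what its author wanted:

```c
/* One possible intended meaning of  a[i] = i++;
   store the old index value at the old index, then advance the index. */
void store_then_advance(int a[], int *i)
{
    a[*i] = *i;   /* reads *i, modifies only a[*i] */
    *i += 1;      /* the single modification of *i, in its own statement */
}

/* One possible intended meaning of  y = x + x++;
   add the old value of x to itself, then increment x. */
int double_then_increment(int *x)
{
    int y = *x + *x;  /* only reads *x */
    *x += 1;          /* the single modification of *x */
    return y;
}
```

In each function, every statement modifies at most one object and does not also read that object elsewhere in the same statement for an unrelated purpose.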


One more thing. You might be wondering how to "fix" the undefined expressions I started this answer by presenting.

In the case of printf("%d %d %d\n", x, ++x, x++);, it's easy — just write it as three separate printf calls:

printf("%d ", x);
printf("%d ", ++x);
printf("%d\n", x++);

Now the behavior is perfectly well defined, and you'll get sensible results.

In the case of x = x++ + ++x, on the other hand, there's no way to fix it. There's no way to write it so that it has guaranteed behavior matching your expectations — but that's okay, because you would never write an expression like x = x++ + ++x in a real program anyway.

木緿 2025-02-18 01:57:31

The C standard says that a variable should be assigned at most once between two sequence points. The semicolon that ends an expression statement, for instance, marks a sequence point.
So every statement of the form:

i = i++;
i = i++ + ++i;

and so on violates that rule. The standard also says that the behavior is undefined, not unspecified. Some compilers do detect these constructs and produce some result, but the standard does not guarantee it.

However, two different variables can be incremented between two sequence points.

while (*dst++ = *src++);

The above is a common coding practice while copying/analysing strings.
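A minimal sketch of that copy loop as a complete function, assuming the pointer named `src` is the source and `dst` is the destination. `copy_string` is a hypothetical name, and the caller is assumed to supply a large enough buffer:

```c
/* Copies the string at src, including its terminating '\0', into dst.
   dst must point to a buffer large enough to hold it. */
char *copy_string(char *dst, const char *src)
{
    char *start = dst;

    /* Each pass modifies src once and dst once, two different objects,
       so the rule is not violated. The loop stops after the '\0' has
       been copied, because the value just assigned is then 0. */
    while ((*dst++ = *src++) != '\0')
        ;

    return start;
}
```

The explicit comparison with `'\0'` does the same thing as the bare assignment in the loop condition and avoids the usual compiler warning about an assignment used as a truth value.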

夏雨凉 2025-02-18 01:57:31

In https://stackoverflow.com/questions/29505280/incrementing-array-index-in-c someone asked about a statement like:

int k[] = {0,1,2,3,4,5,6,7,8,9,10};
int i = 0;
int num;
num = k[++i+k[++i]] + k[++i];
printf("%d", num);

which prints 7... the OP expected it to print 6.

The ++i increments aren't guaranteed to all complete before the rest of the calculations. In fact, different compilers will get different results here. In the example you provided, the first two ++i were executed, then the values of k[] were read, then the last ++i was executed, then k[] was read.

num = k[i+1]+k[i+2] + k[i+3];
i += 3;

Modern compilers will optimize this very well. In fact, possibly better than the code you originally wrote (assuming it had worked the way you had hoped).
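The rewrite above could be wrapped up roughly like this. `sum_next_three` is a hypothetical name, and the caller is assumed to guarantee that `*i + 3` is a valid index into `k`:

```c
/* Adds the three elements that follow index *i, then advances *i by 3.
   Every read of *i happens in the first statement; the one and only
   modification of *i is in a statement of its own. */
int sum_next_three(const int k[], int *i)
{
    int num = k[*i + 1] + k[*i + 2] + k[*i + 3];
    *i += 3;
    return num;
}
```

With the array from the question and `i` starting at 0, this returns 6 (that is, 1 + 2 + 3) and leaves `i` at 3, on every conforming compiler.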

黯淡〆 2025-02-18 01:57:31

A good explanation of what happens in this kind of computation is provided in document n1188 from the ISO WG14 site.

I'll summarize the ideas.

The main rule from the standard ISO 9899 that applies in this situation is 6.5p2.

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

The sequence points in an expression like i=i++ are before i= and after i++.

The paper cited above explains that you can picture the program as being made of small boxes, each box containing the instructions between two consecutive sequence points. The sequence points are defined in Annex C of the standard; in the case of i=i++ there are two sequence points, which delimit a full expression. Such an expression is syntactically equivalent to an expression-statement entry in the Backus-Naur form of the grammar (a grammar is provided in Annex A of the standard).

So the instructions inside a box have no defined order.

i=i++

can be interpreted as

tmp = i
i=i+1
i = tmp

or as

tmp = i
i = tmp
i=i+1

Because both of these ways of interpreting i=i++ are valid and they produce different answers, the behavior is undefined.
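The two orderings can be written out as ordinary, well-defined functions to see that they disagree. This is only a sketch: the names `assign_last` and `increment_last` are made up, and because the real expression is undefined, a compiler is not restricted to these two outcomes.

```c
/* First reading of i = i++: the increment is stored before the
   assignment, so the assignment overwrites it with the old value. */
int assign_last(int i)
{
    int tmp = i;
    i = i + 1;
    i = tmp;
    return i;
}

/* Second reading: the assignment is stored first, then the increment. */
int increment_last(int i)
{
    int tmp = i;
    i = tmp;
    i = i + 1;
    return i;
}
```

Starting from 1, `assign_last` yields 1 and `increment_last` yields 2, which matches the two different outputs people report for `i = 1; i = i++;`.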

So sequence points can be seen as the beginning and the end of each box that makes up the program (the boxes are the atomic units in C), and inside a box the order of instructions is not defined in all cases. Changing that order can sometimes change the result.

EDIT:

Other good sources for explaining such ambiguities are the entries on the c-faq site (also published as a book).

被翻牌 2025-02-18 01:57:31

The reason is that the program invokes undefined behavior. The problem lies in the evaluation order, because no sequence points are required according to the C++98 standard (no operation is sequenced before or after another, in C++11 terminology).

However, if you stick to one compiler, you will find the behavior consistent, as long as you don't add function calls or pointers, which would make the behavior messier.

Using Nuwen MinGW 15 GCC 7.1 you will get:

 #include<stdio.h>
 int main(int argc, char ** argv)
 {
    int i = 0;
    i = i++ + ++i;
    printf("%d\n", i); // 2

    i = 1;
    i = (i++);
    printf("%d\n", i); //1

    volatile int u = 0;
    u = u++ + ++u;
    printf("%d\n", u); // 2

    u = 1;
    u = (u++);
    printf("%d\n", u); //1

    register int v = 0;
    v = v++ + ++v;
    printf("%d\n", v); //2
 }

How does GCC work here? It evaluates the subexpressions on the right-hand side (RHS) from left to right, then assigns the value to the left-hand side (LHS). This is exactly how Java and C# behave and how their standards define it. (Yes, the equivalent code in Java and C# has defined behavior.) Within the RHS, each subexpression is evaluated one by one, left to right; for each subexpression, the pre-increment (++c) is evaluated first, then the value of c is used for the operation, then the post-increment (c++) is applied.

According to GCC C++: Operators:

In GCC C++, the precedence of the operators controls the order in which the individual operators are evaluated

The equivalent code in defined-behavior C++, as GCC understands it:

#include<stdio.h>
int main(int argc, char ** argv)
{
    int i = 0;
    //i = i++ + ++i;
    int r;
    r=i;
    i++;
    ++i;
    r+=i;
    i=r;
    printf("%d\n", i); // 2

    i = 1;
    //i = (i++);
    r=i;
    i++;
    i=r;
    printf("%d\n", i); // 1

    volatile int u = 0;
    //u = u++ + ++u;
    r=u;
    u++;
    ++u;
    r+=u;
    u=r;
    printf("%d\n", u); // 2

    u = 1;
    //u = (u++);
    r=u;
    u++;
    u=r;
    printf("%d\n", u); // 1

    register int v = 0;
    //v = v++ + ++v;
    r=v;
    v++;
    ++v;
    r+=v;
    v=r;
    printf("%d\n", v); //2
}

Next, Visual Studio. With Visual Studio 2015, you get:

#include<stdio.h>
int main(int argc, char ** argv)
{
    int i = 0;
    i = i++ + ++i;
    printf("%d\n", i); // 3

    i = 1;
    i = (i++);
    printf("%d\n", i); // 2 

    volatile int u = 0;
    u = u++ + ++u;
    printf("%d\n", u); // 3

    u = 1;
    u = (u++);
    printf("%d\n", u); // 2 

    register int v = 0;
    v = v++ + ++v;
    printf("%d\n", v); // 3 
}

How does Visual Studio work? It takes another approach: it evaluates all pre-increment expressions in a first pass, uses the variables' values in the operations in a second pass, assigns from RHS to LHS in a third pass, and finally evaluates all post-increment expressions in a last pass.

So the equivalent in defined-behavior C++, as Visual C++ understands it:

#include<stdio.h>
int main(int argc, char ** argv)
{
    int r;
    int i = 0;
    //i = i++ + ++i;
    ++i;
    r = i + i;
    i = r;
    i++;
    printf("%d\n", i); // 3

    i = 1;
    //i = (i++);
    r = i;
    i = r;
    i++;
    printf("%d\n", i); // 2 

    volatile int u = 0;
    //u = u++ + ++u;
    ++u;
    r = u + u;
    u = r;
    u++;
    printf("%d\n", u); // 3

    u = 1;
    //u = (u++);
    r = u;
    u = r;
    u++;
    printf("%d\n", u); // 2 

    register int v = 0;
    //v = v++ + ++v;
    ++v;
    r = v + v;
    v = r;
    v++;
    printf("%d\n", v); // 3 
}

As the Visual Studio documentation states at Precedence and Order of Evaluation:

Where several operators appear together, they have equal precedence and are evaluated according to their associativity. The operators in the table are described in the sections beginning with Postfix Operators.

北方的韩爷 2025-02-18 01:57:31

The key to understanding this is that the value of the expression i++ is i and its effect is to add 1 to i (i.e. store the value i+1 in the variable i), but that does not mean that the store will take place when the value is determined.

In an expression like i++ + ++i the value of the left-hand-side of the addition is i and right-hand-side is i+1.

But it's undefined when the effect of either side takes place, so the value of the whole expression (i++ + ++i) is undefined. Will the first reference to i use the value of i from before the current statement, or the value after the effect of the right-hand side? (No order of execution is guaranteed.) Or, vice versa, will the second reference to i use the value after the effect of the first one? The C Standard specifically states that this is undefined; defining it would constrain the optimiser to a specific order of execution, which is unhelpful.

It's perfectly reasonable (and possibly efficient) for a compiler to notice that the net effect is to increment i by 2, evaluate what amounts to i+i+1 and later store i+2 in i, or not do that.

What you should not do is try and work out what your compiler does and play to it.

Changes to the compiler optimisation settings, apparently (to you!) unrelated changes to the surrounding code or new releases of the compiler could all change the behaviour.

You lay yourself open to one of the most time consuming kinds of bug that suddenly arise in apparently unchanged code.

Write the code you need (e.g. 2*i+1; i+=2;) and realise that all modern commercial compilers will (when optimisation is on) translate that into the most efficient code for your platform and that it has an obvious and guaranteed meaning to all human readers.
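A minimal sketch of that advice as a function, assuming the intent behind `i++ + ++i` was "the old value plus the old value plus one, with i ending up two higher". `sum_and_advance` is a hypothetical name:

```c
/* Computes 2*i + 1 from the current value of *i, then advances *i by 2.
   The read and the write of *i are in separate statements, so the
   result is the same on every conforming compiler. */
int sum_and_advance(int *i)
{
    int value = 2 * *i + 1;
    *i += 2;
    return value;
}
```

Starting from `i == 0`, this returns 1 and leaves `i` at 2. Whether that is the answer you wanted from the original expression is exactly what writing it out forces you to decide.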

I even suggest never using ++ in any other expression than standalone and then only because it's easy to read. Don't imagine it's somehow more efficient than i=i+1 because all modern commercial compilers will emit the same code for both. They ain't daft.
