Why does n++ execute faster than n=n+1?

Posted on 2024-09-02 09:56:14

In C, why does n++ execute faster than n=n+1?

(int n=...;  n++;)
(int n=...;  n=n+1;)

Our instructor asked this question in today's class. (This is not homework.)

Comments (10)

一张白纸 2024-09-09 09:56:14

That would be true if you were working with a "stone-age" compiler...

In the "stone-age" case:
++n is faster than n++, which is faster than n=n+1
Machines usually have an "increment x" instruction as well as an "add constant to x" instruction.

  • In the case of n++, you will have only 2 memory accesses (read n, increment n, write n)
  • In the case of n=n+1, you will have 3 memory accesses (read n, read the constant, add n and the constant, write n)

But today's compilers will automatically convert n=n+1 to ++n, and they will do more than you may imagine!

Also, on today's out-of-order processors, even with a "stone-age" compiler, runtime may not be affected at all in many cases!


世界如花海般美丽 2024-09-09 09:56:14

On GCC 4.4.3 for x86, with or without optimizations, they compile to the exact same assembly code, and thus take the same amount of time to execute. As you can see in the assembly, GCC simply converts n++ into n=n+1, then optimizes it into a single add instruction (at -O2).

Your instructor's suggestion that n++ is faster only applies to very old, non-optimizing compilers, which were not smart enough to select the in-place update instructions for n = n + 1. These compilers have been obsolete in the PC world for years, but may still be found for weird proprietary embedded platforms.

C code:

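int n;  /* global, so the increment cannot be optimized away */

/* Post-increment used as a statement. */
void nplusplus(void) {
    n++;
}

/* Explicit read, add, and store. */
void nplusone(void) {
    n = n + 1;
}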

Output assembly (no optimizations):

    .file   "test.c"
    .comm   n,4,4
    .text
.globl nplusplus
    .type   nplusplus, @function
nplusplus:
    pushl   %ebp
    movl    %esp, %ebp
    movl    n, %eax
    addl    $1, %eax
    movl    %eax, n
    popl    %ebp
    ret
    .size   nplusplus, .-nplusplus
.globl nplusone
    .type   nplusone, @function
nplusone:
    pushl   %ebp
    movl    %esp, %ebp
    movl    n, %eax
    addl    $1, %eax
    movl    %eax, n
    popl    %ebp
    ret
    .size   nplusone, .-nplusone
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

Output assembly (with -O2 optimizations):

    .file   "test.c"
    .text
    .p2align 4,,15
.globl nplusplus
    .type   nplusplus, @function
nplusplus:
    pushl   %ebp
    movl    %esp, %ebp
    addl    $1, n
    popl    %ebp
    ret
    .size   nplusplus, .-nplusplus
    .p2align 4,,15
.globl nplusone
    .type   nplusone, @function
nplusone:
    pushl   %ebp
    movl    %esp, %ebp
    addl    $1, n
    popl    %ebp
    ret
    .size   nplusone, .-nplusone
    .comm   n,4,4
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

安稳善良 2024-09-09 09:56:14

The compiler will optimize n + 1 into nothingness.

Do you mean n = n + 1?

If so, they will compile to identical assembly. (Assuming that optimizations are on and that they're statements, not expressions)
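
The statement-versus-expression point can be illustrated with a small sketch. The function names are hypothetical and exist only for this example. A bare n + 1 computes a value and throws it away, while n++ and n = n + 1 yield different values when the expression result itself is used:

```c
/* Hypothetical helpers, named only for this illustration. */

/* A bare "n + 1" computes a value and discards it; n is unchanged.
   The (void) cast only silences "statement has no effect" warnings. */
int after_bare_sum(int n)   { (void)(n + 1); return n; }

/* "n = n + 1" stores the result back into n. */
int after_assignment(int n) { n = n + 1; return n; }

/* Used as expressions, the two increment forms yield different values. */
int value_of_postinc(int n) { return n++; }        /* old value of n */
int value_of_assign(int n)  { return n = n + 1; }  /* new value of n */
```

When the result is not used, as in the question, only the stored value of n matters, and that is the same for both forms.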

你对谁都笑 2024-09-09 09:56:14

Who says it does? Your compiler optimizes it all away, really, making it a moot point.

不再让梦枯萎 2024-09-09 09:56:14

Modern compilers should be able to recognize both forms as equivalent and convert them to whatever works best on your target platform. There is one exception to this rule: variable accesses that have side effects. For example, if n is a memory-mapped hardware register, reading from it and writing to it may do more than just transfer a data value (reading might clear an interrupt, for instance). You would use the volatile keyword to let the compiler know that it needs to be careful about optimizing accesses to n, and in that case the compiler might generate different code for n++ (an increment operation) and n = n + 1 (read, add, and store operations). However, for normal variables, the compiler should optimize both forms to the same thing.
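
A minimal sketch of the volatile case, assuming a hypothetical variable status_reg that stands in for a memory-mapped register. Here it is ordinary memory so that the example runs anywhere; on real hardware the address would come from the platform. Both functions must read and write the object each time they are called, and whether the compiler emits the same instructions for them is up to the compiler:

```c
/* Hypothetical stand-in for a memory-mapped hardware register.
   It is ordinary static storage here, purely for illustration. */
static volatile int status_reg = 0;

/* Because status_reg is volatile, the compiler may not merge,
   reorder away, or drop these accesses. */
void bump_with_postinc(void) { status_reg++; }
void bump_with_assign(void)  { status_reg = status_reg + 1; }

/* Reads the current value; this read is also a volatile access. */
int read_status(void) { return status_reg; }
```

Comparing the assembly generated for the two bump functions on a given compiler is one way to see whether it treats them differently.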

浅浅淡淡 2024-09-09 09:56:14

It doesn't really. The compiler will make changes specific to the target architecture. Micro-optimizations like this often have dubious benefits, but importantly, are certainly not worth the programmer's time.

秋意浓 2024-09-09 09:56:14

Actually, the reason is that the operator is defined differently for postfix than it is for prefix. ++n will increment "n" and return a reference to "n", while n++ will increment "n" and return a const copy of "n". Hence, the phrase n = n + 1 will be more efficient. But I have to agree with the above posters: good compilers should optimize away an unused return object.
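
The extra copy that post-increment implies can be sketched with two hypothetical helper functions that spell out what each operator does. One assumption to flag: the built-in C operators yield plain values rather than references or const copies, and the pointer parameter here is only a way for the helpers to modify the caller's variable.

```c
/* Hypothetical helpers that spell out the behavior of the built-in operators. */

/* Like ++(*p): increment first, then yield the new value. */
int pre_increment(int *p) {
    *p = *p + 1;
    return *p;
}

/* Like (*p)++: remember the old value, increment, yield the old value. */
int post_increment(int *p) {
    int old = *p;   /* the extra copy of the previous value */
    *p = *p + 1;
    return old;
}
```

When the caller ignores the return value, the saved copy is dead, and an optimizing compiler can typically remove it.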

許願樹丅啲祈禱 2024-09-09 09:56:14

In C, the side effect of the n++ expression is by definition equivalent to the side effect of the n = n + 1 expression. Since your code relies on the side effects only, it is immediately obvious that the correct answer is that these expressions always have exactly equivalent performance. (Regardless of any optimization settings in the compiler, BTW, since the issue has absolutely nothing to do with any optimizations.)

Any practical divergence in performance of these expressions is only possible if the compiler is intentionally (and maliciously!) trying to introduce that divergence. But in this case it can go either way, of course, i.e. whichever way the compiler's author wanted to skew it.
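
As a rough check of the claim that the side effects are equivalent, the sketch below applies both forms to the same inputs and compares what ends up stored. The helper names and sample values are hypothetical. Unsigned arithmetic is used so that the wraparound case at UINT_MAX is well defined:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helpers: apply each form and return what ends up stored. */
unsigned after_postinc(unsigned n) { n++;       return n; }
unsigned after_assign(unsigned n)  { n = n + 1; return n; }

/* Returns 1 if both forms store the same value for every sample,
   including the wraparound from UINT_MAX to 0; returns 0 otherwise. */
int forms_agree(void) {
    const unsigned samples[] = { 0u, 1u, 41u, UINT_MAX - 1u, UINT_MAX };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (after_postinc(samples[i]) != after_assign(samples[i]))
            return 0;
    }
    return 1;
}
```

This only checks the stored results, not timing; it says nothing about which instructions a particular compiler emits.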

梦一生花开无言 2024-09-09 09:56:14

I think it's more of a hardware question than a software one... If I remember correctly, in older CPUs n=n+1 requires two memory locations, whereas ++n is simply a single microcontroller instruction... But I doubt this applies to modern architectures...

话少情深 2024-09-09 09:56:14

All of these things depend on the compiler, the processor, and the compilation directives. So making any assumptions about "what is faster in general" is not a good idea.
