performance optimization c micro-optimization

为什么 n++执行速度比 n=n+1 更快？

发布于 2024-09-02 09:56:14 字数 161 浏览 8 评论 0原文

在C语言中，为什么n++执行速度比n=n+1快？

(int n=...;  n++;)
(int n=...;  n=n+1;)

我们的老师在今天的课堂上问了这个问题。（这不是家庭作业）

原文

In C language, Why does n++ execute faster than n=n+1?

(int n=...;  n++;)
(int n=...;  n=n+1;)

Our instructor asked that question in today's class. (this is not homework)

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

一张白纸 2024-09-09 09:56:14

如果您正在使用“石器时代”编译器，情况就是如此……

对于“石器时代”：
++n 比 n++ 快于 n=n+1
机器通常有增量x以及向x添加const

在n++的情况下，你将只有2个内存访问（读n, inc n, write n )
如果是 n=n+1，您将有 3 次内存访问（读取 n、读取 const、添加 n 和 const、写入 n）

但是今天的编译器会自动将 n=n+1 转换为 ++n，而且它会做的事情比你想象的还要多！！

还有今天的out-of-顺序处理器 - 尽管存在“石器时代”编译器 - 在许多情况下运行时可能根本不会受到影响！

相关

回复收藏 0 原文

世界如花海般美丽 2024-09-09 09:56:14

在 x86 的 GCC 4.4.3 上，无论有没有优化，它们都会编译为完全相同的汇编代码，因此执行时间相同。正如您在汇编中看到的，GCC 只是将 n++ 转换为 n=n+1，然后将其优化为单指令添加（在 -O2 中）。

您的讲师关于 n++ 更快的建议仅适用于非常旧的非优化编译器，这些编译器不够智能，无法为 n = n + 1 选择就地更新指令>。这些编译器在 PC 世界中已经过时多年，但在奇怪的专有嵌入式平台上仍然可以找到。

C 代码：

int n;

void nplusplus() {
    n++;
}

void nplusone() {
    n = n + 1;
}

输出汇编（无优化）：

    .file   "test.c"
    .comm   n,4,4
    .text
.globl nplusplus
    .type   nplusplus, @function
nplusplus:
    pushl   %ebp
    movl    %esp, %ebp
    movl    n, %eax
    addl    $1, %eax
    movl    %eax, n
    popl    %ebp
    ret
    .size   nplusplus, .-nplusplus
.globl nplusone
    .type   nplusone, @function
nplusone:
    pushl   %ebp
    movl    %esp, %ebp
    movl    n, %eax
    addl    $1, %eax
    movl    %eax, n
    popl    %ebp
    ret
    .size   nplusone, .-nplusone
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

输出汇编（带 -O2 优化）：

    .file   "test.c"
    .text
    .p2align 4,,15
.globl nplusplus
    .type   nplusplus, @function
nplusplus:
    pushl   %ebp
    movl    %esp, %ebp
    addl    $1, n
    popl    %ebp
    ret
    .size   nplusplus, .-nplusplus
    .p2align 4,,15
.globl nplusone
    .type   nplusone, @function
nplusone:
    pushl   %ebp
    movl    %esp, %ebp
    addl    $1, n
    popl    %ebp
    ret
    .size   nplusone, .-nplusone
    .comm   n,4,4
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

On GCC 4.4.3 for x86, with or without optimizations, they compile to the exact same assembly code, and thus take the same amount of time to execute. As you can see in the assembly, GCC simply converts n++ into n=n+1, then optimizes it into the one-instruction add (in the -O2).

Your instructor's suggestion that n++ is faster only applies to very old, non-optimizing compilers, which were not smart enough to select the in-place update instructions for n = n + 1. These compilers have been obsolete in the PC world for years, but may still be found for weird proprietary embedded platforms.

C code:

int n;

void nplusplus() {
    n++;
}

void nplusone() {
    n = n + 1;
}

Output assembly (no optimizations):

    .file   "test.c"
    .comm   n,4,4
    .text
.globl nplusplus
    .type   nplusplus, @function
nplusplus:
    pushl   %ebp
    movl    %esp, %ebp
    movl    n, %eax
    addl    $1, %eax
    movl    %eax, n
    popl    %ebp
    ret
    .size   nplusplus, .-nplusplus
.globl nplusone
    .type   nplusone, @function
nplusone:
    pushl   %ebp
    movl    %esp, %ebp
    movl    n, %eax
    addl    $1, %eax
    movl    %eax, n
    popl    %ebp
    ret
    .size   nplusone, .-nplusone
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

Output assembly (with -O2 optimizations):

    .file   "test.c"
    .text
    .p2align 4,,15
.globl nplusplus
    .type   nplusplus, @function
nplusplus:
    pushl   %ebp
    movl    %esp, %ebp
    addl    $1, n
    popl    %ebp
    ret
    .size   nplusplus, .-nplusplus
    .p2align 4,,15
.globl nplusone
    .type   nplusone, @function
nplusone:
    pushl   %ebp
    movl    %esp, %ebp
    addl    $1, n
    popl    %ebp
    ret
    .size   nplusone, .-nplusone
    .comm   n,4,4
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits

回复收藏 0 原文

安稳善良 2024-09-09 09:56:14

编译器会将n + 1优化为空。

您的意思是 n = n + 1 吗？

如果是这样，它们将编译为相同的程序集。（假设优化已开启并且它们是语句，而不是表达式）

回复收藏 0 原文

你对谁都笑 2024-09-09 09:56:14

谁说可以的？您的编译器实际上将其全部优化掉，使其成为一个没有实际意义的问题。

回复收藏 0 原文

不再让梦枯萎 2024-09-09 09:56:14

现代编译器应该能够将这两种形式识别为等效形式，并将它们转换为最适合您的目标平台的格式。此规则有一个例外：具有副作用的变量访问。例如，如果n是某个内存映射的硬件寄存器，则从它读取和写入它可能不仅仅只是传输数据值（例如，读取可能会清除中断）。您可以使用易失性关键字让编译器知道它需要小心优化对n的访问，在这种情况下，编译器可能会生成与不同的代码>n++（递增操作）和n = n + 1（读取、添加和存储操作）。然而，对于普通变量，编译器应该将两种形式优化为相同的东西。