Which is faster: ++, += or x + 1?

Posted 2024-11-17 04:02:24

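I am using C# (this question also applies to similar languages such as C++), and I am trying to figure out the fastest and most efficient way to increment a value. It isn't just one or two increments: in my game there are about 300 increments per second. The frame of every sprite on the screen is incrementing, as are the speed and position of my RPG character, the offset of the camera, and so on. So I am wondering which way is the most efficient. For example, to add 5 to y_pos on every movement I can do any of these: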

Player.YPos += 5;

Player.YPos = Player.YPos + 5;

for (int i = 0; i < 5; i++)
{
    Player.YPos++;
}

Which is the most efficient (and fastest)?


祁梦 2024-11-24 04:02:25

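Options 1 and 2 will result in identical code being produced by the compiler. Option 3 will be much slower.

It is a fallacy that i++ is faster than i += 1 or even i = i + 1. Any decent compiler will turn those three statements into the same code.

For operations as trivial as addition, write the clearest code and let the compiler worry about making it fast.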
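To make the three options from the question concrete, here is a minimal C++ sketch. The `Player` struct and the `move_*` function names are hypothetical stand-ins for the question's code (in the C# original, `YPos` may well be a property rather than a plain field). It only shows that the three forms leave the value in the same state; it says nothing by itself about speed.

```cpp
// Hypothetical stand-in for the question's Player type; only YPos matters here.
struct Player {
    int YPos = 0;
};

// Option 1: compound assignment.
void move_compound(Player& p) { p.YPos += 5; }

// Option 2: explicit addition followed by assignment.
void move_explicit(Player& p) { p.YPos = p.YPos + 5; }

// Option 3: five single increments in a loop.
void move_loop(Player& p) {
    for (int i = 0; i < 5; i++) {
        p.YPos++;
    }
}
```

Of the three, options 1 and 2 state the intent directly, which is the "clearest code" argument made above.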


寄意 2024-11-24 04:02:25

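The compiler should produce the same assembly for options 1 and 2, and it may unroll the loop in option 3. When faced with questions like this, a useful way to test empirically what is going on is to look at the assembly the compiler produces. With g++ you can get it with the -S switch.

For example, options 1 and 2 both produce this assembly when compiled with the command g++ -S inc.cpp (using g++ 4.5.2):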


main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    addl    $5, -4(%rbp)
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

Without optimisation, g++ produces significantly less efficient assembly for option 3:


main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    movl    $0, -8(%rbp)
    jmp .L2
.L3:
    addl    $1, -4(%rbp)
    addl    $1, -8(%rbp)
.L2:
    cmpl    $4, -8(%rbp)
    setle   %al
    testb   %al, %al
    jne .L3
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

But with optimisation turned on (even at -O1), g++ produces this for all 3 options:


main:
.LFB0:
    .cfi_startproc
    leal    5(%rdi), %eax
    ret
    .cfi_endproc

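g++ not only collapses the loop in option 3 into a single addition, it also uses the lea instruction to compute the sum and place it in the return register in one step, instead of faffing about with mov.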

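So g++ always produces the same assembly for options 1 and 2. It produces the same assembly for all 3 options only if you explicitly turn optimisation on (which is the behaviour you would probably expect).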
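The source of inc.cpp is not shown in this answer, so the following is only a sketch of a file you could use to repeat the comparison yourself. The file name `inc_compare.cpp` and the function names are assumptions. The optimised listing above reads its input from %rdi, the first integer argument register on x86-64 System V, so each function here also takes its input as a parameter rather than using a local variable.

```cpp
// inc_compare.cpp (hypothetical file name)
// Generate assembly with:  g++ -S inc_compare.cpp
// and again with:          g++ -O1 -S inc_compare.cpp
// Each command writes inc_compare.s, so rename or copy it between runs.

// Option 1: compound assignment.
int add_compound(int y) {
    y += 5;
    return y;
}

// Option 2: explicit addition followed by assignment.
int add_explicit(int y) {
    y = y + 5;
    return y;
}

// Option 3: five single increments in a loop.
int add_loop(int y) {
    for (int i = 0; i < 5; i++) {
        y++;
    }
    return y;
}
```

Comparing the three function bodies in the two .s files should show the pattern described above, although the exact instructions depend on the compiler version and target.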

(It looks like you should be able to inspect the assembly produced for C# too, although I have never tried that.)


哭了丶谁疼 2024-11-24 04:02:25

They are the same:

static void Main(string[] args)
{
    int a = 0;
    a++;
    a += 1;
    a = a + 1;
}

ILSpy decompiles the code above as:

private static void Main(string[] args)
{
    int a = 0;
    a++;
    a++;
    a++;
}

The IL for all of them is the same as well (in Release mode):

.method private hidebysig static void  Main(string[] args) cil managed
{
    .entrypoint
    // Code size       15 (0xf)
    .maxstack  2
    .locals init ([0] int32 a)
    IL_0000:  ldc.i4.0
    IL_0001:  stloc.0
    IL_0002:  ldloc.0
    IL_0003:  ldc.i4.1
    IL_0004:  add
    IL_0005:  stloc.0
    IL_0006:  ldloc.0
    IL_0007:  ldc.i4.1
    IL_0008:  add
    IL_0009:  stloc.0
    IL_000a:  ldloc.0
    IL_000b:  ldc.i4.1
    IL_000c:  add
    IL_000d:  stloc.0
    IL_000e:  ret
} // end of method Program::Main
