Which is faster? ++, += or x + 1?

Posted on 2024-11-17 04:02:24

I am using C# (this question is also valid for similar languages like C++) and I am trying to figure out the fastest and most efficient way to increment. It isn't just one or two increments; in my game, it's about 300 increments per second: the frames of every sprite on the screen, the speed and position of my RPG character, the camera offset, and so on. So which way is the most efficient? For example, to increment y_pos by 5 on every movement I can do:

1.

Player.YPos += 5;

2.

Player.YPos = Player.YPos + 5;

3.

for (int i = 0; i < 5; i++)
{
    Player.YPos++;
}

Which is the most efficient (and fastest)?

Comments (5)

祁梦 2024-11-24 04:02:25

Options 1 and 2 will result in identical code being produced by the compiler. Option 3 will be much slower.

It's a fallacy that i++ is faster than i += 1 or even i = i + 1. All decent compilers will turn those three statements into the same code.

For such a trivial operation as addition, write the clearest code and let the compiler worry about making it fast.

寄意 2024-11-24 04:02:25

The compiler should produce the same assembly for 1 and 2 and it may unroll the loop in option 3. When faced with questions like this, a useful tool you can use to empirically test what's going on is to look at the assembly produced by the compiler. In g++ this can be achieved using the -S switch.
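The answer does not show the contents of inc.cpp, so the following is only a sketch of a file that could be used for this kind of inspection. The function names are hypothetical, and because each option lives in its own function, the labels in the generated inc.s would be mangled function names rather than main.

```cpp
// inc.cpp (hypothetical): one function per option from the question.
// Compile with `g++ -S inc.cpp` or `g++ -S -O1 inc.cpp`, then compare
// the three function bodies in the generated inc.s.

// Option 1: compound assignment.
int add_compound(int y_pos) {
    y_pos += 5;
    return y_pos;
}

// Option 2: explicit addition and assignment.
int add_explicit(int y_pos) {
    y_pos = y_pos + 5;
    return y_pos;
}

// Option 3: five single increments in a loop.
int add_loop(int y_pos) {
    for (int i = 0; i < 5; i++) {
        y_pos++;
    }
    return y_pos;
}
```

Keeping the options in separate functions makes it easy to diff their assembly side by side; the exact instructions will depend on the compiler version and flags.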

For example, both options 1 and 2 produce this assembly when compiled with the command g++ -S inc.cpp (using g++ 4.5.2):


main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    addl    $5, -4(%rbp)
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

g++ produces significantly less efficient assembler for option 3:


main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    movq    %rsp, %rbp
    .cfi_offset 6, -16
    .cfi_def_cfa_register 6
    movl    $0, -8(%rbp)
    jmp .L2
.L3:
    addl    $1, -4(%rbp)
    addl    $1, -8(%rbp)
.L2:
    cmpl    $4, -8(%rbp)
    setle   %al
    testb   %al, %al
    jne .L3
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

But with optimisation on (even -O1) g++ produces this for all 3 options:


main:
.LFB0:
    .cfi_startproc
    leal    5(%rdi), %eax
    ret
    .cfi_endproc

g++ not only unrolls the loop in option 3, but it also uses the lea instruction to do the addition in a single instruction instead of faffing about with mov.

So g++ will always produce the same assembly for options 1 and 2. g++ will produce the same assembly for all 3 options only if you explicitly turn optimisation on (which is the behaviour you'd probably expect).

(It looks like you should be able to inspect the assembly produced by C# too, although I've never tried that.)

哭了丶谁疼 2024-11-24 04:02:25

They are the same:

static void Main(string[] args)
{
    int a = 0;
    a++;
    a += 1;
    a = a + 1;
}

The above code in ILSpy is:

private static void Main(string[] args)
{
    int a = 0;
    a++;
    a++;
    a++;
}

The IL for all of these is the same as well (in Release mode):

.method private hidebysig static void  Main(string[] args) cil managed
{
    .entrypoint
    // Code size       15 (0xf)
    .maxstack  2
    .locals init ([0] int32 a)
    IL_0000:  ldc.i4.0
    IL_0001:  stloc.0
    IL_0002:  ldloc.0
    IL_0003:  ldc.i4.1
    IL_0004:  add
    IL_0005:  stloc.0
    IL_0006:  ldloc.0
    IL_0007:  ldc.i4.1
    IL_0008:  add
    IL_0009:  stloc.0
    IL_000a:  ldloc.0
    IL_000b:  ldc.i4.1
    IL_000c:  add
    IL_000d:  stloc.0
    IL_000e:  ret
} // end of method Program::Main

梦中的蝴蝶 2024-11-24 04:02:25

Options 1 and 2 will result in identical code after compilation. Option 3 will be much slower because the for loop adds extra code (a counter, a comparison and a jump).

星軌x 2024-11-24 04:02:24

(Answer specific to C# as C++ may vary significantly.)

1 and 2 are equivalent.

3 would definitely be slower.

Having said that, doing this a mere 300 times a second, you wouldn't notice any difference. Are you aware of just how much a computer can do in terms of raw CPU+memory in a second? In general, you should write code for clarity as the most important thing. By all means worry about performance - but only when you have a way to measure it, in order to a) tell whether you need to worry, and b) whether any changes actually improve the performance.

In this case, I'd say that option 1 is the clearest, so that's what I'd use.
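As a rough illustration of "have a way to measure it", here is a minimal timing sketch in C++ (the thread covers both C# and C++). The Player struct, the move_* functions and the time_ns helper are all hypothetical names introduced for this example. With optimisation enabled the compiler may collapse all three variants to the same code, so the measured times may be indistinguishable or close to zero.

```cpp
#include <chrono>
#include <cstdint>

// Stand-in for the question's Player object (hypothetical).
struct Player {
    int YPos = 0;
};

// The three options from the question.
inline void move_compound(Player& p) { p.YPos += 5; }
inline void move_explicit(Player& p) { p.YPos = p.YPos + 5; }
inline void move_loop(Player& p) {
    for (int i = 0; i < 5; i++) {
        p.YPos++;
    }
}

// Calls `step` `iterations` times and returns the elapsed time in
// nanoseconds, measured with a monotonic clock.
template <typename Step>
std::int64_t time_ns(Step step, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        step();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

A call such as `time_ns([&] { move_compound(player); }, 1000000)` would time a million moves. Repeating each measurement several times and reading player.YPos afterwards helps make sure the work was not optimised away entirely.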
