Performance comparison in C++ (plain function call vs. for_each + mem_fun vs. lambda expression)
Which of these snippets performs best?
1)
for (list<Enemy*>::iterator iter = enemies.begin(); iter != enemies.end(); iter++)
    (*iter)->prepare(time_elapsed);
2)
for_each(enemies.begin(), enemies.end(), [time_elapsed] (Enemy *e) {e->prepare(time_elapsed);});
3)
for_each(enemies.begin(), enemies.end(), bind2nd(mem_fun1<void, Enemy, GLfloat>(&Enemy::prepare), time_elapsed));
Answers (4)
Lambdas are the fastest solution. There are special optimizations involved with taking references to stack-based variables. In addition, in C++0x, they're FAR more flexible than any of that bind stuff, and the first loop is also less clear. Lambdas are the winner in every way.
However, this strikes me as serious micro-optimization, unless this is in a really, really inner loop that runs billions of times.
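As a rough illustration of why the lambda version can be inlined like a hand-written loop: a lambda expression is shorthand for an unnamed function object whose call operator is visible to the compiler at the call site. The sketch below spells that out by hand. `Enemy` here is a hypothetical stand-in for the question's class, `float` replaces `GLfloat`, and `PrepareCall`, `prepare_with_functor`, and `prepare_with_lambda` are illustrative names, not part of the original code.

```cpp
#include <algorithm>
#include <cassert>
#include <list>

// Hypothetical stand-in for the question's Enemy class.
struct Enemy {
    float total = 0.0f;
    void prepare(float time_elapsed) { total += time_elapsed; }
};

// Roughly what the compiler generates for
//   [time_elapsed](Enemy* e) { e->prepare(time_elapsed); }
struct PrepareCall {
    float time_elapsed;  // the by-value capture becomes a data member
    void operator()(Enemy* e) const { e->prepare(time_elapsed); }
};

// for_each with the hand-written function object.
void prepare_with_functor(std::list<Enemy*>& enemies, float time_elapsed) {
    std::for_each(enemies.begin(), enemies.end(), PrepareCall{time_elapsed});
}

// for_each with the lambda from snippet 2).
void prepare_with_lambda(std::list<Enemy*>& enemies, float time_elapsed) {
    std::for_each(enemies.begin(), enemies.end(),
                  [time_elapsed](Enemy* e) { e->prepare(time_elapsed); });
}
```

Both functions should behave identically. Whether they compile to the same machine code depends on the compiler and optimization level, so it is still worth measuring.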
I took measurements with compiler optimizations disabled:
2) and 3) have almost the same running time, whereas 1) is 10 times slower. I also added a 4), valid for Visual C++ 2010, which should be semantically the same as Ferruccio's suggestion.
4) is also almost as fast as 2) and 3).
With -O2, all of them have almost the same running time.
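A minimal sketch of how such a comparison could be timed, assuming a C++11 compiler. `Enemy`, `time_us`, and the `variant_*` functions are hypothetical names, and `float` replaces `GLfloat`. Note that `std::bind` is used here in place of `bind2nd`/`mem_fun1`, because those older binders were deprecated in C++11 and removed in C++17; this is a substitution, not the code that was measured above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <list>

// Hypothetical stand-in for the question's Enemy class.
struct Enemy {
    float total = 0.0f;
    void prepare(float time_elapsed) { total += time_elapsed; }
};

// Runs `work` once and returns the elapsed time in microseconds.
template <typename Work>
long long time_us(Work work) {
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// 1) explicit iterator loop
void variant_loop(std::list<Enemy*>& enemies, float t) {
    for (std::list<Enemy*>::iterator it = enemies.begin(); it != enemies.end(); it++)
        (*it)->prepare(t);
}

// 2) for_each with a lambda
void variant_lambda(std::list<Enemy*>& enemies, float t) {
    std::for_each(enemies.begin(), enemies.end(),
                  [t](Enemy* e) { e->prepare(t); });
}

// 3) for_each with a bound member function (std::bind as a stand-in for bind2nd)
void variant_bind(std::list<Enemy*>& enemies, float t) {
    using std::placeholders::_1;
    std::for_each(enemies.begin(), enemies.end(),
                  std::bind(&Enemy::prepare, _1, t));
}
```

A call such as `time_us([&] { variant_lambda(enemies, 0.016f); })` times one pass over the list. For meaningful numbers, use a large list, repeat each variant several times, and build with the optimization level you actually ship.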
2 and 3 are essentially identical. 1 might be faster because it performs one function call per iteration, whereas 2 and 3 perform two function calls per iteration. Then again, some of the function calls might get inlined. The only way to really tell is to measure.
Also, since you're throwing in lambda functions (C++0x), why not add range-based for loops to your measurements, assuming your compiler supports them, of course?
EDIT: I just noticed the vc++2010 tag. Unfortunately, your compiler does not yet support them :-(
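For compilers that do support it, a range-based for version of the question's loop might look like the sketch below. `Enemy` and `prepare_all` are hypothetical names used only for illustration, and `float` replaces `GLfloat`.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for the question's Enemy class.
struct Enemy {
    float total = 0.0f;
    void prepare(float time_elapsed) { total += time_elapsed; }
};

// Range-based for (C++11): visits every pointer in the list in order.
void prepare_all(std::list<Enemy*>& enemies, float time_elapsed) {
    for (Enemy* e : enemies)
        e->prepare(time_elapsed);
}
```

The range-based form is defined in terms of `begin()` and `end()`, so it should be comparable to snippet 1) once optimizations are enabled, though that is worth confirming by measurement.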
The answer is "it's not at all important until you've measured that something is currently not fast enough". This question is essentially premature optimisation. Until it's too slow and you've measured the loop as being a bottleneck, your priority is to use the code that communicates what you're trying to do in the clearest way. Write nice code first, optimise later.