Efficiency of premature return in a function
This is a situation I encounter frequently as an inexperienced programmer, and I'm wondering about it particularly for an ambitious, speed-intensive project of mine that I'm trying to optimize. For the major C-like languages (C, Objective-C, C++, Java, C#, etc.) and their usual compilers, will these two functions run just as efficiently? Is there any difference in the compiled code?
    void foo1(bool flag)
    {
        if (flag)
        {
            //Do stuff
            return;
        }
        //Do different stuff
    }

    void foo2(bool flag)
    {
        if (flag)
        {
            //Do stuff
        }
        else
        {
            //Do different stuff
        }
    }
Basically, is there ever a direct efficiency bonus/penalty when breaking or returning early? How is the stack frame involved? Are there optimized special cases? Are there any factors (like inlining or the size of "Do stuff") that could affect this significantly?
I'm always a proponent of improved legibility over minor optimizations (I see foo1 a lot with parameter validation), but this comes up so frequently that I'd like to set aside all worry once and for all.
And I'm aware of the pitfalls of premature optimization... ugh, those are some painful memories.
EDIT: I accepted an answer, but EJP's answer explains pretty succinctly why the cost of a return is practically negligible (in assembly, the return creates a 'branch' to the end of the function, which is extremely fast; the branch alters the PC register and may also affect the cache and pipeline, but that effect is pretty minuscule). For this case in particular, it literally makes no difference because both the if/else and the return create the same branch to the end of the function.
There is no difference at all:
Meaning no difference in generated code whatsoever even without optimization in two compilers
The short answer is, no difference. Do yourself a favour and stop worrying about this. The optimising compiler is almost always smarter than you.
Concentrate on readability and maintainability.
If you want to see what happens, build these with optimisations on and look at the assembler output.
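A minimal sketch of a file one could hand to the compiler for that comparison. The `do_stuff` and `do_different_stuff` helpers and the two counters are hypothetical stand-ins for the question's placeholder comments; the counters are `volatile` so the optimizer cannot simply delete the work. With GCC or Clang, `-O2 -S` writes the generated assembly to a file for inspection.

```c
/* early_return.c: hypothetical test file for comparing the two styles.
 * Emit assembly for inspection with, for example:
 *   gcc   -O2 -S early_return.c -o early_return.s
 *   clang -O2 -S early_return.c -o early_return.s
 */
#include <stdbool.h>

/* Stand-ins for "Do stuff" / "Do different stuff" (assumed for this sketch). */
volatile int stuff_count = 0;
volatile int different_count = 0;

static void do_stuff(void) { stuff_count++; }
static void do_different_stuff(void) { different_count++; }

/* Early-return style. */
void foo1(bool flag)
{
    if (flag) {
        do_stuff();
        return;
    }
    do_different_stuff();
}

/* if/else style. */
void foo2(bool flag)
{
    if (flag) {
        do_stuff();
    } else {
        do_different_stuff();
    }
}
```

Comparing the bodies of `foo1` and `foo2` in the resulting `.s` file shows what your particular compiler does; the exact output depends on the compiler, version, and target.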
Interesting answers: Although I do agree with all of them (so far), there are possible connotations to this question that are up to now completely disregarded.
If the simple example above is extended with resource allocation, and then error checking with a potential resulting freeing of resources, the picture might change.
Consider the naive approach beginners might take:
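A minimal sketch of what such a naive approach might look like. The `allocate_resource` and `free_resource` helpers, the `fail_at` switch, and the `live` counter are all hypothetical, invented here so the example can run on its own; real code would acquire memory, file handles, locks, and so on.

```c
#include <stdlib.h>

/* Hypothetical resource helpers, invented for this sketch. */
int fail_at = 0;   /* which allocation should fail (0 = none) */
int live = 0;      /* number of resources currently held */

static void *allocate_resource(int n)
{
    if (n == fail_at)
        return NULL;            /* simulated failure */
    void *r = malloc(1);
    if (r)
        live++;
    return r;
}

static void free_resource(void *r)
{
    free(r);
    live--;
}

/* Every failure path repeats the cleanup of everything acquired so far. */
int func_naive(void)
{
    void *a = allocate_resource(1);
    if (!a) {
        return 1;
    }
    void *b = allocate_resource(2);
    if (!b) {
        free_resource(a);
        return 2;
    }
    void *c = allocate_resource(3);
    if (!c) {
        free_resource(b);
        free_resource(a);
        return 3;
    }

    /* ... do the actual work with a, b and c here ... */

    free_resource(c);
    free_resource(b);
    free_resource(a);
    return 0;
}
```

Each additional resource adds one more cleanup call to every later failure branch, which is where the repetition comes from.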
The above would represent an extreme version of the style of returning prematurely. Notice how the code becomes very repetitive and non-maintainable over time when its complexity grows. Nowadays people might use exception handling to catch these.
Philip suggested, after looking at the goto example below, using a break-less switch/case inside the catch block above. One could switch(typeof(e)) and then fall through the free_resourcex() calls, but this is not trivial and needs design consideration. And remember that a switch/case without breaks is exactly like the goto with daisy-chained labels below...
As Mark B pointed out, in C++ it is considered good style to follow the Resource Acquisition Is Initialization principle, RAII for short. The gist of the concept is to use object instantiation to acquire resources. The resources are then automatically freed as soon as the objects go out of scope and their destructors are called. For interdependent resources, special care has to be taken to ensure the correct order of deallocation and to design the types of objects such that the required data is available to all destructors.
Or, in pre-exception days, one might do:
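A minimal sketch of that style: allocate everything up front, do the work only if every allocation succeeded, then free whatever was obtained. As before, `allocate_resource`, `free_resource`, `fail_at`, `live`, and `work_done` are hypothetical names assumed only for illustration.

```c
#include <stdlib.h>

/* Hypothetical resource helpers, invented for this sketch. */
int fail_at = 0;     /* which allocation should fail (0 = none) */
int live = 0;        /* number of resources currently held */
int work_done = 0;   /* how many times the work actually ran */

static void *allocate_resource(int n)
{
    if (n == fail_at)
        return NULL;            /* simulated failure */
    void *r = malloc(1);
    if (r)
        live++;
    return r;
}

static void free_resource(void *r)
{
    free(r);
    live--;
}

/* Allocate all resources first, then check them in one place. */
int func_all_at_once(void)
{
    void *a = allocate_resource(1);
    void *b = allocate_resource(2);
    void *c = allocate_resource(3);

    if (a && b && c) {
        work_done++;            /* stand-in for the actual work */
    }

    /* Release only what was actually obtained, in reverse order. */
    if (c) free_resource(c);
    if (b) free_resource(b);
    if (a) free_resource(a);
    return 0;                   /* the caller cannot tell what failed */
}
```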
But this over-simplified example has several drawbacks: it can be used only if the allocated resources do not depend on each other (e.g. it could not be used for allocating memory, then opening a file handle, then reading data from the handle into the memory), and it does not provide individual, distinguishable error codes as return values.
To keep code fast(!), compact, and easily readable and extensible, Linus Torvalds enforced a different style for kernel code that deals with resources, even using the infamous goto in a way that makes absolute sense:
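A minimal sketch of that goto-based cleanup style, written in the spirit of the kernel convention rather than copied from any kernel source. The helpers, `fail_at`, `live`, `work_done`, and the label names are hypothetical and assumed only for illustration.

```c
#include <stdlib.h>

/* Hypothetical resource helpers, invented for this sketch. */
int fail_at = 0;     /* which allocation should fail (0 = none) */
int live = 0;        /* number of resources currently held */
int work_done = 0;   /* how many times the work actually ran */

static void *allocate_resource(int n)
{
    if (n == fail_at)
        return NULL;            /* simulated failure */
    void *r = malloc(1);
    if (r)
        live++;
    return r;
}

static void free_resource(void *r)
{
    free(r);
    live--;
}

/* Each failure jumps to the label that undoes exactly what succeeded. */
void func_goto(void)
{
    void *a, *b, *c;

    a = allocate_resource(1);
    if (!a)
        goto error_a;
    b = allocate_resource(2);
    if (!b)
        goto error_b;
    c = allocate_resource(3);
    if (!c)
        goto error_c;

    work_done++;                /* stand-in for the actual work */

    free_resource(c);           /* the success path falls through the labels */
error_c:
    free_resource(b);
error_b:
    free_resource(a);
error_a:
    return;
}
```

The cleanup code exists exactly once, and the labels release resources in the reverse order of acquisition, so dependent resources are handled naturally.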
The gist of the discussion on the kernel mailing lists is that most language features that are "preferred" over the goto statement are implicit gotos, such as huge, tree-like if/else, exception handlers, loop/break/continue statements, etc. And the gotos in the above example are considered OK, since they jump only a small distance, have clear labels, and free the code of other clutter for keeping track of the error conditions. This question has also been discussed here on Stack Overflow.
However, what's missing in the last example is a nice way to return an error code. I was thinking of adding a result_code++ after each free_resource_x() call, and returning that code, but this offsets some of the speed gains of the above coding style. And it's hard to return 0 in case of success. Maybe I'm just unimaginative ;-)
So, yes, I do think there is a big difference in the question of coding premature returns or not. But I also think it is apparent only in more complicated code that is harder or impossible for the compiler to restructure and optimize. Which is usually the case once resource allocation comes into play.
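On the error-code point, one idiom often seen in C code (not something proposed in this answer) is to set a result variable just before each jump, so that the shared cleanup path can return it and success still yields 0. A minimal sketch, with the helpers, `fail_at`, `live`, and the label names all hypothetical:

```c
#include <stdlib.h>

/* Hypothetical resource helpers, invented for this sketch. */
int fail_at = 0;   /* which allocation should fail (0 = none) */
int live = 0;      /* number of resources currently held */

static void *allocate_resource(int n)
{
    if (n == fail_at)
        return NULL;            /* simulated failure */
    void *r = malloc(1);
    if (r)
        live++;
    return r;
}

static void free_resource(void *r)
{
    free(r);
    live--;
}

/* Same goto cleanup chain, but the failing step records its own code. */
int func_goto_with_code(void)
{
    int result = 0;             /* stays 0 on the success path */
    void *a, *b, *c;

    a = allocate_resource(1);
    if (!a) { result = 1; goto out; }
    b = allocate_resource(2);
    if (!b) { result = 2; goto free_a; }
    c = allocate_resource(3);
    if (!c) { result = 3; goto free_b; }

    /* ... do the actual work with a, b and c here ... */

    free_resource(c);
free_b:
    free_resource(b);
free_a:
    free_resource(a);
out:
    return result;
}
```

The extra cost is one store on each failure path only, so the success path is unchanged.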
Even though this isn't much of an answer, a production compiler is going to be much better at optimizing than you are. I would favor readability and maintainability over these kinds of optimizations.
To be specific about this, the return will be compiled into a branch to the end of the method, where there will be a RET instruction or whatever it may be. If you leave it out, the end of the block before the else will be compiled into a branch to the end of the else block. So you can see that in this specific case it makes no difference whatsoever.
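To make the branch described in that answer visible, here is a conceptual sketch that rewrites both functions with explicit gotos. This is an illustration of the control flow only, not actual compiler output; the `stuff` and `different` counters and the label names are hypothetical stand-ins for the question's placeholder comments.

```c
#include <stdbool.h>

/* Stand-ins for "Do stuff" / "Do different stuff" (assumed for this sketch). */
int stuff = 0;
int different = 0;

/* foo1 with its early return spelled out as a jump to the function's end. */
void foo1_lowered(bool flag)
{
    if (!flag)
        goto different_stuff;
    stuff++;                    /* Do stuff */
    goto end;                   /* this is the explicit `return` */
different_stuff:
    different++;                /* Do different stuff */
end:
    return;
}

/* foo2 with the if/else spelled out: the end of the `if` block must
 * jump over the else block, which lands at the same place. */
void foo2_lowered(bool flag)
{
    if (!flag)
        goto else_block;
    stuff++;                    /* Do stuff */
    goto end;                   /* implicit jump past the else block */
else_block:
    different++;                /* Do different stuff */
end:
    return;
}
```

Apart from the label names, the two bodies are identical, which is the point being made: both forms need the same single branch to the end of the function.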
If you really want to know if there's a difference in compiled code for your particular compiler and system, you'll have to compile and look at the assembly yourself.
However in the big scheme of things it's almost certain that the compiler can optimize better than your fine tuning, and even if it can't it's very unlikely to actually matter for your program's performance.
Instead, write the code in the clearest way for humans to read and maintain, and let the compiler do what it does best: Generate the best assembly it can from your source.
In your example, the return is noticeable. What happens to the person debugging when the return is a page or two above/below where //do different stuff occurs? Much harder to find/see when there is more code.
I agree strongly with blueshift: readability and maintainability first! But if you're really worried (or just want to learn what your compiler is doing, which is definitely a good idea in the long run), you should look for yourself.
This will mean using a decompiler or looking at low-level compiler output (e.g. assembly language). In C#, or any .NET language, the tools documented here will give you what you need.
But as you yourself have observed, this is probably premature optimization.
From Clean Code: A Handbook of Agile Software Craftsmanship
in code will just make the reader navigate to the function and waste time reading foo(boolean flag)
A better-structured code base will give you a better opportunity to optimize the code.
One school of thought (I can't remember the egghead who proposed it at the moment) is that, from a structural point of view, every function should have only one return point, to make the code easier to read and debug. That, I suppose, is more a matter for programming religious debate.
One technical reason you may want to control when and how a function exits is real-time work: when you are coding real-time applications, you may want to make sure that all control paths through the function take the same number of clock cycles to complete.
I'm glad you brought this question up. You should always use the branches over an early return. Why stop there? Merge all your functions into one if you can (at least as much as you can). This is doable if there is no recursion. In the end, you will have one massive main function, but that is what you need/want for this sort of thing. Afterward, rename your identifiers to be as short as possible. That way when your code is executed, less time is spent reading names. Next do ...