How to avoid out parameters?
I've seen numerous arguments that using a return value is preferable to out parameters. I am convinced of the reasons why to avoid them, but I find myself unsure if I'm running into cases where it is unavoidable.
Part One of my question is: What are some of your favorite/common ways of getting around using an out parameter? Stuff along the lines of: "Man, in peer reviews I always see other programmers do this when they could have easily done it this way."
Part Two of my question deals with some specific cases I've encountered where I would like to avoid an out parameter but cannot think of a clean way to do so.
Example 1:
I have a class with an expensive copy that I would like to avoid. Work can be done on the object and this builds up the object to be expensive to copy. The work to build up the data is not exactly trivial either. Currently, I will pass this object into a function that will modify the state of the object. This to me is preferable to new'ing the object internal to the worker function and returning it back, as it allows me to keep things on the stack.
class ExpensiveCopy // Defines some interface I can't change.
{
public:
    ExpensiveCopy(); // Needed by the calling code below.
    ExpensiveCopy(const ExpensiveCopy& toCopy) { /* Ouch! This hurts. */ }
    ExpensiveCopy& operator=(const ExpensiveCopy& toCopy) { /* Ouch! This hurts. */ return *this; }
    void addToData(SomeData);
    SomeData getData();
};
class B
{
public:
    static void doWork(ExpensiveCopy& ec_out, int someParam);
    // or
    // Your Function Here.
};
Using my function, I get calling code like this:
const int SOME_PARAM = 5;
ExpensiveCopy toModify;
B::doWork(toModify, SOME_PARAM);
I'd like to have something like this:
ExpensiveCopy theResult = B::doWork(SOME_PARAM);
But I don't know if this is possible.
Second Example:
I have an array of objects. The objects in the array are a complex type, and I need to do work on each element, work that I'd like to keep separated from the main loop that accesses each element. The code currently looks like this:
void doWork(ComplexType& ct_out)
{
    // Do work on the individual element.
}
std::vector<ComplexType> theCollection;
for (std::size_t index = 0; index < theCollection.size(); ++index)
{
    doWork(theCollection[index]);
}
Any suggestions on how to deal with some of these situations? I work primarily in C++, but I'm interested to see if other languages facilitate an easier setup. I have encountered RVO as a possible solution, but I need to read up more on it and it sounds like a compiler specific feature.
Comments (9)
As far as I can see, the reasons to prefer return values to out parameters are that it's clearer, and it works with pure functional programming (you can get some nice guarantees if a function depends only on input parameters, returns a value, and has no side effects). The first reason is stylistic, and in my opinion not all that important. The second isn't a good fit with C++. Therefore, I wouldn't try to distort anything to avoid out parameters.
The simple fact is that some functions have to return multiple things, and in most languages this suggests out parameters. Common Lisp has multiple-value-bind and multiple-value-return, in which a list of symbols is provided by the bind and a list of values is returned. In some cases, a function can return a composite value, such as a list of values which will then get deconstructed, and it isn't a big deal for a C++ function to return a std::pair. Returning more than two values this way in C++ gets awkward. It's always possible to define a struct, but defining and creating it will often be messier than out parameters.
In some cases, the return value gets overloaded. In C, getchar() returns an int, with the idea being that there are more int values than char (true in all implementations I know of, false in some I can easily imagine), so one of the values can be used to denote end-of-file. atoi() returns an integer, either the integer represented by the string it's passed or zero if there is none, so it returns the same thing for "0" and "frog". (If you want to know whether there was an int value or not, use strtol(), which does have an out parameter.)
There's always the technique of throwing an exception in case of an error, but not all multiple return values are errors, and not all errors are exceptional.
So, overloaded return values cause problems, multiple value returns aren't easy to use in all languages, and single returns don't always exist. Throwing an exception is often inappropriate. Using out parameters is very often the cleanest solution.
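To make the atoi()/strtol() point concrete, here is a minimal sketch of wrapping strtol()'s out parameter so that the caller gets the value and a success flag back as a std::pair. The name parseLong is hypothetical and only for illustration; it is not part of any library.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>
#include <utility>

// Hypothetical helper: returns {value, ok}. ok is false when the text
// is not entirely a base-10 integer or the value is out of range.
std::pair<long, bool> parseLong(const std::string& text)
{
    char* end = nullptr; // strtol reports where parsing stopped through this out parameter
    errno = 0;
    const long value = std::strtol(text.c_str(), &end, 10);

    const bool nothingParsed = (end == text.c_str());
    const bool trailingJunk = (*end != '\0');
    const bool outOfRange = (errno == ERANGE);
    if (nothingParsed || trailingJunk || outOfRange)
    {
        return std::make_pair(0L, false);
    }
    return std::make_pair(value, true);
}
```

Unlike atoi(), this distinguishes "0" (value 0, ok) from "frog" (not ok). The out parameter has not disappeared; it has only been moved behind a two-value return, which is about as far as std::pair stretches comfortably.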
Ask yourself why you have some method that performs work on this expensive-to-copy object in the first place. Say you have a tree: would you send the tree off into some building method, or give the tree its own building method? Situations like this come up constantly when your design is a little bit off, but they tend to fold into themselves once you have it down pat.
I know in practicality we don't always get to change every object at all, but passing in out parameters is a side effect operation, and it makes it much harder to figure out what's going on, and you never really have to do it (except as forced by working within others' code frameworks).
Sometimes it is easier, but it's definitely not desirable to use it for no reason (if you've suffered through a few large projects where there's always half a dozen out parameters you'll know what I mean).
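As a rough illustration of the tree example, the sketch below gives the object its own building method instead of passing it to an outside function as an out parameter. The Tree class and its build method are hypothetical, and this only applies when you are free to change the class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical Tree: the object builds itself, so no caller has to
// hand it to a free function such as buildTree(Tree& tree_out, int depth).
class Tree
{
public:
    // Rebuilds the tree with one node per level (a stand-in for real work).
    void build(int depth)
    {
        nodes_.clear();
        for (int level = 0; level < depth; ++level)
        {
            nodes_.push_back(level);
        }
    }

    std::size_t size() const { return nodes_.size(); }

private:
    std::vector<int> nodes_;
};
```

Calling code then reads tree.build(3); and the mutation is visible at the call site as a method on the object being changed.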
I'm not sure why you're trying to avoid passing references here. These are pretty much the situations that pass-by-reference semantics exist for.
The code
looks perfectly fine to me.
If you really want to modify it then you've got a couple of options
Every useful compiler does RVO (return value optimization) if optimizations are enabled, thus the following effectively doesn't result in copying:
In some cases compilers can apply NRVO, named return value optimization, as well:
This however isn't exactly reliable, only works in more trivial cases and would have to be tested. If you're not up to testing every case, just use out-parameters with references in the second case.
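A minimal sketch of the two cases, assuming a hypothetical Tracked type that counts its own copies (none of these names come from the answer). Returning an unnamed temporary is the RVO case; since C++17 that elision is required by the language, while the named case is still left to the compiler.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type that counts how many times it has been copied.
struct Tracked
{
    static int copies;
    std::vector<int> data;

    Tracked() = default;
    explicit Tracked(std::size_t n) : data(n) {}
    Tracked(const Tracked& other) : data(other.data) { ++copies; }
    Tracked& operator=(const Tracked& other)
    {
        data = other.data;
        ++copies;
        return *this;
    }
};
int Tracked::copies = 0;

// RVO: the returned expression is an unnamed temporary.
Tracked makeUnnamed(std::size_t n)
{
    return Tracked(n);
}

// NRVO: a named local is built up and then returned by name.
Tracked makeNamed(std::size_t n)
{
    Tracked result(n);
    result.data.push_back(42); // extra work before returning
    return result;
}
```

Checking Tracked::copies after Tracked a = makeUnnamed(10); and Tracked b = makeNamed(10); on your own compiler and flags is a quick way to do the testing the answer recommends.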
IMO the first thing you should ask yourself is whether copying ExpensiveCopy really is so prohibitively expensive. And to answer that, you will usually need a profiler. Unless a profiler tells you that the copying really is a bottleneck, simply write the code that's easier to read: ExpensiveCopy obj = doWork(param);
Of course, there are indeed cases where objects cannot be copied for performance or other reasons. Then Neil's answer applies.
In addition to all the comments here, I'd mention that in C++0x (now C++11) you'd rarely use output parameters for optimization purposes, because of move constructors.
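A minimal sketch of what that looks like, assuming you control the class and can add move operations (which the question's ExpensiveCopy may not allow). Buffer and doWork here are hypothetical stand-ins, not the question's types.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical expensive-to-copy class with C++11 move operations added.
class Buffer
{
public:
    static int copies; // counts deep copies only

    Buffer() = default;
    explicit Buffer(std::size_t n) : data_(n, 0) {}

    Buffer(const Buffer& other) : data_(other.data_) { ++copies; }
    Buffer& operator=(const Buffer& other)
    {
        data_ = other.data_;
        ++copies;
        return *this;
    }

    // Moves steal the vector's storage instead of duplicating it.
    Buffer(Buffer&& other) noexcept : data_(std::move(other.data_)) {}
    Buffer& operator=(Buffer&& other) noexcept
    {
        data_ = std::move(other.data_);
        return *this;
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<int> data_;
};
int Buffer::copies = 0;

// Returns by value: the local is elided or moved, never deep-copied.
Buffer doWork(std::size_t n)
{
    Buffer result(n);
    return result;
}
```

With this in place, both Buffer b = doWork(1000); and b = doWork(1000); leave Buffer::copies at zero, because the returned temporary binds to the move operations rather than the copy operations.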
Unless you are going down the "everything is immutable" route, which doesn't sit too well with C++, you cannot easily avoid out parameters. The C++ Standard Library uses them, and what's good enough for it is good enough for me.
As to your first example: return value optimization will often allow the returned object to be created directly in-place, instead of having to copy the object around. All modern compilers do this.
What platform are you working on?
The reason I ask is that many people have suggested Return Value Optimization, which is a very handy compiler optimization present in almost every compiler. Additionally Microsoft and Intel implement what they call Named Return Value Optimization which is even more handy.
In standard Return Value Optimization your return statement is a call to an object's constructor, which tells the compiler to eliminate the temporary values (not necessarily the copy operation).
In Named Return Value Optimization you can return a value by its name and the compiler will do the same thing. The advantage to NRVO is that you can do more complex operations on the created value (like calling functions on it) before returning it.
While neither of these really eliminates an expensive copy if your returned data is very large, they do help.
In terms of avoiding the copy, the only real way to do that is with pointers or references, because your function needs to be modifying the data in the place you want it to end up in. That means you probably want to have a pass-by-reference parameter.
Also I figure I should point out that pass-by-reference is very common in high-performance code for specifically this reason. Copying data can be incredibly expensive, and it is often something people overlook when optimizing their code.
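One case where a reference parameter tends to pay off is a buffer that is refilled many times. The sketch below uses a hypothetical fillSquares function: because the caller owns the vector, its allocation is reused on every call instead of being created and destroyed each time.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical worker: writes results into a caller-owned buffer.
void fillSquares(std::vector<int>& out, int count)
{
    out.clear(); // drops the elements but keeps the allocated capacity
    for (int i = 0; i < count; ++i)
    {
        out.push_back(i * i);
    }
}
```

A caller can reserve once, then call fillSquares(buffer, n) in a loop. As long as n stays within the reserved capacity, no further allocation happens, which a return-by-value version cannot offer without extra machinery.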