Making swap faster, easier to use and exception-safe
I could not sleep last night and started thinking about std::swap. Here is the familiar C++98 version:
template <typename T>
void swap(T& a, T& b)
{
    T c(a);
    a = b;
    b = c;
}
If a user-defined class Foo uses external resources, this is inefficient. The common idiom is to provide a method void Foo::swap(Foo& other) and a specialization of std::swap<Foo>. Note that this does not work with class templates since you cannot partially specialize a function template, and overloading names in the std namespace is illegal. The solution is to write a template function in one's own namespace and rely on argument-dependent lookup to find it. This depends critically on the client following the "using std::swap idiom" instead of calling std::swap directly. Very brittle.
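A minimal sketch of the member-swap-plus-specialization idiom described above. The layout of Foo (a heap-allocated int buffer) and the helper demo_specialized_swap are assumptions made up for illustration; only the names Foo and Foo::swap come from the text. As far as I know, C++20 no longer permits specializing standard library function templates, so treat this as the historical C++98 idiom.

```cpp
#include <algorithm>  // std::swap lived here in C++98
#include <cassert>
#include <cstddef>
#include <utility>    // std::swap lives here since C++11

// Hypothetical Foo that owns a heap buffer (the "external resource").
class Foo {
public:
    explicit Foo(std::size_t n, int fill = 0) : size_(n), data_(new int[n]) {
        for (std::size_t i = 0; i < n; ++i) data_[i] = fill;
    }
    Foo(const Foo& other) : size_(other.size_), data_(new int[other.size_]) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }
    Foo& operator=(const Foo& other) {
        Foo tmp(other);  // copy-and-swap keeps assignment exception-safe
        swap(tmp);
        return *this;
    }
    ~Foo() { delete[] data_; }

    // Cheap member swap: exchanges the pointers, copies no elements.
    void swap(Foo& other) {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};

namespace std {
// Full specialization; only possible because Foo is not a class template.
template <>
void swap<Foo>(Foo& a, Foo& b) { a.swap(b); }
}

// Returns true if std::swap exchanged the buffers without reallocating.
bool demo_specialized_swap() {
    Foo a(3, 1), b(5, 2);
    const int* pa = a.data();
    const int* pb = b.data();
    std::swap(a, b);  // picks the specialization above
    return a.data() == pb && b.data() == pa && a.size() == 5 && b.size() == 3;
}
```

A quick check would be `assert(demo_specialized_swap());`. The brittleness mentioned above shows up as soon as Foo becomes a class template, because the specialization can then no longer be written.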
In C++0x, if Foo has a user-defined move constructor and a move assignment operator, providing a custom swap method and a std::swap<Foo> specialization has little to no performance benefit, because the C++0x version of std::swap uses efficient moves instead of copies:
#include <utility>
template <typename T>
void swap(T& a, T& b)
{
    T c(std::move(a));
    a = std::move(b);
    b = std::move(c);
}
Not having to fiddle with swap anymore already takes a lot of burden away from the programmer. Current compilers do not generate move constructors and move assignment operators automatically yet, but as far as I know, this will change. The only problem left then is exception-safety, because in general, move operations are allowed to throw, and this opens up a whole can of worms. The question "What exactly is the state of a moved-from object?" complicates things further.
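To make the move-based version concrete, here is a sketch of a Foo with hand-written move operations. The buffer layout and the helper demo_move_swap are hypothetical, and the class is kept move-only for brevity. Marking both operations noexcept is one way to sidestep the exception-safety worry, and the static_asserts verify that the compiler agrees.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical Foo owning a heap buffer; move-only to keep the sketch short.
class Foo {
public:
    explicit Foo(std::size_t n) : size_(n), data_(new int[n]()) {}
    ~Foo() { delete[] data_; }

    // Move constructor: steal the buffer and leave the source empty.
    Foo(Foo&& other) noexcept : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    // Move assignment: release our buffer, then steal the other one.
    Foo& operator=(Foo&& other) noexcept {
        if (this != &other) {
            delete[] data_;
            size_ = other.size_;
            data_ = other.data_;
            other.size_ = 0;
            other.data_ = nullptr;
        }
        return *this;
    }

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};

static_assert(std::is_nothrow_move_constructible<Foo>::value,
              "move construction must not throw");
static_assert(std::is_nothrow_move_assignable<Foo>::value,
              "move assignment must not throw");

// Returns true if the generic std::swap exchanged the buffers via moves.
bool demo_move_swap() {
    Foo a(3), b(5);
    const int* pa = a.data();
    const int* pb = b.data();
    std::swap(a, b);  // three moves, no element is copied
    return a.data() == pb && b.data() == pa && a.size() == 5 && b.size() == 3;
}
```

A quick check would be `assert(demo_move_swap());`. No swap member and no specialization are needed here; the generic std::swap is already cheap.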
Then I was thinking, what exactly are the semantics of std::swap in C++0x if everything goes fine? What is the state of the objects before and after the swap? Typically, swapping via move operations does not touch external resources, only the "flat" object representations themselves.
So why not simply write a swap template that does exactly that: swap the object representations?
#include <cstring>
template <typename T>
void swap(T& a, T& b)
{
    unsigned char c[sizeof(T)];
    std::memcpy( c, &a, sizeof(T));
    std::memcpy(&a, &b, sizeof(T));
    std::memcpy(&b,  c, sizeof(T));
}
This is as efficient as it gets: it simply blasts through raw memory. It does not require any intervention from the user: no special swap methods or move operations have to be defined. This means that it even works in C++98 (which does not have rvalue references, mind you). But even more importantly, we can now forget about the exception-safety issues, because memcpy never throws.
I can see two potential problems with this approach:
First, not all objects are meant to be swapped. If a class designer hides the copy constructor or the copy assignment operator, trying to swap objects of the class should fail at compile-time. We can simply introduce some dead code that checks whether copying and assignment are legal on the type:
#include <cstring>
template <typename T>
void swap(T& a, T& b)
{
    if (false) // dead code, never executed
    {
        T c(a); // copy-constructible?
        a = b;  // assignable?
    }
    unsigned char c[sizeof(T)];
    std::memcpy( c, &a, sizeof(T));
    std::memcpy(&a, &b, sizeof(T));
    std::memcpy(&b,  c, sizeof(T));
}
Any decent compiler can trivially get rid of the dead code. (There are probably better ways to check the "swap conformance", but that is not the point. What matters is that it's possible).
Second, some types might perform "unusual" actions in the copy constructor and copy assignment operator. For example, they might notify observers of their change. I deem this a minor issue, because such kinds of objects probably should not have provided copy operations in the first place.
Please let me know what you think of this approach to swapping. Would it work in practice? Would you use it? Can you identify library types where this would break? Do you see additional problems? Discuss!
Comments (5)
There are many ways in which an object, once constructed, can break when you copy the bytes it resides in. In fact, one could come up with a seemingly endless number of cases where this would not do the right thing, even though in practice it might work in 98% of all cases.
That's because the underlying problem is that, unlike in C, in C++ we must not treat objects as if they were mere raw bytes. That's why we have construction and destruction, after all: to turn raw storage into objects and objects back into raw storage. Once a constructor has run, the memory where the object resides is more than just raw storage. If you treat it as if it weren't, you will break some types.
However, moving objects shouldn't perform that much worse than your idea, because once you start to recursively inline the calls to std::move(), you usually arrive at the point where built-ins are moved. (And if there's more to moving for some types, you'd better not fiddle with the memory of those yourself!) Granted, moving memory en bloc is usually faster than single moves (and it's unlikely that a compiler will find out that it could optimize the individual moves into one all-encompassing std::memcpy()), but that's the price we pay for the abstraction opaque objects offer us. And it's quite small, especially when you compare it to the copying we used to do.
You could, however, have an optimized swap() using std::memcpy() for aggregate types.
This will break class instances that have pointers to their own members. For example:
Now, if you just do memcpy(), where would currentPos point? To the old location, obviously. This will lead to very funny bugs where each instance actually uses another's buffer.
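A sketch of the kind of class this answer has in mind. Only currentPos is named in the text; Reader, its buffer size, raw_swap and the demo function are hypothetical. The struct is deliberately kept trivially copyable so that the memcpy itself is well defined and the dangling cursor can be observed safely. A real class would re-seat the pointer in its copy operations, which would make it non-trivially copyable and the memcpy undefined.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical reader with an inline buffer and a cursor into that buffer.
struct Reader {
    char buffer[16];
    char* currentPos;  // invariant: points into this object's own buffer
};
static_assert(std::is_trivially_copyable<Reader>::value,
              "the demo relies on memcpy being well defined for Reader");

// The byte-wise swap from the question.
template <typename T>
void raw_swap(T& a, T& b) {
    unsigned char c[sizeof(T)];
    std::memcpy(c, &a, sizeof(T));
    std::memcpy(&a, &b, sizeof(T));
    std::memcpy(&b, c, sizeof(T));
}

// Returns true if, after the byte-wise swap, each cursor points into the
// OTHER object's buffer, i.e. the invariant is broken.
bool demo_self_pointer_breaks() {
    Reader a = {};
    Reader b = {};
    a.currentPos = a.buffer + 2;
    b.currentPos = b.buffer + 5;
    raw_swap(a, b);
    // The pointer values were copied verbatim along with the buffers.
    return a.currentPos == b.buffer + 5 && b.currentPos == a.buffer + 2;
}
```

A quick check would be `assert(demo_self_pointer_breaks());`. Once either object is destroyed, the other is left holding a dangling cursor.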
Some types can be swapped but cannot be copied. Unique smart pointers are probably the best example. Checking for copyability and assignability is wrong.
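As a small illustration of that point, std::unique_ptr fails both copy checks yet swaps without trouble. The alias IntPtr and the demo function are made up for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

typedef std::unique_ptr<int> IntPtr;

// Not copyable, so the dead-code check "T c(a); a = b;" would reject it ...
static_assert(!std::is_copy_constructible<IntPtr>::value, "unique_ptr is move-only");
static_assert(!std::is_copy_assignable<IntPtr>::value, "unique_ptr is move-only");
// ... but movable, which is all the move-based std::swap needs.
static_assert(std::is_move_constructible<IntPtr>::value, "unique_ptr is movable");
static_assert(std::is_move_assignable<IntPtr>::value, "unique_ptr is movable");

// Returns true if the two pointers exchanged their pointees.
bool demo_swap_move_only() {
    IntPtr a(new int(1));
    IntPtr b(new int(2));
    using std::swap;
    swap(a, b);
    return *a == 2 && *b == 1;
}
```

A quick check would be `assert(demo_swap_move_only());`.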
If T isn't a POD type, using memcpy to copy/move is undefined behavior.
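One way to keep the byte-wise swap while staying within defined behavior is to restrict it at compile time. The sketch below uses the C++11 trait std::is_trivially_copyable, which is the category for which the standard permits copying via memcpy; the names raw_swap and Point are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Byte-wise swap that refuses to compile for types where memcpy is not allowed.
template <typename T>
void raw_swap(T& a, T& b) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw_swap requires a trivially copyable type");
    if (&a == &b) return;  // memcpy must not be given overlapping ranges
    unsigned char tmp[sizeof(T)];
    std::memcpy(tmp, &a, sizeof(T));
    std::memcpy(&a, &b, sizeof(T));
    std::memcpy(&b, tmp, sizeof(T));
}

struct Point {
    int x;
    int y;
};

// Returns true if the two points exchanged their values.
bool demo_raw_swap() {
    Point p = {1, 2};
    Point q = {3, 4};
    raw_swap(p, q);
    raw_swap(p, p);  // self-swap is a no-op
    return p.x == 3 && p.y == 4 && q.x == 1 && q.y == 2;
}
```

A quick check would be `assert(demo_raw_swap());`. Calling raw_swap on something like std::string would fail to compile instead of silently corrupting the objects.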
A better idiom is a non-member swap and requiring users to call swap unqualified, so ADL applies. This also works with templates:
The key is the using declaration for std::swap as a fallback. The friendship for Template's swap is nice for simplifying the definition; the swap for NonTemplate might also be a friend, but that's an implementation detail.
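A sketch of how that idiom might look. The class names Template and NonTemplate come from the answer; their contents, the namespace lib, the call counter swap_calls and the generic helper exchange_values are assumptions added so the effect of ADL can be observed.

```cpp
#include <cassert>
#include <utility>

namespace lib {

// Hypothetical counter, used only to prove that the custom swaps were chosen.
inline int& swap_calls() {
    static int n = 0;
    return n;
}

template <typename T>
class Template {
public:
    explicit Template(T v) : value_(v) {}
    const T& value() const { return value_; }

    // Friend defined in the class: a non-member that is found only via ADL.
    friend void swap(Template& a, Template& b) {
        ++swap_calls();
        using std::swap;  // fallback for the member type
        swap(a.value_, b.value_);
    }

private:
    T value_;
};

class NonTemplate {
public:
    explicit NonTemplate(int v) : value_(v) {}
    int value() const { return value_; }
    void swap(NonTemplate& other) { std::swap(value_, other.value_); }

private:
    int value_;
};

// Non-member swap in the class's own namespace, forwarding to the member.
inline void swap(NonTemplate& a, NonTemplate& b) {
    ++swap_calls();
    a.swap(b);
}

}  // namespace lib

// Generic client code: unqualified call, with std::swap as the fallback.
template <typename T>
void exchange_values(T& a, T& b) {
    using std::swap;
    swap(a, b);
}

// Returns true if both custom swaps were found and int fell back to std::swap.
bool demo_adl_swap() {
    lib::swap_calls() = 0;
    lib::Template<int> t1(1), t2(2);
    lib::NonTemplate n1(3), n2(4);
    int x = 5, y = 6;
    exchange_values(t1, t2);  // friend swap, found by ADL
    exchange_values(n1, n2);  // lib::swap, found by ADL
    exchange_values(x, y);    // no custom swap: std::swap is used
    return t1.value() == 2 && t2.value() == 1 &&
           n1.value() == 4 && n2.value() == 3 &&
           x == 6 && y == 5 && lib::swap_calls() == 2;
}
```

A quick check would be `assert(demo_adl_swap());`. A client that writes std::swap(t1, t2) directly would bypass both custom functions, which is the brittleness the question complains about.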
"I deem this a minor issue, because such kinds of objects probably should not have provided copy operations in the first place."
That is, quite simply, a load of wrong. Classes that notify observers and classes that shouldn't be copied are completely unrelated. How about shared_ptr? It obviously should be copyable, but it also obviously notifies an observer: the reference count. Now it's true that in this case the reference count is the same after the swap, but that's definitely not true for all types, it's especially not true if multi-threading is involved, and it's not true in the case of a regular copy instead of a swap. This is especially wrong for classes that can be moved or swapped but not copied.
"In general, move operations are allowed to throw."
They are most assuredly not. It is virtually impossible to guarantee strong exception safety in pretty much any circumstance involving moves when the move might throw. The C++0x definition of the Standard library, from memory, explicitly states that any type usable in any Standard container must not throw when moving.
"Typically, swapping via move operations does not touch external resources, only the 'flat' object representations themselves."
That is also wrong. You're assuming that the move of any object is purely a move of its member variables, but it might not be all of them. I might have an implementation-based cache and I might decide that within my class, I should not move this cache. As an implementation detail, it is entirely within my rights not to move any member variables that I deem unnecessary to move. You, however, want to move all of them.
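To illustrate the cache argument, here is a sketch of a class whose move operations deliberately leave a cached value behind. Document, its members and the demo function are all hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class that caches a value derived from its text.
class Document {
public:
    explicit Document(std::string text)
        : text_(std::move(text)), cached_length_(-1) {}

    // Move constructor: takes the text, but starts with an empty cache.
    Document(Document&& other) noexcept
        : text_(std::move(other.text_)), cached_length_(-1) {
        other.cached_length_ = -1;
    }

    // Move assignment: same policy, the cache is never transferred.
    Document& operator=(Document&& other) noexcept {
        text_ = std::move(other.text_);
        cached_length_ = -1;
        other.cached_length_ = -1;
        return *this;
    }

    // Computes the length lazily and remembers it.
    long length() {
        if (cached_length_ < 0) cached_length_ = static_cast<long>(text_.size());
        return cached_length_;
    }
    bool cache_valid() const { return cached_length_ >= 0; }

private:
    std::string text_;
    long cached_length_;  // -1 means "not computed yet"
};

// Returns true if std::swap exchanged the texts and dropped both caches.
bool demo_cache_not_moved() {
    Document a("hello"), b("hi");
    a.length();
    b.length();       // both caches are now filled
    std::swap(a, b);  // uses the move operations above
    bool caches_dropped = !a.cache_valid() && !b.cache_valid();
    return caches_dropped && a.length() == 2 && b.length() == 5;
}
```

A quick check would be `assert(demo_cache_not_moved());`. A byte-wise swap would carry the cached values along, overriding a decision that belongs to the class author.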
Now, it's true that your sample code should be valid for a lot of classes. However, it's extremely very definitely not valid for many classes that are completely and totally legitimate, and more importantly, it's going to compile down to that operation anyway if the operation can be reduced to that. This is breaking perfectly good classes for absolutely no benefit.
Your swap version will cause havoc if someone uses it with polymorphic types. Consider:
This is wrong, because now b contains the vtable pointer of the Derived type, so Derived::vfunc is invoked on an object which isn't of type Derived.
The normal std::swap only swaps the data members of Base, so this is OK with std::swap.
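A sketch of the scenario, assuming a Base and Derived shaped roughly as the answer implies; the members value and extra and the demo function are hypothetical. Only the well-defined std::swap path is executed here. The byte-wise version would copy sizeof(Base) bytes, which on typical implementations include the vtable pointer, and it is undefined behavior for polymorphic types anyway.

```cpp
#include <cassert>
#include <utility>

struct Base {
    int value;
    explicit Base(int v) : value(v) {}
    virtual ~Base() {}
    virtual int vfunc() const { return 1; }  // 1 identifies Base
};

struct Derived : Base {
    int extra;
    Derived(int v, int e) : Base(v), extra(e) {}
    int vfunc() const override { return 2; }  // 2 identifies Derived
};

// Returns true if std::swap exchanged only the Base data members and left
// the dynamic types of both objects untouched.
bool demo_swap_through_base() {
    Base b(10);
    Derived d(20, 99);
    Base& ref = d;      // refers to the Base subobject of d
    std::swap(b, ref);  // T is deduced as Base: copies and assigns Base parts
    return b.value == 20 && d.value == 10 && d.extra == 99 &&
           b.vfunc() == 1 && ref.vfunc() == 2;
}
```

A quick check would be `assert(demo_swap_through_base());`. After the call, b is still a Base and d is still a Derived; only the value members changed places.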