Fastest way to do a shallow copy in C#
What is the fastest way to do a shallow copy in C#? I only know of two ways:
- MemberwiseClone
- Copying each field one by one (manually)
I found that (2) is faster than (1). Is there another way to do a shallow copy?
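For reference, the two approaches in the question could look roughly like this. It is only a minimal sketch: the `Person` type and its members are hypothetical and exist purely for illustration.

```csharp
using System;

// Hypothetical type, used only to illustrate the two approaches.
public sealed class Person
{
    public string Name;
    public int Age;
    public int[] Scores; // reference-type field: a shallow copy shares this array

    // (1) Let the runtime copy all fields.
    public Person CloneWithMemberwise() => (Person)MemberwiseClone();

    // (2) Copy each field by hand. Note that this runs a constructor.
    public Person CloneManually() =>
        new Person { Name = Name, Age = Age, Scores = Scores };
}
```

Both methods return a new object whose `Scores` field refers to the same array as the original, which is what makes the copy shallow.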
This is a complex subject with lots of possible solutions and many pros and cons to each. There is a wonderful article here that outlines several different ways of making a copy in C#. To summarize:
- Clone manually: tedious, but gives a high level of control.
- Clone with MemberwiseClone: only creates a shallow copy, i.e. for reference-type fields the original object and its clone refer to the same object.
- Clone with reflection: shallow copy by default, can be rewritten to do a deep copy. Advantage: automated. Disadvantage: reflection is slow.
- Clone with serialization: easy and automated, but you give up some control, and serialization is the slowest of all.
- Clone with IL, clone with extension methods: more advanced solutions, not as common.
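To make the reflection option concrete, here is a minimal sketch of a reflection-based shallow copy. It is not taken from the article mentioned above. `ReflectionCloner` and `Node` are hypothetical names, and the sketch assumes the type has a parameterless constructor (public or not).

```csharp
using System;
using System.Reflection;

// Hypothetical sample type for illustration.
public class Node
{
    public int Value;
    public Node Next;
}

public static class ReflectionCloner
{
    // Copies every instance field of the runtime type, including fields
    // declared on base types. Assumes a parameterless constructor exists.
    public static T ShallowCopy<T>(T source) where T : class
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        Type type = source.GetType();
        object copy = Activator.CreateInstance(type, nonPublic: true);

        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public |
                                   BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

        // Walk the inheritance chain so private base-class fields are copied too.
        for (Type t = type; t != null; t = t.BaseType)
            foreach (FieldInfo field in t.GetFields(flags))
                field.SetValue(copy, field.GetValue(source));

        return (T)copy;
    }
}
```

Every call goes through `FieldInfo.GetValue` and `SetValue`, which is where the "reflection is slow" cost in the summary comes from.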
I'd like to start with a few quotes:
and
Theoretically the best implementation of a shallow copy is a C++ copy constructor: it knows the size at compile time, and then does a memberwise clone of all fields. The next best thing is using memcpy or something similar, which is basically how MemberwiseClone should work. This means that, in theory, it should obliterate all other possibilities in terms of performance. Right?
... but apparently it isn't blazing fast and it doesn't obliterate all the other solutions. At the bottom I've actually posted a solution that's over 2x faster. So: wrong.
Testing the internals of MemberwiseClone
Let's start with a little test using a simple blittable type to check the underlying assumptions here about performance:
The test is devised in such a way that we can check the performance of MemberwiseClone against raw memcpy, which is possible because this is a blittable type.
To test it yourself, compile with unsafe code, disable the JIT suppression, compile in release mode and test away. I've also put the timings after every relevant line.
Implementation 1:
Basically I ran these tests a number of times, checked the assembly output to ensure that the thing wasn't optimized away, etc. The end result is that I know approximately how many seconds this one line of code costs, which is 0.40s on my PC. This is our baseline using MemberwiseClone.
Implementation 2:
If you look closely at these numbers, you'll notice a few things:
So why is all of this so slow?
My explanation is that it has to do with the GC. Basically the implementations cannot rely on memory staying the same before and after a full GC (the address of the memory can change during a GC, which can happen at any moment, including during your shallow copy). This means you only have 2 possible options:
GCHandle.Alloc is just one of the ways to do this; it's well known that things like C++/CLI will give you better performance.
MemberwiseClone will use method 1, which means you'll get a performance hit because of the pinning procedure.
A (much) faster implementation
In all cases our unmanaged code cannot make assumptions about the size of the types and it has to pin data. Making assumptions about size enables the compiler to do better optimizations, like loop unrolling, register allocation, etc. (just like a C++ copy ctor is faster than memcpy). Not having to pin data means we don't get an extra performance hit. Since .NET JIT-compiles to assembler, in theory this means that we should be able to make a faster implementation using simple IL emitting, and allowing the compiler to optimize it.
So, to summarize, why can this be faster than the native implementation?
What we're aiming for is the performance of raw memcpy or better: 0.17s.
To do that, we basically cannot use more than just a call, create the object, and perform a bunch of copy instructions. It looks a bit like the Cloner implementation above, but with some important differences (most significant: no Dictionary and no redundant CreateDelegate calls). Here goes:
I've tested this code with the result: 0.16s. This means it's approximately 2.5x faster than MemberwiseClone.
More importantly, this speed is on par with memcpy, which is more or less the 'optimal solution under normal circumstances'.
Personally, I think this is the fastest solution, and the best part is: if the .NET runtime gets faster (proper support for SSE instructions etc.), so will this solution.
Editorial Note:
The sample code above assumes that the default constructor is public. If it is not, the call to GetConstructor returns null. In that case, use one of the other GetConstructor signatures to obtain protected or private constructors.
See https://learn.microsoft.com/en-us/dotnet/api/system.type.getconstructor?view=netframework-4.8
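The following is a minimal sketch of the kind of IL-emitting cloner this answer describes. It is not the author's original code, and the timings quoted above do not apply to it. It assumes a parameterless constructor and, following the editorial note, looks it up with `BindingFlags.NonPublic` so that private and protected constructors are found as well. `IlCloner<T>` and `Sample` are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical sample type for illustration.
public sealed class Sample
{
    public int Id;
    public string Label;
}

public static class IlCloner<T> where T : class
{
    // Built once per closed generic type: no Dictionary lookup and no
    // repeated CreateDelegate call on the copy path.
    private static readonly Func<T, T> Copier = Build();

    public static T ShallowCopy(T source) => Copier(source);

    private static Func<T, T> Build()
    {
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        // This overload also finds non-public parameterless constructors.
        ConstructorInfo ctor =
            typeof(T).GetConstructor(flags, null, Type.EmptyTypes, null)
            ?? throw new InvalidOperationException(
                typeof(T) + " has no parameterless constructor.");

        var method = new DynamicMethod(
            "ShallowCopy", typeof(T), new[] { typeof(T) }, typeof(T), true);
        ILGenerator il = method.GetILGenerator();

        il.Emit(OpCodes.Newobj, ctor);           // stack: copy
        for (Type t = typeof(T); t != null; t = t.BaseType)
        {
            foreach (FieldInfo field in t.GetFields(flags | BindingFlags.DeclaredOnly))
            {
                il.Emit(OpCodes.Dup);            // stack: copy, copy
                il.Emit(OpCodes.Ldarg_0);        // stack: copy, copy, source
                il.Emit(OpCodes.Ldfld, field);   // stack: copy, copy, value
                il.Emit(OpCodes.Stfld, field);   // stack: copy
            }
        }
        il.Emit(OpCodes.Ret);                    // return copy

        return (Func<T, T>)method.CreateDelegate(typeof(Func<T, T>));
    }
}
```

Two caveats of this sketch: it copies the fields of the static type `T`, so passing an instance of a derived type yields a plain `T`, and it runs the constructor, which `MemberwiseClone` does not.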
I'm confused. MemberwiseClone() should annihilate the performance of anything else for a shallow copy. In the CLI, any type other than an RCW should be able to be shallow-copied by the following sequence:
memcpy the data from the original to the new. Since the target is in the nursery, no write barriers are required.
SuppressFinalize called on it and such a flag is stored in the object header, unset it in the clone.
Can someone on the CLR internals team explain why this is not the case?
Why complicate things? MemberwiseClone would suffice.
This is a way to do it using dynamic IL generation. I found it somewhere online:
In fact, MemberwiseClone is usually much better than the alternatives, especially for complex types.
The reason is that if you create a copy manually, it must call one of the type's constructors, whereas MemberwiseClone, I guess, just copies a block of memory. For types with very expensive construction, MemberwiseClone is absolutely the best way.
I once wrote such a type:
{string A = Guid.NewGuid().ToString()}, and I found that MemberwiseClone is much faster than creating a new instance and manually assigning members.
The result of the code below:
Manual Copy:00:00:00.0017099
MemberwiseClone:00:00:00.0009911
Finally, here is my code:
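A comparison along these lines could be sketched as follows. This is not the poster's original code, and the figures above were not produced by it. `Expensive` and `CopyBenchmark` are hypothetical names, and the field initializer stands in for the "expensive construction" described in the answer.

```csharp
using System;
using System.Diagnostics;

// Hypothetical type whose construction is expensive: the field
// initializer runs on every `new Expensive()`.
public sealed class Expensive
{
    public string A = Guid.NewGuid().ToString();
    public int B;

    // Manual copy: pays for the initializer, then overwrites A.
    public Expensive ManualCopy() => new Expensive { A = A, B = B };

    // MemberwiseClone: no constructor or initializer runs.
    public Expensive MemberwiseCopy() => (Expensive)MemberwiseClone();
}

public static class CopyBenchmark
{
    public static TimeSpan Time(Func<Expensive> copy, int iterations)
    {
        copy(); // warm-up call so JIT compilation is not measured
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) copy();
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Run()
    {
        var source = new Expensive { B = 42 };
        Console.WriteLine("Manual Copy:" + Time(source.ManualCopy, 100000));
        Console.WriteLine("MemberwiseClone:" + Time(source.MemberwiseCopy, 100000));
    }
}
```

Calling `CopyBenchmark.Run()` in a release build prints the two elapsed times; the actual numbers depend on the machine and runtime.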
Here is a small helper class that uses reflection to access MemberwiseClone and then caches the delegate to avoid using reflection more than necessary.
You can call it like this:
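A minimal sketch of such a helper, under the assumption that a single cached open delegate over the protected `Object.MemberwiseClone` is what is meant. `ShallowCloner` and `Item` are hypothetical names, not the poster's original code.

```csharp
using System;
using System.Reflection;

// Hypothetical sample type for illustration.
public sealed class Item
{
    public int Count;
    public string Name;
}

public static class ShallowCloner
{
    // Reflection is used once, here; afterwards every copy goes through
    // the cached delegate. The delegate is "open": its first argument
    // becomes the `this` of Object.MemberwiseClone.
    private static readonly Func<object, object> Clone =
        (Func<object, object>)Delegate.CreateDelegate(
            typeof(Func<object, object>),
            typeof(object).GetMethod(
                "MemberwiseClone",
                BindingFlags.Instance | BindingFlags.NonPublic));

    public static T ShallowCopy<T>(T source) where T : class
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return (T)Clone(source);
    }
}
```

With this sketch, a call looks like `Item copy = ShallowCloner.ShallowCopy(item);`, and it works for any class without requiring the type itself to expose a clone method.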
MemberwiseClone requires less maintenance. I don't know if having default property values helps any; maybe it could ignore items with default values.