In what cases should I use memcpy over standard operators in C++?
When can I get better performance using memcpy, or how do I benefit from using it?
For example:
float a[3]; float b[3];
Is this code:
memcpy(a, b, 3*sizeof(float));
faster than this one?
a[0] = b[0];
a[1] = b[1];
a[2] = b[2];
Comments (7)
Efficiency should not be your concern.
Write clean, maintainable code.
It bothers me that so many answers indicate that memcpy() is inefficient. It is designed to be the most efficient way of copying blocks of memory (for C programs).
So I wrote the following as a test:
Then, to compare the code produced:
This resulted in: (comments added by hand)
Added timing results for running the above inside a loop of 1000000000 iterations.
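The test code and the generated assembly did not survive in this copy of the answer. As a stand-in, here is a minimal sketch of the kind of comparison described; it is not the author's original test. The two function names, `copy_with_memcpy` and `copy_with_assignment`, are assumptions made for illustration. Compiling them with optimization enabled and inspecting the assembly is one way to compare the generated code.

```cpp
#include <cstring>

// Hypothetical helper: copies three floats with memcpy.
void copy_with_memcpy(float* dst, const float* src) {
    std::memcpy(dst, src, 3 * sizeof(float));
}

// Hypothetical helper: copies three floats element by element.
void copy_with_assignment(float* dst, const float* src) {
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = src[2];
}
```

Both functions leave the destination with the same contents, so any difference is purely in the generated instructions, which depends on the compiler and flags used.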
You can use memcpy only if the objects you're copying have no explicit constructors, and neither do their members (so-called POD, "Plain Old Data"). So it is OK to call memcpy for float, but it is wrong for, e.g., std::string.
But part of the work has already been done for you: std::copy from <algorithm> is specialized for built-in types (and possibly for every other POD type, depending on the STL implementation). So writing std::copy(b, b + 3, a) is as fast (after compiler optimization) as memcpy, but is less error-prone.
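A minimal sketch of both points, assuming C++11 or later. The standard trait `std::is_trivially_copyable` is the modern counterpart of the POD rule described above and can guard a `memcpy` at compile time, while `std::copy` works for any copy-assignable type. The helper names `copy_bytes` and `copy_examples` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical helper: refuses to compile for types that are not safe to memcpy.
template <typename T>
void copy_bytes(T* dst, const T* src, std::size_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable type");
    std::memcpy(dst, src, count * sizeof(T));
}

// std::copy is valid for both kinds of type; it uses the copy assignment
// operator where one is needed.
inline void copy_examples() {
    float fsrc[3] = {1.0f, 2.0f, 3.0f};
    float fdst[3];
    copy_bytes(fdst, fsrc, 3);        // fine: float is trivially copyable
    std::copy(fsrc, fsrc + 3, fdst);  // same result

    std::string ssrc[2] = {"a", "b"};
    std::string sdst[2];
    std::copy(ssrc, ssrc + 2, sdst);  // fine: uses std::string assignment
    // copy_bytes(sdst, ssrc, 2);     // would fail the static_assert
}
```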
Compilers specifically optimize memcpy calls; at least clang and gcc do. So you should prefer it wherever you can.
Use std::copy(). As the header file for g++ notes:
Probably, Visual Studio's is not much different. Go with the normal way, and optimize once you're aware of a bottleneck. In the case of a simple copy, the compiler is probably already optimizing for you.
Don't go for premature micro-optimisations such as using memcpy like this. Using assignment is clearer and less error-prone and any decent compiler will generate suitably efficient code. If, and only if, you have profiled the code and found the assignments to be a significant bottleneck then you can consider some kind of micro-optimisation, but in general you should always write clear, robust code in the first instance.
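One way to keep plain assignment while still copying the whole array in a single statement is `std::array`, which supports copy assignment. This is only a sketch, under the assumption that the arrays can be declared as `std::array` (C++11); the alias `Vec3` and the function name `copy_vec3` are made up for illustration.

```cpp
#include <array>

using Vec3 = std::array<float, 3>;

// Hypothetical helper: whole-array copy with ordinary assignment.
inline void copy_vec3(Vec3& dst, const Vec3& src) {
    dst = src;  // copies all three elements
}
```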
The benefits of memcpy? Probably readability. Otherwise, you would have to either write a number of assignments or use a for loop for copying, neither of which is as simple and clear as just calling memcpy (of course, as long as your types are simple and don't require construction/destruction).
Also, memcpy is generally well optimized for specific platforms, to the point that it won't be all that much slower than simple assignment, and may even be faster.
Supposedly, as Nawaz said, the assignment version should be faster on most platforms. That's because memcpy() will copy byte by byte, while the second version could copy 4 bytes at a time.
As always, you should profile your application to be sure that what you expect to be the bottleneck matches reality.
Edit
The same applies to dynamic arrays. Since you mention C++, you should use the std::copy() algorithm in that case.
Edit
This is the code output for Windows XP with GCC 4.5.0, compiled with the -O3 flag:
I wrote this as a function because the OP specified dynamic arrays too.
The output assembly is the following:
Of course, I assume all of the experts here know what rep movsb means.
This is the assignment version:
which yields the following code:
which moves 4 bytes at a time.
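For the dynamic-array case mentioned above, here is a minimal sketch using `std::copy`. The answer's own code is not shown in this copy, so the function names `copy_dynamic` and `duplicate`, and the use of `std::vector` as the dynamic array, are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies `count` floats from src into dst.
// The caller must ensure both buffers hold at least `count` elements.
inline void copy_dynamic(float* dst, const float* src, std::size_t count) {
    std::copy(src, src + count, dst);
}

// Hypothetical usage: duplicate the contents of a vector into a new one.
inline std::vector<float> duplicate(const std::vector<float>& src) {
    std::vector<float> dst(src.size());
    copy_dynamic(dst.data(), src.data(), src.size());
    return dst;
}
```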