C++ memcpy function: why use *s++ instead of s[i]?
I am writing memcpy from scratch and have been looking up other people's implementations. My implementation is:
void* memcpy (void *destination, const void *source, size_t num)
{
    char *D = (char*)destination;
    char *S = (char*)source;
    for(int i = 0; i < num; i++)
        D[i] = S[i];
    return D;
}
Various other sources and references that I have researched have:
void* memcpy (void *destination, const void *source, size_t num)
{
    char *D = (char*)destination;
    char *S = (char*)source;
    for(int i = 0; i < num; i++)
    {
        *D = *S;
        D++;
        S++;
    }
    return D;
}
I am having trouble understanding the difference and whether they would produce different outputs. The portion that confuses me specifically is the D++; and S++;
At the time this code was written, it was probably faster to avoid the addition of the index to the pointers.
In my own testing on the x86 architecture, which has an indexing mode built into the low-level instructions, the indexed method is slightly faster.
The algorithms are equivalent. The second version uses pointer arithmetic to advance the pointers to the next position instead of indexing the arrays with the a[i] syntax.
This works because a[i] is actually shorthand for *(a+i) (read: advance i positions past a and read the value at that location). Instead of applying the total offset (+i) at each iteration, it applies a partial offset (++a) at each iteration and accumulates the result.
What seems to be confusing you is the pointer arithmetic: D++, S++. The pointers are being incremented to refer to the next char (as they are char*).
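As a minimal sketch of the equivalence described above, the two loops below write the same bytes. The names copy_indexed and copy_incremented are hypothetical, chosen only for illustration; neither is meant as a replacement for the library memcpy.

```cpp
#include <cstddef>

// Indexed form: the offset i is added to the unchanged base pointers on
// every iteration, since d[i] means *(d + i).
void copy_indexed(char* d, const char* s, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
}

// Increment form: the pointers themselves move forward one char at a time,
// so the offset accumulates in d and s instead of in an index.
void copy_incremented(char* d, const char* s, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        *d = *s;
        ++d;
        ++s;
    }
}
```

Because d and s are passed by value, the caller's pointers are unchanged by either function; only the local copies move.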
While both mean the same thing semantically, the *s++ version avoids having to offset the initial pointer value as you increment through the bytes of the array you're copying. In other words, the "underlying" address computation for s[i] is the base address plus i*sizeof(type) bytes, and multiplications, especially by large values of i, can be much slower than simple increments by small values, depending on the machine architecture. (For char, sizeof(type) is 1, so no scaling is needed in this particular loop.)
In the end, though, the libc implementation of memcpy will typically be much faster than anything you could hand-write in C, because it uses machine-dependent, optimized memory-copying assembly instructions that you can't deliberately access through normal C code.
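To make the address arithmetic concrete, here is a small sketch. byte_offset is a hypothetical helper, not a standard function. It measures how many bytes lie between base and the element base[i], which comes out as i*sizeof(T), because pointer arithmetic in C and C++ applies that scaling automatically.

```cpp
#include <cstddef>

// Hypothetical helper: distance in bytes from base to base[i].
// The caller must keep i within the array (or one past its end).
template <typename T>
std::ptrdiff_t byte_offset(const T* base, std::size_t i) {
    const char* first = reinterpret_cast<const char*>(base);
    const char* elem = reinterpret_cast<const char*>(base + i);
    return elem - first;
}
```

For an int array the result depends on sizeof(int) on the target; for a char array it always equals i.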
Here are the inner loops, as compiled by GCC. I just added restrict keywords, removed the return value, and compiled as 32-bit for Core 2:
First one, array version:
Second one, increment version:
The compiler, as you can see, saw right through both constructs.
Though it makes little difference either way, I'd be a bit surprised to see either of those. I'd expect something closer to:
Most decent compilers will produce about the same code regardless. I haven't checked recently, but at one time most compilers targeting x86 would turn most of the loop into a single rep movsd instruction. They might not anymore, though; that's no longer optimal.
Modern compilers will optimize these to the same code. This is called strength reduction. (Except for the different return values.)
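As a rough illustration of what strength reduction means, the sketch below applies the transformation by hand. The function names are hypothetical, and a real compiler performs this on its internal representation rather than on source code.

```cpp
#include <cstddef>

// Before: each iteration recomputes the element position with a multiplication.
long sum_every_kth_multiply(const int* a, std::size_t count, std::size_t k) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += a[i * k];
    }
    return total;
}

// After: the multiplication is replaced by a running offset that grows by k
// on each iteration, so only an addition is needed per step.
long sum_every_kth_add(const int* a, std::size_t count, std::size_t k) {
    long total = 0;
    std::size_t offset = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += a[offset];
        offset += k;
    }
    return total;
}
```

Both functions read the same elements in the same order; only the way the position is computed differs.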
D++ and S++ increment the pointers.
Keep in mind that D[i] is equivalent to *(D + i).
Thus one version increments the pointer, while the other keeps the base and adds an offset.
Modern compilers will probably compile both to the same code.
NB: I assume return D; in the second example is a copy-paste error. It should be return destination;, because D has been incremented and points to the memory just "after" the destination bytes.
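The return-value point can be checked directly. In this sketch, copy_returning_advanced mirrors the second version from the question and copy_returning_start applies the suggested fix; both names are hypothetical.

```cpp
#include <cstddef>

// Mirrors the question's second version: D has been advanced num times,
// so the returned pointer refers to the byte just past the copied region.
void* copy_returning_advanced(void* destination, const void* source, std::size_t num) {
    char* D = static_cast<char*>(destination);
    const char* S = static_cast<const char*>(source);
    for (std::size_t i = 0; i < num; ++i) {
        *D = *S;
        ++D;
        ++S;
    }
    return D;
}

// Suggested fix: return the original destination pointer, which is what
// the standard memcpy returns.
void* copy_returning_start(void* destination, const void* source, std::size_t num) {
    char* D = static_cast<char*>(destination);
    const char* S = static_cast<const char*>(source);
    for (std::size_t i = 0; i < num; ++i) {
        *D = *S;
        ++D;
        ++S;
    }
    return destination;
}
```

Both functions copy the same bytes; they differ only in the pointer handed back to the caller.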