Function c strcpy

strcpy() return value

Posted 2024-09-16 00:48:15

A lot of the functions from the standard C library, especially the ones for string manipulation, and most notably strcpy(), share the following prototype:

char *the_function (char *destination, ...)

The return value of these functions is in fact the same as the provided destination. Why would you waste the return value for something redundant? It makes more sense for such a function to be void or return something useful.

My only guess as to why this is is that it's easier and more convenient to nest the function call in another expression, for example:

printf("%s\n", strcpy(dst, src));

Are there any other sensible reasons to justify this idiom?

窝囊感情。 2024-09-23 00:48:15

As Evan pointed out, it is possible to do something like

char* s = strcpy(malloc(10), "test");

i.e. assign a value to malloc()ed memory without using a helper variable.

(this example isn't the best one, it will crash on out of memory conditions, but the idea is obvious)
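A slightly more defensive sketch of the same idea is shown below. It checks the allocation before copying, so it does not crash when memory runs out. The helper name dup_string is hypothetical and not part of the standard library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate a buffer and copy src into it.
   Returns NULL if malloc fails; the caller must free() the result. */
char *dup_string(const char *src)
{
    assert(src != NULL);
    char *buf = malloc(strlen(src) + 1);   /* +1 for the terminating '\0' */
    /* strcpy returns its first argument, so once buf is known to be valid
       the copy and the return can still be a single expression. */
    return buf ? strcpy(buf, src) : NULL;
}
```

This needs one helper variable, but it keeps the nested-call convenience for the copy itself.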

终止放荡 2024-09-23 00:48:15

char *stpcpy(char *dest, const char *src); returns a pointer to the end of the string, and is part of POSIX.1-2008. Before that, it was a GNU libc extension since 1992. It first appeared in Lattice C AmigaDOS in 1986.

gcc -O3 will in some cases optimize strcpy + strcat to use stpcpy or strlen + inline copying, see below.

C's standard library was designed very early, and it's very easy to argue that the str* functions are not optimally designed. The I/O functions were definitely designed very early, in 1972 before C even had a preprocessor, which is why fopen(3) takes a mode string instead of a flag bitmap like Unix open(2).

I haven't been able to find a list of functions included in Mike Lesk's "portable I/O package", so I don't know whether strcpy in its current form dates all the way back to there or if those functions were added later. (The only real source I've found is Dennis Ritchie's widely-known C History article, which is excellent but not that in depth. I didn't find any documentation or source code for the actual I/O package itself.)

They do appear in their current form in K&R first edition, 1978.

Functions should return the result of computation they do, if it's potentially useful to the caller, instead of throwing it away. Either as a pointer to the end of the string, or an integer length. (A pointer would be natural.)

As @R says:

We all wish these functions returned a pointer to the terminating null byte (which would reduce a lot of O(n) operations to O(1))

e.g. calling strcat(bigstr, newstr[i]) in a loop to build up a long string from many short (O(1) length) strings has approximately O(n^2) complexity, but strlen/memcpy will only look at each character twice (once in strlen, once in memcpy).

Using only the ANSI C standard library, there's no way to efficiently only look at every character once. You could manually write a byte-at-a-time loop, but for strings longer than a few bytes, that's worse than looking at each character twice with current compilers (which won't auto-vectorize a search loop) on modern HW, given efficient libc-provided SIMD strlen and memcpy. You could use length = sprintf(bigstr, "%s", newstr[i]); bigstr+=length;, but sprintf() has to parse its format string and is not fast.
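The strlen/memcpy approach described above might look roughly like the sketch below. The names append_piece and join_pieces are hypothetical. The point is only that the caller keeps a pointer to the current end, so nothing already written gets rescanned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src to position `end` inside a buffer and
   return a pointer to the new terminating '\0'.
   `remaining` is the space left at `end`, including room for the '\0'.
   Returns NULL (writing nothing) if src does not fit. */
char *append_piece(char *end, size_t remaining, const char *src)
{
    assert(end != NULL && src != NULL);
    size_t len = strlen(src);          /* first pass over src */
    if (len >= remaining)
        return NULL;
    memcpy(end, src, len + 1);         /* second pass, copies the '\0' too */
    return end + len;
}

/* Join n strings into buf without rescanning what was already written.
   Returns the total length, or (size_t)-1 if buf is too small. */
size_t join_pieces(char *buf, size_t bufsize,
                   const char *const *pieces, size_t n)
{
    assert(buf != NULL && bufsize > 0);
    char *end = buf;
    *end = '\0';
    for (size_t i = 0; i < n; i++) {
        end = append_piece(end, bufsize - (size_t)(end - buf), pieces[i]);
        if (end == NULL)
            return (size_t)-1;
    }
    return (size_t)(end - buf);
}
```

Each source character is still touched twice (strlen, then memcpy), but the total work is linear in the output length rather than quadratic.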

There isn't even a version of strcmp or memcmp that returns the position of the difference. If that's what you want, you have the same problem as Why is string comparison so fast in python?: an optimized library function that runs faster than anything you can do with a compiled loop (unless you have hand-optimized asm for every target platform you care about), which you can use to get close to the differing byte before falling back to a regular loop once you get close.
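One way to sketch the "get close with a library function, then fall back to a plain loop" idea is shown below. The function first_mismatch and its 64-byte block size are assumptions made for illustration, not an existing API or a tuned value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first differing byte between a and b,
   or n if the first n bytes are equal. */
size_t first_mismatch(const void *a, const void *b, size_t n)
{
    assert(a != NULL && b != NULL);
    const unsigned char *pa = a, *pb = b;
    const size_t chunk = 64;           /* arbitrary block size for this sketch */
    size_t i = 0;

    /* Skip equal blocks with the library's optimized memcmp. */
    while (n - i >= chunk && memcmp(pa + i, pb + i, chunk) == 0)
        i += chunk;

    /* Fall back to a plain loop inside the block that differs (or the tail). */
    while (i < n && pa[i] == pb[i])
        i++;
    return i;
}
```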

It seems that C's string library was designed without regard to the O(n) cost of any operation, not just finding the end of implicit-length strings, and strcpy's behaviour is definitely not the only example.

They basically treat implicit-length strings as whole opaque objects, always returning pointers to the start, never to the end or to a position inside one after searching or appending.

History guesswork

In early C on a PDP-11, I suspect that strcpy was no more efficient than while(*dst++ = *src++) {} (and was probably implemented that way).

In fact, K&R first edition (page 101) shows that implementation of strcpy and says:

Although this may seem cryptic at first sight, the notational convenience is considerable, and the idiom should be mastered, if for no other reason than that you will see it frequently in C programs.

This implies they fully expected programmers to write their own loops in cases where you wanted the final value of dst or src. And thus maybe they didn't see a need to redesign the standard library API until it was too late to expose more useful APIs for hand-optimized asm library functions.

But does returning the original value of dst make any sense?

strcpy(dst, src) returning dst is analogous to x=y evaluating to x. So it makes strcpy work like a string assignment operator.

As other answers point out, this allows nesting, like foo( strcpy(buf,input) );. Early computers were very memory-constrained. Keeping your source code compact was common practice. Punch cards and slow terminals were probably a factor in this. I don't know historical coding standards or style guides or what was considered too much to put on one line.

Crusty old compilers were also maybe a factor. With modern optimizing compilers, char *tmp = foo(); / bar(tmp); is no slower than bar(foo());, but it is with gcc -O0. I don't know if very early compilers could optimize variables away completely (not reserving stack space for them), but hopefully they could at least keep them in registers in simple cases (unlike modern gcc -O0 which on purpose spills/reloads everything for consistent debugging). i.e. gcc -O0 isn't a good model for ancient compilers, because it's anti-optimizing on purpose for consistent debugging.

Possible compiler-generated-asm motivation

Given the lack of care about efficiency in the general API design of the C string library, this might be unlikely. But perhaps there was a code-size benefit. (On early computers, code-size was more of a hard limit than CPU time).

I don't know much about the quality of early C compilers, but it's a safe bet that they were not awesome at optimizing, even for a nice simple / orthogonal architecture like PDP-11.

It's common to want the string pointer after the function call. At an asm level, you (the compiler) probably have it in a register before the call. Depending on the calling convention, you either push it on the stack or you copy it to the register where the calling convention says the first arg goes (i.e. where strcpy is expecting it). Or if you're planning ahead, you already had the pointer in the right register for the calling convention.

But function calls clobber some registers, including all the arg-passing registers. (So when a function gets an arg in a register, it can increment it there instead of copying to a scratch register.)

So as the caller, your code-gen options for keeping something across a function call include:

store/reload it to local stack memory. (Or just reload it if an up-to-date copy is still in memory).
save/restore a call-preserved register at the start/end of your whole function, and copy the pointer to one of those registers before the function call.
the function returns the value in a register for you. (Of course, this only works if the C source is written to use the return value instead of the input variable. e.g. dst = strcpy(dst, src); if you aren't nesting it).

All calling conventions on all architectures I'm aware of return pointer-sized return values in a register, so having maybe one extra instruction in the library function can save code-size in all callers that want to use that return value.

You probably got better asm from primitive early C compilers by using the return value of strcpy (already in a register) than by making the compiler save the pointer around the call in a call-preserved register or spill it to the stack. This may still be the case.

BTW, on many ISAs, the return-value register is not the first arg-passing register. And unless you use base+index addressing modes, it does cost an extra instruction (and tie up another reg) for strcpy to copy the register for a pointer-increment loop.

PDP-11 toolchains normally used some kind of stack-args calling convention, always pushing args on the stack. I'm not sure how many call-preserved vs. call-clobbered registers were normal, but only 5 or 6 GP regs were available (R7 being the program counter, R6 being the stack pointer, R5 often used as a frame pointer). So it's similar to but even more cramped than 32-bit x86.

char *bar(char *dst, const char *str1, const char *str2)
{
    //return strcat(strcat(strcpy(dst, str1), "separator"), str2);

    // more readable to modern eyes:
    dst = strcpy(dst, str1);
    dst = strcat(dst, "separator");
//    dst = strcat(dst, str2);
    
    return dst;  // simulates further use of dst
}

  # x86 32-bit gcc output, optimized for size (not speed)
  # gcc8.1 -Os  -fverbose-asm -m32
  # input args are on the stack, above the return address

    push    ebp     #
    mov     ebp, esp  #,      Create a stack frame.

    sub     esp, 16   #,      This looks like a missed optimization, wasted insn
    push    DWORD PTR [ebp+12]      # str1
    push    DWORD PTR [ebp+8]       # dst
    call    strcpy  #
    add     esp, 16   #,

    mov     DWORD PTR [ebp+12], OFFSET FLAT:.LC0      # store new args over our incoming args
    mov     DWORD PTR [ebp+8], eax    #  EAX = dst.
    leave   
    jmp     strcat                  # optimized tailcall of the last strcat

This is significantly more compact than a version which doesn't use dst =, and instead reuses the input arg for the strcat. (See both on the Godbolt compiler explorer.)

The -O3 output is very different: gcc for the version that doesn't use the return value uses stpcpy (returns a pointer to the tail) and then mov-immediate to store the literal string data directly to the right place.

But unfortunately, the dst = strcpy(dst, src) -O3 version still uses regular strcpy, then inlines strcat as strlen + mov-immediate.

To C-string or not to C-string

C implicit-length strings aren't always inherently bad, and have interesting advantages (e.g. a suffix is also a valid string, without having to copy it).

But the C string library is not designed in a way that makes efficient code possible, because char-at-a-time loops typically don't auto-vectorize and the library functions throw away results of work they have to do.

gcc and clang never auto-vectorize loops unless the iteration count is known before the first iteration, e.g. for(int i=0; i<n ;i++). ICC can vectorize search loops, but it's still unlikely to do as well as hand-written asm.

strncpy and so on are basically a disaster. e.g. strncpy doesn't copy the terminating '\0' if it reaches the buffer size limit, so you need to manually arr[n] = 0; before or after. But if the source string is shorter, it pads with 0 bytes out to the specified length, potentially touching a page of memory that never needed to be touched. (Also making it very inefficient for copying short strings into a large buffer that still has lots of space left.)
It appears to have been designed for writing into the middle of larger strings, not for avoiding buffer overflows.
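The two strncpy behaviours mentioned above (no terminator when the limit is reached, zero padding when the source is short) can be seen in a small wrapper like the one below. The name copy_truncating is hypothetical. It is a sketch of the manual termination step, not a recommendation to keep using strncpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: strncpy plus the manual termination it needs.
   Returns 1 if src fit completely, 0 if it was truncated. */
int copy_truncating(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL && dstsize > 0);
    strncpy(dst, src, dstsize);        /* zero-fills the rest if src is short */
    if (dst[dstsize - 1] != '\0') {    /* limit reached: no '\0' was written */
        dst[dstsize - 1] = '\0';
        return 0;
    }
    return 1;
}
```

Note that the short-source case still writes all dstsize bytes, which is the inefficiency the paragraph above complains about.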

A few functions like snprintf are usable and do always nul-terminate. Remembering which does which is hard, and a huge risk if you remember wrong, so you have to check every time in cases where it matters for correctness.
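As a small sketch of the snprintf behaviour, the helper below relies on two properties of the C99 function: the output is always '\0'-terminated when the size is non-zero, and the return value is the length the full output would have had. The name format_pair and the "key=value" format are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format "key=value" into buf.
   Returns 1 if the whole string fit, 0 if it was truncated.
   Either way buf holds a '\0'-terminated string (bufsize must be > 0). */
int format_pair(char *buf, size_t bufsize, const char *key, const char *value)
{
    assert(buf != NULL && bufsize > 0);
    int needed = snprintf(buf, bufsize, "%s=%s", key, value);
    /* snprintf returns the length the full output would have had,
       so a result >= bufsize signals truncation. */
    return needed >= 0 && (size_t)needed < bufsize;
}
```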

As Bruce Dawson says: Stop using strncpy already!. Apparently some MSVC extensions like _snprintf are even worse.

strncat also exists in POSIX.2001 and, despite the name, does not share strncpy's problems: it behaves like a length-limited strcat which always 0-terminates (n bounds the bytes taken from the source, not the size of the destination). But like strcat it still returns the original pointer, so it is not useful for efficiently appending strings into a buffer; it has to re-scan the leading part every time to find the current end if you simply call it repeatedly on the same buffer. The man page mentions "Shlemiel the painter".

零時差 2024-09-23 00:48:15

I believe your guess is correct: it makes nested calls easier.

抠脚大汉 2024-09-23 00:48:15

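It's also extremely easy to code.

The return value is typically left in the AX register (this is not mandatory, but it is frequently the case). The destination is put in the AX register when the function starts. To return the destination, the programmer needs to do... exactly nothing! Just leave the value where it is.

The programmer could declare the function as void. But the return value is already in the right place, just waiting to be returned, and it doesn't even cost an extra instruction to return it! No matter how small the improvement, it is handy in some cases.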

丢了幸福的猪 2024-09-23 00:48:15

Same concept as fluent interfaces. It just makes the code quicker/easier to read.

染年凉城似染瑾 2024-09-23 00:48:15

I don't think this is really set up this way for nesting purposes, but more for error checking. If memory serves, none of the C standard library functions do much error checking on their own, and therefore it makes more sense that this would be a way to determine whether something went awry during the strcpy call.

if(strcpy(dest, source) == NULL) {
  // Something went horribly wrong, now we deal with it
}
