Concatenating C strings in linear time using the CRT
Say we want to concatenate const char *s[0], s[1], ... s[n-1] into one long char out[] in C.
Formally (ignoring buffer overruns, for simplicity):
void concatManyStrings(char out[], const char *s[], size_t n);
It is a trivial task, of course: start with a pointer to out and advance it for every char,
while looping through the input strings.
Another approach (which is still linear-time) would be to keep a pointer to the end,
and with each s[i] do:
{ strcpy(endp, s[i]); endp += strlen(s[i]); }
But the code would be cleaner if there were a standard CRT function that knows how to strcpy() and return the number of copied chars (or, equivalently, a pointer to the next char after the copied ones).
The only CRT function I can think of that does that is sprintf(), but it is obviously not nearly as efficient as a simple strcpy() that returns a count.
Is there such a function that I'm missing?
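For reference, the second approach described in the question can be sketched as a complete function. This is only an illustration of the strcpy() plus strlen() loop; like the question, it assumes out is large enough to hold every input plus the terminating NUL.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the "keep a pointer to the end" approach from the question.
 * Assumption (as in the question): out is large enough for all inputs
 * plus the terminating NUL, so no bounds checking is done here. */
void concatManyStrings(char out[], const char *s[], size_t n)
{
    char *endp = out;
    *endp = '\0';                 /* result is a valid empty string when n == 0 */
    for (size_t i = 0; i < n; i++) {
        strcpy(endp, s[i]);       /* first pass over s[i]: copy it */
        endp += strlen(s[i]);     /* second pass over s[i]: measure it */
    }
}
```

Each input is walked twice (once to copy, once to measure), which is the redundancy the question hopes a library function could remove. The total work is still linear in the combined length.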
Comments (3)
strlcpy() and strlcat() are non-standard, unfortunately, but if you happen to have them, you can use them for this. They both return results that let you determine the end of the copied string, unlike strcpy() and strcat(), which (somewhat uselessly) return a pointer to the start of the destination.
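As a rough sketch of how the return value of strlcpy() could drive the loop: because strlcpy() is not available everywhere, the example below uses a hypothetical stand-in, local_strlcpy(), written to follow the usual BSD contract (copy at most size-1 bytes, always NUL-terminate when size > 0, return strlen(src)). The function name concatWithStrlcpy and the outsize parameter are assumptions added for illustration, not part of the original question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in following the BSD strlcpy() contract: copies at
 * most size-1 bytes, NUL-terminates when size > 0, returns strlen(src). */
static size_t local_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);
    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Illustrative helper (name and outsize parameter are assumptions).
 * Returns 0 on success, -1 if the output had to be truncated. */
int concatWithStrlcpy(char out[], size_t outsize, const char *s[], size_t n)
{
    size_t used = 0;              /* bytes written so far, excluding the NUL */
    if (outsize == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        size_t len = local_strlcpy(out + used, s[i], outsize - used);
        if (len >= outsize - used)
            return -1;            /* s[i] did not fit; out holds a truncated result */
        used += len;              /* the return value locates the new end */
    }
    return 0;
}
```

On a platform that provides strlcpy(), the stand-in could presumably be replaced by the library function with no other changes.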
You can't afford to ignore buffer overruns; that's one of the main ways the web world crashes.
Given the data structure shown, there is a limit to what you can do. If the data passed to the function included the length of each string, there would be more you could do. Until then, you have to determine the length of each string (and supply the length of the output buffer), and then arrange to copy the strings safely. Since you will know the length of each string by the time you copy it, you can use memmove() or memcpy() to move the data, and then advance the pointer by that length.
This scans each string twice: once for the length and once for copying. However, you can't afford to use strncpy() because of its null-padding behaviour (diabolical in this context); the fact that it doesn't guarantee null termination would not be a problem. You can't use strcpy() until you know that the length is safe, which requires the strlen(). If the data were not a simple array of pointers to strings but an array of structures that included the length of each string as well as the pointer, the strlen() could be avoided. With caution, it might be feasible to use strcat() or strncat(); the primary caution would be to avoid quadratic behaviour (Schlemiel's Algorithm), which can be done by ensuring you determine the end of each added string. In the case of strncat(), be very careful with the size parameter; it is different from what strncpy() gets as a size. And you are still likely to need strlen(), as the standard functions do not report the end of the string where they placed the last character, which would be vastly more helpful than returning a pointer to the first character of the target string.
There isn't a standard function to do this that I know of.
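The code that accompanied this answer is not preserved in this copy, so the following is only a sketch of the strlen() then memcpy() approach it describes. The function name concatChecked, the outsize parameter, and the 0 / -1 return convention are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative bounds-checked concatenation (names are assumptions).
 * Returns 0 on success, -1 if the next string would not fit; in that
 * case out holds the strings that did fit, NUL-terminated. */
int concatChecked(char out[], size_t outsize, const char *s[], size_t n)
{
    char *endp = out;
    size_t avail = outsize;       /* bytes left, including room for the NUL */
    if (avail == 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(s[i]);    /* first scan: length */
        if (len >= avail) {           /* must leave one byte for the NUL */
            *endp = '\0';
            return -1;
        }
        memcpy(endp, s[i], len);      /* second scan: copy */
        endp += len;                  /* known length lets us advance directly */
        avail -= len;
    }
    *endp = '\0';
    return 0;
}
```

If the caller already had the lengths (for example in an array of structures, as the answer suggests), the strlen() call could be dropped and each string would be scanned only once.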
Use snprintf, which is basically always the right answer to any question about assembling strings. Unfortunately, this does not work for an "arbitrary n" as the input string count; for that, just write your own for loop...
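The snippet from this answer is also missing here, so this is a minimal sketch of what an snprintf-based join might look like for a fixed number of inputs. The function name concatThree and its parameters are assumptions for illustration; the format string has to name each input, which is why the approach does not extend to an arbitrary n.

```c
#include <stddef.h>
#include <stdio.h>

/* Illustrative helper (name is an assumption): joins exactly three strings.
 * Returns 0 on success, -1 on truncation or on an snprintf error. */
int concatThree(char out[], size_t outsize,
                const char *a, const char *b, const char *c)
{
    /* snprintf returns the length the full result would have had,
     * not counting the NUL, so a value >= outsize means truncation. */
    int needed = snprintf(out, outsize, "%s%s%s", a, b, c);
    if (needed < 0 || (size_t)needed >= outsize)
        return -1;
    return 0;
}
```

On truncation the buffer still holds a NUL-terminated prefix of the result (when outsize is nonzero), so the overrun concern raised in the other answer is addressed by the size argument.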