Memcpy, strings and terminators
I have to write a function that fills a char* buffer for an assigned length with the content of a string. If the string is too long, I just have to cut it. The buffer is not allocated by me but by the user of my function. I tried something like this:
int writebuff(char* buffer, int length){
    string text="123456789012345";
    memcpy(buffer, text.c_str(),length);
    //buffer[length]='\0';
    return 1;
}
int main(){
    char* buffer = new char[10];
    writebuff(buffer,10);
    cout << "After: "<<buffer<<endl;
}
My question is about the terminator: should it be there or not? This function is used in a much larger code base, and sometimes I seem to get strange characters when the string needs to be cut.
Any hints on the correct procedure to follow?
Comments (10)
A C-style string must be terminated with a zero character '\0'.
In addition you have another problem with your code: it may try to copy from beyond the end of your source string. This is classic undefined behavior. It may look like it works, until the one time that the string is allocated at the end of a heap memory block and the copy goes off into a protected area of memory and fails spectacularly. You should copy only until the minimum of the length of the buffer or the length of the string.
P.S. For completeness here's a good version of your function. Thanks to Naveen for pointing out the off-by-one error in your terminating null. I've taken the liberty of using your return value to indicate the length of the returned string, or the number of characters required if the length passed in was <= 0.
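A minimal sketch of a function along these lines, given as an illustration rather than as this answer's own listing. It assumes the buffer holds `length` chars, copies at most `length - 1` of them, and always writes the terminator inside the buffer. Counting the terminator in the "required" size returned for `length <= 0` is an assumption, not something the answer specifies.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Sketch of a truncating, always-terminating writebuff.
// Returns the number of characters written (terminator excluded), or,
// when length <= 0, the buffer size needed for the full string
// (assumption: that size includes the '\0').
int writebuff(char* buffer, int length) {
    const std::string text = "123456789012345";
    if (buffer == nullptr || length <= 0) {
        return static_cast<int>(text.size()) + 1;
    }
    // Read no further than the source, write no further than the buffer.
    const std::size_t count =
        std::min(static_cast<std::size_t>(length - 1), text.size());
    std::memcpy(buffer, text.data(), count);
    buffer[count] = '\0';  // index count <= length - 1, so still in bounds
    return static_cast<int>(count);
}
```

With a 10-char buffer this writes "123456789" plus the terminator and returns 9. With a 32-char buffer it writes the whole string and returns 15.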
If you want to treat the buffer as a string, you should null-terminate it. To do this, copy length-1 characters using memcpy and set the character at index length-1 to \0.
It seems you are using C++. Given that, the simplest approach is (assuming that NUL termination is required by the interface spec)
char * buffers must be null-terminated unless you explicitly pass the length along with them everywhere and state that the buffer is not null-terminated.
Whether or not you should terminate the string with a \0 depends on the specification of your writebuff function. If what you have in buffer should be a valid C-style string after calling your function, you should terminate it with a \0.
Note, though, that c_str() will terminate with a \0 for you, so you could use text.size() + 1 as the size of the source string. Also note that if length is larger than the size of the string, your current code will copy further than what text provides (you can use min(length - 1, text.size() + 1/*trailing \0*/) to prevent that, and set buffer[length - 1] = 0 to cap it off).
The buffer allocated in main is leaked, by the way.
Yes, it should be there. Otherwise, how would you later know where the string ends? And how would cout know? It would keep printing garbage until it encounters a byte whose value happens to be \0. Your program might even crash.
As a side note, your program is leaking memory: it doesn't free the memory it allocates. Since you're exiting from main(), it doesn't matter much; once the program ends, all the memory goes back to the OS whether you deallocate it or not. But it's good practice in general not to forget to deallocate memory (or any other resource) yourself.
I agree with Necrolis that strncpy is the way to go, but it will not write the null terminator if the string is too long. You had the right idea in putting in an explicit terminator, but as written your code puts it one past the end of the buffer. (This is in C, since you seemed to be doing more C than C++?)
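A sketch of the strncpy approach, written with the C++ headers to match the question. The function name and the extra `text` parameter are hypothetical, added so the example is self-contained. `strncpy` stops at the source's terminator and zero-pads the rest, but writes no terminator when the source has `count` or more characters, so the last valid index is set explicitly.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies text into a buffer of `length` chars,
// truncating if necessary, and always leaves a valid C string.
void writebuff_strncpy(char* buffer, std::size_t length, const char* text) {
    if (buffer == nullptr || text == nullptr || length == 0) {
        return;
    }
    // Copy at most length - 1 characters; shorter sources are zero-padded.
    std::strncpy(buffer, text, length - 1);
    // Terminate at the last valid index, buffer[length - 1],
    // not at buffer[length], which is one past the end.
    buffer[length - 1] = '\0';
}
```

For a 10-char buffer and the question's 15-character string, the buffer ends up holding "123456789".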
It should most definitely be there*; this prevents strings that are too long for the buffer from filling it completely and causing an overflow later on when it's accessed. In my opinion, strncpy should be used instead of memcpy, but you'll still have to null-terminate it. (Also, your example leaks memory.)
*If you're ever in doubt, go the safest route!
First, I don't know whether writebuff should terminate the string or not. That is a design question, to be answered by the person who decided that writebuff should exist at all.
Second, taking your specific example as a whole, there are two problems. One is that you pass an unterminated string to operator<<(ostream, char*). The second is that the commented-out line writes beyond the end of the indicated buffer. Both of these invoke undefined behavior.
(Third is a design flaw: can you know that length is always less than the length of text?)
Try this:
In main(), you should delete[] the buffer you allocated with new, or allocate it statically (char buf[10]). Yes, it's only 10 bytes, and yes, it's a memory "pool," not a leak, since it's a one-time allocation, and yes, you need that memory around for the entire running time of the program. But it's still a good habit to be in.
In C/C++ the general contract with character buffers is that they are null-terminated, so I would include the terminator unless I had been explicitly told not to. And if I did leave it out, I would comment it, and maybe even use a typedef or name on the char * parameter indicating that the result is a string that is not null-terminated.