Memcpy, strings and terminators

Posted on 2024-11-05 21:18:09

I have to write a function that fills a char* buffer for an assigned length with the content of a string. If the string is too long, I just have to cut it. The buffer is not allocated by me but by the user of my function. I tried something like this:

int writebuff(char* buffer, int length){
    string text="123456789012345";
    memcpy(buffer, text.c_str(),length);
    //buffer[length]='\0';
    return 1;
}


int main(){
    char* buffer = new char[10];
    writebuff(buffer,10);
    cout << "After: "<<buffer<<endl;
}

My question is about the terminator: should it be there or not? This function is used in a much larger code base, and sometimes I seem to get strange characters when the string has to be cut.

Any hints on the correct procedure to follow?


Comments (10)

荒岛晴空 2024-11-12 21:18:09

A C-style string must be terminated with a zero character '\0'.

In addition, your code has another problem: it may try to copy from beyond the end of the source string. This is classic undefined behavior. It may look like it works, until the one time the string is allocated at the end of a heap memory block, the copy runs off into a protected area of memory, and the program fails spectacularly. You should copy only the minimum of the buffer length and the string length.

P.S. For completeness here's a good version of your function. Thanks to Naveen for pointing out the off-by-one error in your terminating null. I've taken the liberty of using your return value to indicate the length of the returned string, or the number of characters required if the length passed in was <= 0.

int writebuff(char* buffer, int length)
{
    string text="123456789012345";
    if (length <= 0)
        return text.size();
    if (text.size() < length)
    {
        memcpy(buffer, text.c_str(), text.size()+1);
        return text.size();
    }
    memcpy(buffer, text.c_str(), length-1);
    buffer[length-1]='\0';
    return length-1;
}
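
The return-value convention described above allows a two-call pattern: ask for the required size first, then allocate. This is only a sketch. It repeats the `writebuff` from this answer (with a cast added so the size comparison is signed on both sides), and the `fetch_text` wrapper is a hypothetical caller that is not part of the original question.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Same logic as the answer's function, repeated so the sketch compiles alone.
int writebuff(char* buffer, int length)
{
    const std::string text = "123456789012345";
    const int size = static_cast<int>(text.size());
    if (length <= 0)
        return size;                                  // characters required
    if (size < length)
    {
        std::memcpy(buffer, text.c_str(), size + 1);  // copy includes '\0'
        return size;
    }
    std::memcpy(buffer, text.c_str(), length - 1);    // truncate
    buffer[length - 1] = '\0';
    return length - 1;
}

// Hypothetical caller: query the size, then allocate room for text plus '\0'.
std::string fetch_text()
{
    const int needed = writebuff(nullptr, 0);
    std::vector<char> buffer(needed + 1);
    writebuff(buffer.data(), static_cast<int>(buffer.size()));
    return std::string(buffer.data());
}
```

Using `std::vector<char>` here also means the caller has nothing to free by hand.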
ま昔日黯然 2024-11-12 21:18:09

If you want to treat the buffer as a string, you should null-terminate it. To do that, copy at most length-1 characters with memcpy and set buffer[length-1] to '\0'.
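
A minimal sketch of that idea, assuming the text is passed in as a parameter. The name `write_truncated` and the `std::size_t` length are illustrative and do not come from the question. The copy is also clamped to the string's own length, so memcpy never reads past the source.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies as much of `text` as fits and always
// null-terminates. Returns the number of characters written, not
// counting the terminator.
std::size_t write_truncated(char* buffer, std::size_t length, const std::string& text)
{
    if (buffer == nullptr || length == 0)
        return 0;                                   // no room even for '\0'
    const std::size_t n = std::min(length - 1, text.size());
    std::memcpy(buffer, text.data(), n);            // at most length-1 bytes
    buffer[n] = '\0';                               // terminator right after the text
    return n;
}
```

With a 10-byte buffer and the 15-character text from the question, this writes nine characters followed by the terminator.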

山田美奈子 2024-11-12 21:18:09

It seems you are using C++. Given that, the simplest approach is (assuming that NUL termination is required by the interface spec):

int writebuff(char* buffer, int length)
{
  string text = "123456789012345";
  if (length <= 0)
    return 0; // no room, not even for the terminator
  std::fill_n(buffer, length, 0); // reset the entire buffer
  // std::string::copy copies at most length-1 characters and never appends
  // a terminator; the zero fill above leaves the final byte as NUL
  text.copy(buffer, length - 1);
  return 1; // eh?
}
滿滿的愛 2024-11-12 21:18:09

char* buffers must be null-terminated unless you explicitly pass the length along with the buffer everywhere and document that the buffer is not null-terminated.

度的依靠╰つ 2024-11-12 21:18:09

Whether or not you should terminate the string with a \0 depends on the specification of your writebuff function. If what you have in buffer should be a valid C-style string after calling your function, you should terminate it with a \0.

Note, though, that c_str() will terminate with a \0 for you, so you could use text.size() + 1 as the size of the source string. Also note that with your current code, if length is larger than the size of the string, you will copy further than what text provides. To prevent that, copy min(length - 1, text.size() + 1/*trailing \0*/) bytes and set buffer[length - 1] = 0 to cap it off.

The buffer allocated in main is leaked, by the way.

青春如此纠结 2024-11-12 21:18:09

my question is about the terminator: should it be there or not?

Yes. It should be there. Otherwise, how would you later know where the string ends? And how would cout know? It would keep printing garbage until it hit a byte whose value happens to be \0. Your program might even crash.

As a side note, your program is leaking memory: it doesn't free the memory it allocates. Since you're exiting from main(), it doesn't matter much; once the program ends, all the memory goes back to the OS, whether you deallocate it or not. But it's good practice in general to always deallocate memory (or any other resource) yourself.
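
One way to make the deallocation automatic is to let a smart pointer own the buffer. The following is only a sketch: `fill` is a stand-in for the function under discussion and `run_once` is a hypothetical caller. Neither name appears in the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Stand-in for the function under discussion: truncates and terminates.
void fill(char* buffer, std::size_t length)
{
    const char text[] = "123456789012345";
    if (length == 0)
        return;
    const std::size_t n = std::min(length - 1, std::strlen(text));
    std::memcpy(buffer, text, n);
    buffer[n] = '\0';
}

// Hypothetical caller: the unique_ptr releases the array with delete[]
// when it goes out of scope, so no explicit cleanup is needed.
std::string run_once()
{
    const std::size_t length = 10;
    std::unique_ptr<char[]> buffer(new char[length]);
    fill(buffer.get(), length);
    return std::string(buffer.get());
}
```

The same effect can be had with `std::vector<char>` or a plain local array when the size is known at compile time.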

总攻大人 2024-11-12 21:18:09

I agree with Necrolis that strncpy is the way to go, but it will not write the null terminator if the string is too long. You had the right idea in putting in an explicit terminator, but as written your code puts it at buffer[length], which is one past the end. (This is in C, since you seemed to be doing more C than C++?)

int writebuff(char* buffer, int length){
    const char* text="123456789012345";  /* string literals are read-only */
    strncpy(buffer, text, length);
    buffer[length-1]='\0';               /* last valid index of the buffer */
    return 1;
}
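
To see why the explicit terminator is still needed, here is a small sketch of how strncpy behaves in both cases. The wrapper name `copy_terminated` is made up for illustration. When the source has at least as many characters as the count passed in, strncpy writes no '\0' at all; when the source is shorter, it pads the remainder with '\0'.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper: copy at most length-1 characters, then terminate.
void copy_terminated(char* buffer, std::size_t length, const char* text)
{
    if (length == 0)
        return;                               // nothing can be written
    std::strncpy(buffer, text, length - 1);   // may leave the buffer unterminated
    buffer[length - 1] = '\0';                // so always cap the last byte
}
```

Writing the terminator unconditionally is harmless in the short-string case, because strncpy has already zero-filled the bytes before it.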
最佳男配角 2024-11-12 21:18:09

It should most definitely be there*. This prevents strings that are too long for the buffer from filling it completely and causing an overflow later on when it's accessed. In my opinion, strncpy should be used instead of memcpy, but you'll still have to null-terminate the result. (Your example also leaks memory.)

*If you're ever in doubt, go the safest route!

莳間冲淡了誓言ζ 2024-11-12 21:18:09

First, I don't know whether writebuff should terminate the string or not. That is a design question, to be answered by the person who decided that writebuff should exist at all.

Second, taking your specific example as a whole, there are two problems. One is that you pass an unterminated string to operator<<(ostream, char*). The other is that the commented-out line writes beyond the end of the indicated buffer. Both of these invoke undefined behavior.

(Third is a design flaw -- can you know that length is always less than the length of text?)

Try this:

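int writebuff(char* buffer, int length){
  string text="123456789012345";
  // Assumes 0 < length <= text.size()+1; a larger length reads past the end of text
  memcpy(buffer, text.c_str(),length);
  buffer[length-1]='\0';  // last valid index, not buffer[length]
  return 1;
}


int main(){
  char* buffer = new char[10];
  writebuff(buffer,10);
  cout << "After: "<<buffer<<endl;
  delete[] buffer;  // release what new[] allocated
}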
|煩躁 2024-11-12 21:18:09
  1. In main(), you should delete the buffer you allocated with new, or allocate it statically (char buf[10]). Yes, it's only 10 bytes, and yes, it's a memory "pool," not a leak, since it's a one-time allocation, and yes, you need that memory around for the entire running time of the program. But it's still a good habit to get into.

  2. In C/C++ the general contract with character buffers is that they are null-terminated, so I would include the terminator unless I had been explicitly told not to. And if I were told not to, I would comment it, and maybe even use a typedef or a name on the char * parameter indicating that the result is a string that is not null-terminated.
