C strcpy() - evil?

Posted on 2024-07-14 07:52:48

Some people seem to think that C's strcpy() function is bad or evil. While I admit that it's usually better to use strncpy() in order to avoid buffer overflows, the following (an implementation of the strdup() function for those not lucky enough to have it) safely uses strcpy() and should never overflow:

char *strdup(const char *s1)
{
  char *s2 = malloc(strlen(s1)+1);
  if(s2 == NULL)
  {
    return NULL;
  }
  strcpy(s2, s1);
  return s2;
}

*s2 is guaranteed to have enough space to store *s1, and using strcpy() saves us from having to store the strlen() result in another variable to use later as the unnecessary (in this case) length parameter to strncpy(). Yet some people write this function with strncpy(), or even memcpy(), which both require a length parameter. I would like to know what people think about this. If you think strcpy() is safe in certain situations, say so. If you have a good reason not to use strcpy() in this situation, please give it - I'd like to know why it might be better to use strncpy() or memcpy() in situations like this. If you think strcpy() is okay, but not here, please explain.

Basically, I just want to know why some people use memcpy() when others use strcpy() and still others use plain strncpy(). Is there any logic to preferring one of the three over the others (disregarding the buffer checks of the first two)?

Comments (17)

空气里的味道 2024-07-21 07:52:49

I agree. I would recommend against strncpy() though, since it will always pad your output to the indicated length. This is a historical design decision, which I think was really unfortunate as it seriously worsens the performance.

Consider code like this:

char buf[128];
strncpy(buf, "foo", sizeof buf);

This will not write the expected four characters to buf, but will instead write "foo" followed by 125 zero characters. If you're for instance collecting a lot of short strings, this will mean your actual performance is far worse than expected.
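The padding is easy to observe. The following is a minimal sketch, not part of the original answer; the helper name zero_bytes_after_strncpy is made up for illustration. It fills the caller's buffer with a sentinel character, calls strncpy(), and counts how many bytes ended up as '\0'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite buf with a sentinel, strncpy src into it,
   and return how many of the buf_size bytes are '\0' afterwards. */
size_t zero_bytes_after_strncpy(char *buf, size_t buf_size, const char *src)
{
    size_t zeros = 0;

    memset(buf, 'x', buf_size);     /* sentinel makes untouched bytes visible */
    strncpy(buf, src, buf_size);    /* copies src, then pads with '\0' up to buf_size */

    for (size_t i = 0; i < buf_size; i++) {
        if (buf[i] == '\0') {
            zeros++;
        }
    }
    return zeros;
}
```

With a 128-byte buffer and the source "foo", this should report 125 zero bytes: the terminator plus 124 bytes of padding.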

If available, I prefer to use snprintf(), writing the above like:

snprintf(buf, sizeof buf, "foo");

If instead copying a non-constant string, it's done like this:

snprintf(buf, sizeof buf, "%s", input);

The explicit "%s" format is important: if input were passed as the format string itself and contained % characters, snprintf() would interpret them, opening up whole shelvefuls of cans of worms.
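snprintf() also reports truncation through its return value, which is the length the full output would have had. A minimal sketch of using that, with a hypothetical helper name (copy_with_snprintf is not from the answer):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst (capacity dst_size) via snprintf.
   Returns 1 if the whole string fit, 0 if it was truncated or an error occurred. */
int copy_with_snprintf(char *dst, size_t dst_size, const char *src)
{
    /* "%s" keeps any '%' characters in src from being treated as conversions */
    int needed = snprintf(dst, dst_size, "%s", src);

    return needed >= 0 && (size_t)needed < dst_size;
}
```

When the result is 0 and dst_size is nonzero, dst still holds a NUL-terminated prefix of src, so the caller can decide whether truncation is acceptable.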

徒留西风 2024-07-21 07:52:49

I think strncpy is evil too.

To truly protect yourself from programming errors of this kind, you need to make it impossible to write code that (a) looks OK, and (b) overruns a buffer.

This means you need a real string abstraction, which stores the buffer and capacity opaquely, binds them together, forever, and checks bounds. Otherwise, you end up passing strings and their capacities all over the shop. Once you get to real string ops, like modifying the middle of a string, it's almost as easy to pass the wrong length into strncpy (and especially strncat), as it is to call strcpy with a too-small destination.

Of course you might still ask whether to use strncpy or strcpy in implementing that abstraction: strncpy is safer there provided you fully grok what it does. But in string-handling application code, relying on strncpy to prevent buffer overflows is like wearing half a condom.

So, your strdup-replacement might look something like this (order of definitions changed to keep you in suspense):

string *string_dup(const string *s1) {
    string *s2 = string_alloc(string_len(s1));
    if (s2 != NULL) {
        string_set(s2,s1);
    }
    return s2;
}

static inline size_t string_len(const string *s) {
    return strlen(s->data);
}

static inline void string_set(string *dest, const string *src) {
    // potential (but unlikely) performance issue: strncpy 0-fills dest,
    // even if the src is very short. We may wish to optimise
    // by switching to memcpy later. But strncpy is better here than
    // strcpy, because it means we can use string_set even when
    // the length of src is unknown.
    strncpy(dest->data, src->data, dest->capacity);
}

string *string_alloc(size_t maxlen) {
    if (maxlen > SIZE_MAX - sizeof(string) - 1) return NULL;
    string *self = malloc(sizeof(string) + maxlen + 1);
    if (self != NULL) {
        // empty string
        self->data[0] = '\0';
        // strncpy doesn't NUL-terminate if it prevents overflow, 
        // so exclude the NUL-terminator from the capacity, set it now,
        // and it can never be overwritten.
        self->capacity = maxlen;
        self->data[maxlen] = '\0';
    }
    return self;
}

typedef struct string {
    size_t capacity;
    char data[];  // C99 flexible array member (data[0] is a GNU extension)
} string;

The problem with these string abstractions is that nobody can ever agree on one (for instance whether strncpy's idiosyncrasies mentioned in comments above are good or bad, whether you need immutable and/or copy-on-write strings that share buffers when you create a substring, etc). So although in theory you should just take one off the shelf, you can end up with one per project.

十二 2024-07-21 07:52:49

I'd tend to use memcpy if I have already calculated the length. Although strcpy is usually optimised to work on machine words, it feels like you should give the library as much information as you can, so it can use the most efficient copying mechanism.

But for the example you give, it doesn't matter: if it's going to fail, it will be in the initial strlen, so strncpy doesn't buy you anything in terms of safety (and presumably strncpy is slower, as it has to check both the bound and for NUL), and any difference between memcpy and strcpy isn't worth changing code for speculatively.

各自安好 2024-07-21 07:52:49

The evil comes when people use it like this (although the below is super simplified):

void BadFunction(char *input)
{
    char buffer[1024]; //surely this will **always** be enough

    strcpy(buffer, input);

    ...
}

Which is a situation that happens surprisingly often.

But yeah, strcpy is as good as strncpy in any situation where you are allocating memory for the destination buffer and have already used strlen to find the length.
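For the fixed-buffer case, the usual repair is to measure the input before copying. A minimal sketch under that assumption; checked_copy is a hypothetical name, and how to react to oversized input (reject, truncate, allocate) is left to the caller:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counterpart to BadFunction's copy: refuses input that
   does not fit instead of writing past the end of buffer. */
int checked_copy(char *buffer, size_t buffer_size, const char *input)
{
    size_t len = strlen(input);

    if (len >= buffer_size) {
        return -1;                  /* no room for the text plus its '\0' */
    }
    memcpy(buffer, input, len + 1); /* length is known, so copy it with the '\0' */
    return 0;
}
```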

笑看君怀她人 2024-07-21 07:52:49

strlen scans up to the terminating null character.

But in reality, buffers are not always null-terminated.

That's why people use different functions.

☆獨立☆ 2024-07-21 07:52:49

Well, strcpy() is not as evil as strdup() - at least strcpy() is part of Standard C.

毁虫ゝ 2024-07-21 07:52:49

In the situation you describe, strcpy is a good choice. This strdup will only get into trouble if s1 is not terminated with a '\0'.

I would add a comment indicating why there are no problems with strcpy, to prevent others (and yourself one year from now) wondering about its correctness for too long.

strncpy often seems safe, but may get you into trouble. If the source "string" is shorter than count, it pads the target with '\0' until it reaches count. That may be bad for performance. If the source string is longer than count, strncpy does not append a '\0' to the target. That is bound to get you into trouble later on when you expect a '\0' terminated "string". So strncpy should also be used with caution!
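If strncpy has to be used on a buffer that will later be read as a string, the missing terminator can be handled by copying one byte less than the buffer size and writing the '\0' by hand. A minimal sketch; the wrapper name strncpy_terminated is hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: strncpy into a buffer of dst_size bytes and
   always leave a '\0' in the last position that could be reached. */
char *strncpy_terminated(char *dst, const char *src, size_t dst_size)
{
    if (dst_size == 0) {
        return dst;                     /* nothing can be written at all */
    }
    strncpy(dst, src, dst_size - 1);    /* may leave dst unterminated ... */
    dst[dst_size - 1] = '\0';           /* ... so terminate unconditionally */
    return dst;
}
```

This still silently truncates long sources and still zero-pads short ones, so it only addresses the termination problem.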

I would only use memcpy if I was not working with '\0' terminated strings, but that seems to be a matter of taste.

满地尘埃落定 2024-07-21 07:52:49
char *strdup(const char *s1)
{
  char *s2 = malloc(strlen(s1)+1);
  if(s2 == NULL)
  {
    return NULL;
  }
  strcpy(s2, s1);
  return s2;
}

Problems:

  1. s1 is unterminated; strlen accesses unallocated memory and the program crashes.
  2. s1 is unterminated; strlen does not touch unallocated memory but reads memory belonging to another part of your application. That data is returned to the user (security issue) or parsed by another part of your program (a heisenbug appears).
  3. s1 is unterminated; strlen leads to a malloc request the system can't satisfy, which returns NULL. strcpy is passed NULL and the program crashes.
  4. s1 is unterminated; strlen leads to a very large malloc request, the system allocates far too much memory for the task at hand and becomes unstable.
  5. In the best case the code is inefficient: strlen requires access to every element in the string.

There are probably other problems... Look, null termination isn't always a bad idea. There are situations where it makes sense, for computational efficiency or to reduce storage requirements.

Does it make sense for writing general-purpose code, e.g. business logic? No.
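One way to sidestep problems 1 through 4 is to have the caller supply the length instead of discovering it with strlen(). A minimal sketch, assuming the caller really knows how many bytes are valid; the name dup_bytes is hypothetical and not from the answer:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate exactly len bytes of data. The source
   does not need a terminator; the copy gets one for convenience. */
char *dup_bytes(const char *data, size_t len)
{
    if (len == SIZE_MAX) {
        return NULL;                /* len + 1 would wrap around to 0 */
    }

    char *copy = malloc(len + 1);
    if (copy != NULL) {
        memcpy(copy, data, len);    /* never reads past the stated length */
        copy[len] = '\0';
    }
    return copy;
}
```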

无人问我粥可暖 2024-07-21 07:52:49
char* dupstr(char* str)
{
   int full_len; // includes null terminator
   char* ret;
   char* s = str;

#ifdef _DEBUG
   if (! str)
      toss("arg 1 null", __WHENCE__);
#endif

   full_len = strlen(s) + 1;
   if (! (ret = (char*) malloc(full_len)))
      toss("out of memory", __WHENCE__);
   memcpy(ret, s, full_len); // already know len, so strcpy() would be slower

   return ret;
}
北座城市 2024-07-21 07:52:49

This answer uses size_t and memcpy() for a fast and simple strdup().

Best to use type size_t as that is the type returned from strlen() and used by malloc() and memcpy(). int is not the proper type for these operations.

memcpy() is rarely slower than strcpy() or strncpy() and often significantly faster.

// Assumption: `s1` points to a C string.
char *strdup(const char *s1) {
  size_t size = strlen(s1) + 1;
  char *s2 = malloc(size);
  if(s2 != NULL) {
    memcpy(s2, s1, size);
  }
  return s2;
} 

§7.1.1 1 "A string is a contiguous sequence of characters terminated by and including the first null character. ..."

思念满溢 2024-07-21 07:52:49

Your code is terribly inefficient because it runs through the string twice to copy it.

Once in strlen().

Then again in strcpy().

And you don't check s1 for NULL.

Storing the length in some additional variable costs you about nothing, while running through each and every string twice to copy it is a cardinal sin.

So尛奶瓶 2024-07-21 07:52:48

memcpy can be faster than strcpy and strncpy because it does not have to compare each copied byte with '\0', and because it already knows the length of the copied object. It can be implemented along the lines of Duff's device, or use assembler instructions that copy several bytes at a time, like movsw and movsd.

沧桑㈠ 2024-07-21 07:52:48

I'm following the rules in here. Let me quote from it

strncpy was initially introduced into the C library to deal with fixed-length name fields in structures such as directory entries. Such fields are not used in the same way as strings: the trailing null is unnecessary for a maximum-length field, and setting trailing bytes for shorter names to null assures efficient field-wise comparisons. strncpy is not by origin a "bounded strcpy," and the Committee has preferred to recognize existing practice rather than alter the function to better suit it to such use.

For that reason, you will not get a trailing '\0' in the destination if you reach n without having found a '\0' in the source string. It's easy to misuse (of course, if you know about that pitfall, you can avoid it). As the quote says, it wasn't designed as a bounded strcpy, and I would prefer not to use it unless necessary. In your case, its use is clearly not necessary, and you proved it. Why use it then?

Generally speaking, programming is also about reducing redundancy. If you know you have a string containing n characters, why tell the copying function to copy at most n characters? You are doing redundant checking. It's less about performance and much more about consistent code. Readers will ask themselves what strcpy could do that would exceed the n characters and make it necessary to limit the copying, only to read in the manual that this cannot happen in that case. That is where confusion starts among readers of the code.

As for the rationale for using mem-, str- or strn-, I choose among them as in the document linked above:

mem- when I want to copy raw bytes, like the bytes of a structure.

str- when copying a null-terminated string, and only when I am 100% sure no overflow can happen.

strn- when copying a null-terminated string up to some length, filling the remaining bytes with zero. Probably not what I want in most cases. It's easy to forget about the trailing zero-fill, but it's by design, as the quote above explains. So I would just code my own small loop that copies characters and adds a trailing '\0':

char * sstrcpy(char *dst, char const *src, size_t n) {
    char *ret = dst;
    while(n-- > 0) {
        if((*dst++ = *src++) == '\0')
            return ret;
    }
    *dst++ = '\0';
    return ret;
}

Just a few lines that do exactly what I want. If I wanted raw speed, I could still look for a portable, optimized implementation that does exactly this bounded strcpy job. As always, profile first and then mess with it.
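One detail worth spelling out: sstrcpy copies at most n characters and then writes the terminator, so the destination needs room for n + 1 bytes. A minimal sketch of a wrapper that takes the full buffer size instead; sstrcpy_sized is a hypothetical name, and sstrcpy is repeated from above only so the block is self-contained:

```c
#include <stddef.h>

/* sstrcpy as given above: copies at most n characters, then appends '\0'. */
char *sstrcpy(char *dst, char const *src, size_t n) {
    char *ret = dst;
    while (n-- > 0) {
        if ((*dst++ = *src++) == '\0')
            return ret;
    }
    *dst++ = '\0';
    return ret;
}

/* Hypothetical wrapper: dst_size is the whole size of dst, including the
   byte reserved for the terminator. */
char *sstrcpy_sized(char *dst, size_t dst_size, char const *src)
{
    if (dst_size == 0) {
        return dst;                         /* no room even for '\0' */
    }
    return sstrcpy(dst, src, dst_size - 1); /* leave one byte for '\0' */
}
```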

Later, C got functions for working with wide characters, called wcs- and wcsn- (for C99). I would use them in the same way.

星星的軌跡 2024-07-21 07:52:48

The reason people use strncpy rather than strcpy is that strings are not always null-terminated, and with strcpy it's very easy to overflow the buffer (the space you have allocated for the string) and overwrite some unrelated bit of memory.

With strcpy this can happen, with strncpy this will never happen. That is why strcpy is considered unsafe. Evil might be a little strong.

秉烛思 2024-07-21 07:52:48

Frankly, if you are doing much string handling in C, you should not ask yourself whether you should use strcpy or strncpy or memcpy. You should find or write a string library that provides a higher level abstraction. For example, one that keeps track of the length of each string, allocates memory for you, and provides all the string operations you need.

This will almost certainly guarantee you make very few of the kinds of mistakes usually associated with C string handling, such as buffer overflows, forgetting to terminate a string with a NUL byte, and so on.

The library might have functions such as these:

typedef struct MyString MyString;
MyString *mystring_new(const char *c_str);
MyString *mystring_new_from_buffer(const void *p, size_t len);
void mystring_free(MyString *s);
size_t mystring_len(MyString *s);
int mystring_char_at(MyString *s, size_t offset);
MyString *mystring_cat(MyString *s1, ...); /* NULL terminated list */
MyString *mystring_copy_substring(MyString *s, size_t start, size_t max_chars);
MyString *mystring_find(MyString *s, MyString *pattern);
size_t mystring_find_char(MyString *s, int c);
void mystring_copy_out(void *output, MyString *s, size_t max_chars);
int mystring_write_to_fd(int fd, MyString *s);
int mystring_write_to_file(FILE *f, MyString *s);

I wrote one for the Kannel project, see the gwlib/octstr.h file. It made life much simpler for us. On the other hand, such a library is fairly simple to write, so you might write one for yourself, even if only as an exercise.
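To give a feel for how little code the core of such a library needs, here is a minimal sketch of four of the functions listed above. The struct layout and the choice to return -1 for an out-of-range offset are assumptions made for illustration; they are not taken from Kannel's octstr or from any existing library.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout: the length travels with the bytes, no terminator needed. */
struct MyString {
    size_t len;
    unsigned char data[];   /* C99 flexible array member */
};
typedef struct MyString MyString;

MyString *mystring_new_from_buffer(const void *p, size_t len)
{
    if (len > SIZE_MAX - sizeof(MyString)) {
        return NULL;                        /* size computation would overflow */
    }
    MyString *s = malloc(sizeof(MyString) + len);
    if (s != NULL) {
        s->len = len;
        if (len > 0) {
            memcpy(s->data, p, len);
        }
    }
    return s;
}

size_t mystring_len(MyString *s)
{
    return s->len;
}

/* Assumption: out-of-range offsets yield -1 instead of reading past the end. */
int mystring_char_at(MyString *s, size_t offset)
{
    return offset < s->len ? s->data[offset] : -1;
}

void mystring_free(MyString *s)
{
    free(s);
}
```

Because every access goes through the stored length, the bounds check lives in one place rather than at every call site.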

莳間冲淡了誓言ζ 2024-07-21 07:52:48

No one has mentioned strlcpy, developed by Todd C. Miller and Theo de Raadt. As they say in their paper:

The most common misconception is that strncpy() NUL-terminates the destination string. This is only true, however, if length of the source string is less than the size parameter. This can be problematic when copying user input that may be of arbitrary length into a fixed size buffer. The safest way to use strncpy() in this situation is to pass it one less than the size of the destination string, and then terminate the string by hand. That way you are guaranteed to always have a NUL-terminated destination string.

There are counter-arguments for the use of strlcpy; the Wikipedia page makes note that

Drepper argues that strlcpy and strlcat make truncation errors easier for a programmer to ignore and thus can introduce more bugs than they remove.

However, I believe that this just forces people who know what they're doing to add a manual NUL termination, in addition to a manual adjustment to the argument to strncpy. Use of strlcpy makes it much easier to avoid buffer overruns caused by failing to NUL-terminate your buffer.

Also note that the lack of strlcpy in glibc or Microsoft's libraries should not be a barrier to use; you can find the source for strlcpy and friends in any BSD distribution, and the license is likely friendly to your commercial/non-commercial project. See the comment at the top of strlcpy.c.
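For readers who have not used it, the behaviour at stake can be sketched in a few lines. This is not the BSD source; it is a simplified stand-in under a hypothetical name (sketch_strlcpy) that follows the documented contract: copy at most size - 1 characters, always terminate when size is nonzero, and return the length of the source so the caller can detect truncation.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for strlcpy (hypothetical name, not the BSD code). */
size_t sketch_strlcpy(char *dst, const char *src, size_t dst_size)
{
    size_t src_len = strlen(src);

    if (dst_size != 0) {
        /* copy as much as fits while leaving one byte for the terminator */
        size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;     /* a result >= dst_size means the copy was truncated */
}
```

Drepper's objection is about the last line: the truncation signal exists, but nothing forces the caller to look at it.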

萌面超妹 2024-07-21 07:52:48

I personally am of the mindset that if the code can be proven to be valid—and done so quickly—it is perfectly acceptable. That is, if the code is simple and thus obviously correct, then it is fine.

However, your assumption seems to be that while your function is executing, no other thread will modify the string pointed to by s1. What happens if this function is interrupted after the successful memory allocation (and thus after the call to strlen) and the string grows? Bam, you have a buffer overflow condition, since strcpy copies up to the NUL byte.

The following might be better:

char *
strdup(const char *s1) {
  int s1_len = strlen(s1);
  char *s2 = malloc(s1_len+1);
  if(s2 == NULL) {
    return NULL;
  }

  strncpy(s2, s1, s1_len);
  return s2;
}

Now, the string can grow through no fault of your own and you're safe. The result will not be a dup, but it won't be any crazy overflows, either.

The probability of the code you provided actually being a bug is pretty low (pretty close to non-existent, if not non-existent, if you are working in an environment that has no support for threading whatsoever). It's just something to think about.

ETA: Here is a slightly better implementation:

char *
strdup(const char *s1, int *retnum) {
  int s1_len = strlen(s1);
  char *s2 = malloc(s1_len+1);
  if(s2 == NULL) {
    return NULL;
  }

  strncpy(s2, s1, s1_len);
  *retnum = s1_len;  /* store the length through the out-parameter */
  return s2;
}

There the number of characters is being returned. You can also:

char *
strdup(const char *s1) {
  int s1_len = strlen(s1);
  char *s2 = malloc(s1_len+1);
  if(s2 == NULL) {
    return NULL;
  }

  strncpy(s2, s1, s1_len);
  s2[s1_len] = '\0';  /* last valid index of the s1_len+1 byte buffer */
  return s2;
}

Which will terminate it with a NUL byte. Either way is better than the one that I quickly put together originally.
