arrays c data-structures flexible-array-member

Is using flexible array members in C bad practice?

Posted on 2024-07-08 06:06:34

I recently read that using flexible array members in C was poor software engineering practice. However, that statement was not backed by any argument. Is this an accepted fact?

(Flexible array members are a C feature introduced in C99 whereby the last member of a struct can be declared as an array of unspecified size. For example:)

struct header {
    size_t len;
    unsigned char data[];
};

⒈起吃苦の倖褔 2024-07-15 06:06:34

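It is an accepted "fact" that using goto is poor software engineering practice. That doesn't make it true. There are times when goto is useful, particularly when handling cleanup and when porting from assembler.

Flexible array members strike me as having one main use: mapping legacy data formats, such as the window template formats on RiscOS. They would have been extremely useful for this about 15 years ago, and I'm sure there are still people out there dealing with such things who would find them useful.

If using flexible array members is bad practice, then I suggest that we all go and tell the authors of the C99 spec this. I suspect they might have a different answer.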

青春如此纠结 2024-07-15 06:06:34

No, using flexible array members in C is not bad practice.

This language feature was first standardized in ISO C99, 6.7.2.1 (16). In the following revision, ISO C11, it is specified in Section 6.7.2.1 (18).

You can use them like this:

struct Header {
    size_t n;   // number of elements in v
    long v[];   // flexible array member, must be the last member
};
typedef struct Header Header;
size_t n = 123; // can dynamically change during program execution
// ...
Header *h = malloc(sizeof(Header) + sizeof(long[n]));
h->n = n;

Alternatively, you can allocate like this:

Header *h = malloc(sizeof *h + n * sizeof h->v[0]);

Note that sizeof(Header) includes any padding bytes between the members, so the following allocation is incorrect and may lead to a buffer overflow:

Header *h = malloc(sizeof(size_t) + sizeof(long[n])); // invalid!

A struct with a flexible array member halves the number of allocations: instead of 2 allocations per struct object you need just 1. That means less work for the allocator and less memory spent on its bookkeeping overhead. Furthermore, you save the storage for one additional pointer. Thus, if you have to allocate a large number of such struct instances, you measurably improve the runtime and memory usage of your program (by a constant factor).
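As a minimal sketch of the single-allocation pattern described above (assuming a C99 compiler): the helper names header_new and header_sum are hypothetical and not part of the original answer, and the overflow guard is an addition for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Same shape as the Header above: a count followed by the payload. */
typedef struct Header {
    size_t n;
    long v[];   /* flexible array member (C99) */
} Header;

/* Hypothetical helper: allocate a Header with room for n longs in ONE malloc.
 * Returns NULL on size overflow or allocation failure. */
static Header *header_new(size_t n)
{
    /* Guard the size computation against size_t overflow. */
    if (n > (SIZE_MAX - sizeof(Header)) / sizeof(long))
        return NULL;

    Header *h = malloc(sizeof *h + n * sizeof h->v[0]);
    if (h == NULL)
        return NULL;

    h->n = n;
    for (size_t i = 0; i < n; i++)
        h->v[i] = 0;
    return h;
}

/* Hypothetical helper: the payload is addressed like an ordinary array. */
static long header_sum(const Header *h)
{
    assert(h != NULL);
    long total = 0;
    for (size_t i = 0; i < h->n; i++)
        total += h->v[i];
    return total;
}
```

A single free(h) releases the header and its payload together, which is the other half of the bookkeeping saving.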

In contrast, using non-standard constructs to emulate flexible array members, which yields undefined behavior (e.g. long v[0]; or long v[1];), obviously is bad practice. As with any undefined behavior, this should be avoided.

Since ISO C99 was released in 1999, more than 20 years ago, striving for ISO C89 compatibility is a weak argument.

指尖上得阳光 2024-07-15 06:06:34

PLEASE READ CAREFULLY THE COMMENTS BELOW THIS ANSWER

As C standardization moves forward, there is no reason to use [1] anymore.

The reason I would give for not doing it is that it's not worth it to tie your code to C99 just to use this feature.

The point is that you can always use the following idiom:

struct header {
  size_t len;
  unsigned char data[1];
};

That is fully portable. Then you can take the 1 into account when allocating the memory for n elements in the array data:

ptr = malloc(sizeof(struct header) + (n-1));

If you already have C99 as a requirement to build your code for any other reason, or you are targeting a specific compiler, I see no harm.

一直在等你来 2024-07-15 06:06:34

You meant...

struct header
{
    size_t len;
    unsigned char data[];
};

In C, that's a common idiom. I think many compilers also accept:

  unsigned char data[0];

Yes, it's dangerous, but then again, it's really no more dangerous than normal C arrays, i.e. VERY dangerous ;-). Use it with care and only in circumstances where you truly need an array of unknown size. Make sure you malloc and free the memory correctly, using something like:

  foo = malloc(sizeof(struct header) + N * sizeof(foo->data[0]));
  foo->len = N;

An alternative is to make data just be a pointer to the elements. You can then realloc() data to the correct size as required.

  struct header
  {
      size_t len;
      unsigned char *data;
  };
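A minimal sketch of how that pointer-based alternative might be resized with realloc(): the helper name header_resize and its return convention are assumptions made for illustration, not something from the original answer.

```c
#include <assert.h>
#include <stdlib.h>

struct header {
    size_t len;
    unsigned char *data;  /* separately allocated buffer */
};

/* Hypothetical helper: grow or shrink h->data to new_len bytes.
 * Returns 0 on success, -1 on failure (h is left unchanged). */
static int header_resize(struct header *h, size_t new_len)
{
    assert(h != NULL);

    if (new_len == 0) {           /* avoid realloc(ptr, 0), just release */
        free(h->data);
        h->data = NULL;
        h->len = 0;
        return 0;
    }

    /* Assign to a temporary so the old buffer is not leaked on failure. */
    unsigned char *p = realloc(h->data, new_len);
    if (p == NULL)
        return -1;

    h->data = p;
    h->len = new_len;
    return 0;
}
```

Compared with a flexible array member, this costs a second allocation and an extra pointer per object, but the struct itself never moves when the buffer grows.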

Of course, if you were asking about C++, either of these would be bad practice. There you would typically use std::vector instead.

万劫不复 2024-07-15 06:06:34

I've seen something like this, from C Interfaces and Implementations:

  struct header {
      size_t len;
      unsigned char *data;
  };

  struct header *p;
  p = malloc(sizeof(*p) + len + 1);
  p->data = (unsigned char *)(p + 1);  // memory after p is mine!

Note: data need not be the last member.
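A minimal, self-contained sketch of that trick, assuming the payload is a byte string: the helper name header_from_string is hypothetical and only serves to show the pointer being aimed at the memory that follows the struct.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct header {
    size_t len;
    unsigned char *data;  /* points into the same allocation */
};

/* Hypothetical helper: one malloc holds the struct followed by a copy of s. */
static struct header *header_from_string(const char *s)
{
    assert(s != NULL);

    size_t len = strlen(s);
    struct header *p = malloc(sizeof *p + len + 1);
    if (p == NULL)
        return NULL;

    p->len = len;
    p->data = (unsigned char *)(p + 1);  /* first byte after the struct */
    memcpy(p->data, s, len + 1);         /* include the terminating '\0' */
    return p;
}
```

A single free(p) releases everything. Two caveats apply to this sketch: a byte-wise copy of the block leaves data pointing into the old allocation, and for element types with stricter alignment than the struct, the address p + 1 may not be suitably aligned.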

笔落惊风雨 2024-07-15 06:06:34

As a side note, for C89 compatibility, such a structure should be allocated like this:

struct header *my_header
  = malloc(offsetof(struct header, data) + n * sizeof my_header->data[0]);

Or with macros:

#define FLEXIBLE_SIZE SIZE_MAX /* or whatever maximum length for an array */
#define SIZEOF_FLEXIBLE(type, member, length) \
  ( offsetof(type, member) + (length) * sizeof ((type *)0)->member[0] )

struct header {
  size_t len;
  unsigned char data[FLEXIBLE_SIZE];
};

...

size_t n = 123;
struct header *my_header = malloc(SIZEOF_FLEXIBLE(struct header, data, n));

Setting FLEXIBLE_SIZE to SIZE_MAX almost ensures this will fail:

struct header *my_header = malloc(sizeof *my_header);

若能看破又如何 2024-07-15 06:06:34

There are some downsides related to how structs are sometimes used, and it can be dangerous if you don't think through the implications.

For your example, if you start a function:

void test(void) {
  struct header h;                /* automatic storage: no room for data */
  unsigned char *p = &h.data[0];  /* points just past the end of h */

  ...
}

Then accessing anything through p is undefined (since no storage was ever allocated for data). This is something that you will normally be aware of, but C programmers are used to being able to use value semantics for structs, and that breaks down in various other ways.

For instance, if I define:

struct header2 {
  int len;
  char data[MAXLEN]; /* MAXLEN some appropriately large number */
};

Then I can copy two instances simply by assignment, i.e.:

struct header2 inst1 = inst2;

Or, if they are accessed through pointers:

*inst1 = *inst2;

This, however, won't work for flexible array members, since their content is not copied over. What you want instead is to malloc the full size of the struct and copy the array over with memcpy or equivalent.

struct header3 {
  int len;
  char data[]; /* flexible array member */
};
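A minimal sketch of the malloc-plus-memcpy copy described above, assuming that len holds the number of valid bytes in data: the helper name header3_clone is hypothetical and not part of the original answer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct header3 {
    int len;
    char data[];  /* flexible array member */
};

/* Hypothetical helper: deep-copy a header3, including its trailing data.
 * Returns NULL if the allocation fails. */
static struct header3 *header3_clone(const struct header3 *src)
{
    assert(src != NULL && src->len >= 0);

    /* Fixed part plus len elements of the flexible array. */
    size_t total = sizeof *src + (size_t)src->len * sizeof src->data[0];

    struct header3 *dst = malloc(total);
    if (dst != NULL)
        memcpy(dst, src, total);
    return dst;
}
```

A plain assignment such as *dst = *src would copy len but not the bytes stored in data, which is why the copy has to go through the computed total size.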

Likewise, writing a function that accepts a struct header3 by value will not work, since arguments in function calls are, again, copied by value, and thus what you get is at most the fixed part of the struct, not the contents of the flexible array member.

 void not_good(struct header3);

This does not make it a bad idea to use, but you do have to keep in mind to always dynamically allocate these structures and only pass them around as pointers.

 void good(struct header3 *);