Why aren't variable-length arrays part of the C++ standard?

Posted 2025-01-29 05:02:33 · 1211 words · 3 views · 0 comments

I haven't used C very much in the last few years. When I read this question today I came across some C syntax which I wasn't familiar with.

Apparently in C99 the following syntax is valid:

void foo(int n) {
    int values[n]; //Declare a variable length array
}

This seems like a pretty useful feature. Was there ever a discussion about adding it to the C++ standard, and if so, why was it omitted?

Some potential reasons:

  • Hairy for compiler vendors to implement
  • Incompatible with some other part of the standard
  • Functionality can be emulated with other C++ constructs

The C++ standard states that array size must be a constant expression (8.3.4.1).

Yes, of course I realize that in the toy example one could use std::vector<int> values(n);, but this allocates memory from the heap and not the stack. And if I want a multidimensional array like:

void foo(int x, int y, int z) {
    int values[x][y][z]; // Declare a variable length array
}

the vector version becomes pretty clumsy:

void foo(int x, int y, int z) {
    vector< vector< vector<int> > > values( /* Really painful expression here. */);
}

The slices, rows and columns will also potentially be spread all over memory.

Looking at the discussion at comp.std.c++ it's clear that this question is pretty controversial with some very heavyweight names on both sides of the argument. It's certainly not obvious that a std::vector is always a better solution.
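
On the concern that nested vectors scatter rows and slices across memory, one common workaround is a single contiguous buffer with manual index arithmetic. The following is only a minimal sketch: the class name Grid3D and its members are hypothetical and not part of the question, and the storage still comes from the heap rather than the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the question): a runtime-sized 3D array
// stored in one contiguous heap block instead of nested vectors.
class Grid3D {
public:
    Grid3D(std::size_t x, std::size_t y, std::size_t z)
        : y_(y), z_(z), data_(x * y * z) {}  // elements start at zero

    // Row-major indexing: the last index varies fastest,
    // the same layout as int values[x][y][z].
    int& at(std::size_t i, std::size_t j, std::size_t k) {
        assert(j < y_ && k < z_ && (i * y_ + j) * z_ + k < data_.size());
        return data_[(i * y_ + j) * z_ + k];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::size_t y_, z_;
    std::vector<int> data_;
};
```

With this layout, Grid3D values(x, y, z); followed by values.at(i, j, k) = 0; plays the role of values[i][j][k] = 0; and neighbouring k indices are adjacent in memory.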

Comments (10)

傲性难收 2025-02-05 05:02:33

(Background: I have some experience implementing C and C++ compilers.)

Variable-length arrays in C99 were basically a misstep. In order to support VLAs, C99 had to make the following concessions to common sense:

  • sizeof x is no longer always a compile-time constant; the compiler must sometimes generate code to evaluate a sizeof-expression at runtime.

  • Allowing two-dimensional VLAs (int A[x][y]) required a new syntax for declaring functions that take 2D VLAs as parameters: void foo(int n, int A[][*]).

  • Less importantly in the C++ world, but extremely important for C's target audience of embedded-systems programmers, declaring a VLA means chomping an arbitrarily large chunk of your stack. This is a guaranteed stack-overflow and crash. (Anytime you declare int A[n], you're implicitly asserting that you have 2GB of stack to spare. After all, if you know "n is definitely less than 1000 here", then you would just declare int A[1000]. Substituting the 32-bit integer n for 1000 is an admission that you have no idea what the behavior of your program ought to be.)

Okay, so let's move to talking about C++ now. In C++, we have the same strong distinction between "type system" and "value system" that C89 does… but we've really started to rely on it in ways that C has not. For example:

template<typename T> struct S { ... };
int A[n];
S<decltype(A)> s;  // equivalently, S<int[n]> s;

If n weren't a compile-time constant (i.e., if A were of variably modified type), then what on earth would be the type of s? Would s's type also be determined only at runtime?

What about this:

template<typename T> bool myfunc(T& t1, T& t2) { ... };
int A1[n1], A2[n2];
myfunc(A1, A2);

The compiler must generate code for some instantiation of myfunc. What should that code look like? How can we statically generate that code, if we don't know the type of A1 at compile time?

Worse, what if it turns out at runtime that n1 != n2, so that !std::is_same<decltype(A1), decltype(A2)>()? In that case, the call to myfunc shouldn't even compile, because template type deduction should fail! How could we possibly emulate that behavior at runtime?
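
To make the type-system point concrete, here is a small sketch of how standard C++ treats array bounds today. The helper names array_length and same_type_only are hypothetical; same_type_only simply mirrors the shape of myfunc above.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// The array bound is part of the type, so these are two distinct types.
static_assert(!std::is_same<int[3], int[4]>::value, "bounds differ, so types differ");

// sizeof applied to a fixed-size array type is a compile-time constant.
static_assert(sizeof(int[4]) == 4 * sizeof(int), "size is known at compile time");

// Hypothetical helper: the compiler deduces N from the argument's type.
template <typename T, std::size_t N>
constexpr std::size_t array_length(T (&)[N]) { return N; }

// Same shape as myfunc in the answer: both arguments must deduce the same T.
template <typename T>
bool same_type_only(T&, T&) { return true; }
```

Given int a1[3] and int a2[3], the call same_type_only(a1, a2) compiles with T deduced as int[3]. If a2 were declared as int a2[4], deduction would fail at compile time with conflicting types for T, which is exactly the check that a runtime bound could not support.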

Basically, C++ is moving in the direction of pushing more and more decisions into compile-time: template code generation, constexpr function evaluation, and so on. Meanwhile, C99 was busy pushing traditionally compile-time decisions (e.g. sizeof) into the runtime. With this in mind, does it really even make sense to expend any effort trying to integrate C99-style VLAs into C++?

As every other answerer has already pointed out, C++ provides lots of heap-allocation mechanisms (std::unique_ptr<int[]> A(new int[n]); or std::vector<int> A(n); being the obvious ones) when you really want to convey the idea "I have no idea how much RAM I might need." And C++ provides a nifty exception-handling model for dealing with the inevitable situation that the amount of RAM you need is greater than the amount of RAM you have. But hopefully this answer gives you a good idea of why C99-style VLAs were not a good fit for C++, and not really even a good fit for C99. ;)
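
As a rough illustration of the exception-handling point, the sketch below requests a runtime-sized heap buffer and reports failure instead of crashing. The function name try_make_buffer is hypothetical. Whether an oversized request surfaces as std::bad_alloc or std::length_error depends on the standard library, so the sketch catches both.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if a heap buffer of n ints could be
// created, false if the request was refused.
bool try_make_buffer(std::size_t n) {
    try {
        std::vector<int> values(n);   // heap storage, released automatically
        assert(values.size() == n);
        return true;
    } catch (const std::bad_alloc&) {
        return false;                 // the allocator could not supply the memory
    } catch (const std::length_error&) {
        return false;                 // n exceeds what a vector can ever hold
    }
}
```

A VLA offers no comparable hook: if int a[n] does not fit on the stack, the program has no standard way to detect or recover from that.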


For more on the topic, see N3810 "Alternatives for Array Extensions", Bjarne Stroustrup's October 2013 paper on VLAs. Bjarne's POV is very different from mine; N3810 focuses more on finding a good C++ish syntax for the things, and on discouraging the use of raw arrays in C++, whereas I focused more on the implications for metaprogramming and the type system. I don't know if he considers the metaprogramming/type-system implications solved, solvable, or merely uninteresting.


A good blog post that hits many of these same points is "Legitimate Use of Variable Length Arrays" (Chris Wellons, 2019-10-27).

执笏见 2025-02-05 05:02:33

There was recently a discussion about this on Usenet: Why no VLAs in C++0x.

I agree with those who say that having to create a potentially large array on the stack, which usually has only little space available, isn't good. The argument is: if you know the size beforehand, you can use a static array, and if you don't know the size beforehand, you will write unsafe code.

C99 VLAs could provide a small benefit of being able to create small arrays without wasting space or calling constructors for unused elements, but they will introduce rather large changes to the type system (you need to be able to specify types depending on runtime values - this does not yet exist in current C++, except for new operator type-specifiers, but they are treated specially, so that the runtime-ness doesn't escape the scope of the new operator).

You can use std::vector, but it is not quite the same, as it uses dynamic memory, and making it use one's own stack-allocator isn't exactly easy (alignment is an issue, too). It also doesn't solve the same problem, because a vector is a resizable container, whereas VLAs are fixed-size. The C++ Dynamic Array proposal is intended to introduce a library based solution, as alternative to a language based VLA. However, it's not going to be part of C++0x, as far as I know.
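
Since this answer was written, C++17 added polymorphic memory resources, which make a stack-backed vector considerably easier than writing a custom allocator. The sketch below assumes a C++17 compiler with the memory_resource header available. The function name sum_first_n and the 1024-byte buffer size are arbitrary choices for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <numeric>
#include <vector>

// Hypothetical example: sum 1..n using a vector whose storage comes from a
// fixed buffer on the stack whenever the request fits into it.
int sum_first_n(int n) {
    assert(n >= 0);
    alignas(std::max_align_t) std::byte stack_buffer[1024];

    // Serves allocations from stack_buffer first; once that is exhausted it
    // falls back to the default (heap) resource.
    std::pmr::monotonic_buffer_resource pool(stack_buffer, sizeof stack_buffer);

    std::pmr::vector<int> values(static_cast<std::size_t>(n), &pool);
    std::iota(values.begin(), values.end(), 1);
    return std::accumulate(values.begin(), values.end(), 0);
}
```

Unlike a VLA, the amount of stack used is bounded by the fixed buffer, and oversized requests degrade to a heap allocation instead of overflowing the stack.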

蓦然回首 2025-02-05 05:02:33

You could always use alloca() to allocate memory on the stack at runtime, if you wished:

void foo (int n)
{
    int *values = (int *)alloca(sizeof(int) * n);
}

Being allocated on the stack implies that it will automatically be freed when the stack unwinds.

Quick note: As mentioned in the Mac OS X man page for alloca(3), "The alloca() function is machine and compiler dependent; its use is discouraged." Just so you know.

自由如风 2025-02-05 05:02:33

In my own work, I've realized that every time I've wanted something like variable-length automatic arrays or alloca(), I didn't really care that the memory was physically located on the CPU stack, just that it came from some stack allocator that didn't incur slow trips to the general heap. So I have a per-thread object that owns some memory from which it can push/pop variable-sized buffers. On some platforms I allow this to grow via the MMU. Other platforms have a fixed size (usually accompanied by a fixed-size CPU stack as well, because there is no MMU). One platform I work with (a handheld game console) has precious little CPU stack anyway because it resides in scarce, fast memory.

I'm not saying that pushing variable-sized buffers onto the CPU stack is never needed. Honestly, I was surprised back when I discovered this wasn't standard, as it certainly seems like the concept fits into the language well enough. For me though, the requirements "variable size" and "must be physically located on the CPU stack" have never come up together. It's been about speed, so I made my own sort of "parallel stack for data buffers".
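
The answer does not show its implementation, so the following is only a guess at what such a "parallel stack for data buffers" might look like in its simplest fixed-size form. The names ScratchStack and thread_scratch and the 64 KiB capacity are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity scratch stack handing out byte buffers in
// strict last-in, first-out order.
class ScratchStack {
public:
    explicit ScratchStack(std::size_t capacity) : storage_(capacity) {}

    // Reserve `size` bytes at the top; returns nullptr when space runs out.
    unsigned char* push(std::size_t size) {
        if (size > storage_.size() - top_) return nullptr;
        unsigned char* buffer = storage_.data() + top_;
        top_ += size;
        return buffer;
    }

    // Release the most recently pushed `size` bytes.
    void pop(std::size_t size) {
        assert(size <= top_);
        top_ -= size;
    }

    std::size_t used() const { return top_; }

private:
    std::vector<unsigned char> storage_;  // allocated once, reused afterwards
    std::size_t top_ = 0;
};

// One scratch stack per thread, created on first use.
inline ScratchStack& thread_scratch() {
    thread_local ScratchStack stack(64 * 1024);
    return stack;
}
```

Pushing and popping cost only a bounds check and an addition, which is the speed property the answer is after, while exhaustion is reported through a null pointer rather than a stack overflow.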

许久 2025-02-05 05:02:33

Seems it will be available in C++14:

https://en.wikipedia.org/wiki/C%2B%2B14#Runtime-sized_one_dimensional_arrays

Update: It did not make it into C++14.

土豪 2025-02-05 05:02:33

There are situations where allocating heap memory is very expensive compared to the operations performed. An example is matrix math. If you work with smallish matrices, say 5 to 10 elements, and do a lot of arithmetic, the malloc overhead will be really significant. At the same time, making the size a compile-time constant does seem very wasteful and inflexible.

I think that C++ is so unsafe in itself that the argument to "try to not add more unsafe features" is not very strong. On the other hand, as C++ is arguably the most runtime-efficient programming language, features that make it more so are always useful: people who write performance-critical programs will to a large extent use C++, and they need as much performance as possible. Moving stuff from heap to stack is one such possibility. Reducing the number of heap blocks is another. Allowing VLAs as object members would be one way to achieve this. I'm working on such a suggestion. It is a bit complicated to implement, admittedly, but it seems quite doable.

不气馁 2025-02-05 05:02:33

This was considered for inclusion in C++/1x, but was dropped (this is a correction to what I said earlier).

It would be less useful in C++ anyway since we already have std::vector to fill this role.

半山落雨半山空 2025-02-05 05:02:33

VLAs are a part of a larger family of Variably Modified types.
This family of types is very special because they have runtime components.

The code:

int A[n];

is seen by the compiler as:

typedef int T[n];
T A;

Note that the runtime size of the array is bound not to the variable A but to the type of the variable.

Nothing prevents one from making new variables of this type:

T B,C,D;

or pointers to it, or arrays of it:

T *p, Z[10];

Moreover, pointers allow one to create VLAs with dynamic storage.

T *p = malloc(sizeof(T));
...
free(p);

This dispels the popular myth that VLAs can only be allocated on the stack.

Back to the question.

This runtime component does not work well with type deduction, which is one of the foundations of the C++ type system. It would not be possible to use VM types with templates, deduction, and overloading.

The C++ type system is static: all types must be fully defined or deduced during compilation, whereas VM types are completed only during program execution. The additional complexity of introducing VM types to the already hellishly complex C++ was simply considered unjustified, mainly because their main practical application is automatic VLAs (int A[n];), which have an alternative in the form of std::vector.

It is a bit sad, because VM types provide very elegant and efficient solutions for programs handling multidimensional arrays.

In C one can simply write:

void foo(int n, int A[n][n][n]) {
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < n; ++j)
      for (int k = 0; k < n; ++k)
        A[i][j][k] = i * j * k;
}

...

int A[5][5][5], B[10][10][10];
foo(5, A);
foo(10, B);

Now try to provide an equally efficient and elegant solution in C++.
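
For the specific calls shown, where both arrays have bounds known at compile time, one possible C++ counterpart deduces the bound as a template parameter. This is only a partial answer to the challenge: the name fill_products is hypothetical, each distinct bound produces its own instantiation, and bounds that are only known at runtime would still need a flat buffer with manual indexing.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counterpart of foo: N is deduced from the argument's type,
// so every distinct bound gets its own instantiation at compile time.
template <std::size_t N>
void fill_products(int (&A)[N][N][N]) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < N; ++k)
                A[i][j][k] = static_cast<int>(i * j * k);
}
```

With int A[5][5][5], B[10][10][10]; the calls become fill_products(A); and fill_products(B); with no separate size argument to keep in sync.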

浅笑轻吟梦一曲 2025-02-05 05:02:33

Arrays like this are part of C99, but not part of standard C++. As others have said, a vector is always a much better solution, which is probably why variable-sized arrays are not in the C++ standard (or in the proposed C++0x standard).

BTW, for questions on "why" the C++ standard is the way it is, the moderated Usenet newsgroup comp.std.c++ is the place to go to.

淡淡離愁欲言轉身 2025-02-05 05:02:33

Use std::vector for this. For example:

std::vector<int> values;
values.resize(n);

The memory will be allocated on the heap, but this carries only a small performance drawback. Furthermore, it is wise not to allocate large data blocks on the stack, as it is rather limited in size.
