How do I use arrays in C++?

Posted on 2024-10-14 14:50:48

C++ inherited arrays from C where they are used virtually everywhere. C++ provides abstractions that are easier to use and less error-prone (std::vector<T> since C++98 and std::array<T, n> since C++11), so the need for arrays does not arise quite as often as it does in C. However, when you read legacy code or interact with a library written in C, you should have a firm grasp on how arrays work.

This FAQ is split into five parts:

  1. arrays on the type level and accessing elements
  2. array creation and initialization
  3. assignment and parameter passing
  4. multidimensional arrays and arrays of pointers
  5. common pitfalls when using arrays

If you feel something important is missing in this FAQ, write an answer and link it here as an additional part.

In the following text, "array" means "C array", not the class template std::array. Basic knowledge of the C declarator syntax is assumed. Note that the manual usage of new and delete as demonstrated below is extremely dangerous in the face of exceptions, but that is the topic of another FAQ.


(Note: This is meant to be an entry to Stack Overflow's C++ FAQ. If you want to critique the idea of providing an FAQ in this form, then the posting on meta that started all this would be the place to do that. Answers to that question are monitored in the C++ chatroom, where the FAQ idea started out in the first place, so your answer is very likely to get read by those who came up with the idea.)


愿得七秒忆 2024-10-21 14:50:48

Arrays on the type level

An array type is denoted as T[n] where T is the element type and n is a positive size, the number of elements in the array. The array type is a product type of the element type and the size. If one or both of those ingredients differ, you get a distinct type:

#include <type_traits>

static_assert(!std::is_same<int[8], float[8]>::value, "distinct element type");
static_assert(!std::is_same<int[8],   int[9]>::value, "distinct size");

Note that the size is part of the type, that is, array types of different size are incompatible types that have absolutely nothing to do with each other. sizeof(T[n]) is equivalent to n * sizeof(T).

Array-to-pointer decay

The only "connection" between T[n] and T[m] is that both types can implicitly be converted to T*, and the result of this conversion is a pointer to the first element of the array. That is, anywhere a T* is required, you can provide a T[n], and the compiler will silently provide that pointer:

                  +---+---+---+---+---+---+---+---+
the_actual_array: |   |   |   |   |   |   |   |   |   int[8]
                  +---+---+---+---+---+---+---+---+
                    ^
                    |
                    |
                    |
                    |  pointer_to_the_first_element   int*

This conversion is known as "array-to-pointer decay", and it is a major source of confusion. The size of the array is lost in this process, since it is no longer part of the type (T*). Pro: Forgetting the size of an array on the type level allows a pointer to point to the first element of an array of any size. Con: Given a pointer to the first (or any other) element of an array, there is no way to detect how large that array is or where exactly the pointer points to relative to the bounds of the array. Pointers are extremely stupid.
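The loss of size information can be made visible with sizeof. The following is a minimal sketch; the helper names size_seen_through_pointer and check_decay_loses_size are hypothetical and only serve the illustration:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the parameter is a pointer, so the callee only
// ever sees the size of that pointer, whatever array it came from.
std::size_t size_seen_through_pointer(const int* p)
{
    return sizeof(p);
}

// Hypothetical self-check contrasting an array with its decayed pointer.
void check_decay_loses_size()
{
    int the_actual_array[8] = {};
    int other_array[3] = {};

    // While the type is still int[8], the size is part of the type.
    assert(sizeof(the_actual_array) == 8 * sizeof(int));

    // After decay, arrays of different sizes are indistinguishable.
    assert(size_seen_through_pointer(the_actual_array)
           == size_seen_through_pointer(other_array));
    assert(size_seen_through_pointer(the_actual_array) == sizeof(const int*));
}
```

Both calls compile because int[8] and int[3] each convert implicitly to const int*; nothing in the callee can tell them apart.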

Arrays are not pointers

The compiler will silently generate a pointer to the first element of an array whenever it is deemed useful, that is, whenever an operation would fail on an array but succeed on a pointer. This conversion from array to pointer is trivial, since the resulting pointer value is simply the address of the array. Note that the pointer is not stored as part of the array itself (or anywhere else in memory). An array is not a pointer.

static_assert(!std::is_same<int[8], int*>::value, "an array is not a pointer");

One important context in which an array does not decay into a pointer to its first element is when the & operator is applied to it. In that case, the & operator yields a pointer to the entire array, not just a pointer to its first element. Although in that case the values (the addresses) are the same, a pointer to the first element of an array and a pointer to the entire array are completely distinct types:

static_assert(!std::is_same<int*, int(*)[8]>::value, "distinct element type");

The following ASCII art explains this distinction:

      +-----------------------------------+
      | +---+---+---+---+---+---+---+---+ |
+---> | |   |   |   |   |   |   |   |   | | int[8]
|     | +---+---+---+---+---+---+---+---+ |
|     +---^-------------------------------+
|         |
|         |
|         |
|         |  pointer_to_the_first_element   int*
|
|  pointer_to_the_entire_array              int(*)[8]

Note how the pointer to the first element only points to a single integer (depicted as a small box), whereas the pointer to the entire array points to an array of 8 integers (depicted as a large box).
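The difference between the two pointer types can be checked directly. This is a small sketch under the assumption of a local int[8] named x; the function name check_pointer_to_array is hypothetical:

```cpp
#include <cassert>

// Hypothetical self-check: same address, different pointer types.
void check_pointer_to_array()
{
    int x[8] = {};

    int* pointer_to_the_first_element = x;        // decay: int[8] -> int*
    int (*pointer_to_the_entire_array)[8] = &x;   // no decay: &x is int(*)[8]

    // Both pointers hold the same address ...
    assert(static_cast<void*>(pointer_to_the_first_element)
           == static_cast<void*>(pointer_to_the_entire_array));

    // ... but they point to objects of different size (small box vs. large box).
    assert(sizeof(*pointer_to_the_first_element) == sizeof(int));
    assert(sizeof(*pointer_to_the_entire_array) == 8 * sizeof(int));

    // Dereferencing the pointer to the array yields the array again.
    assert((*pointer_to_the_entire_array)[0] == x[0]);
}
```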

The same situation arises in classes and is maybe more obvious. A pointer to an object and a pointer to its first data member have the same value (the same address), yet they are completely distinct types.

If you are unfamiliar with the C declarator syntax, the parentheses in the type int(*)[8] are essential:

  • int(*)[8] is a pointer to an array of 8 integers.
  • int*[8] is an array of 8 pointers, each element of type int*.

Accessing elements

C++ provides two syntactic variations to access individual elements of an array.
Neither of them is superior to the other, and you should familiarize yourself with both.

Pointer arithmetic

Given a pointer p to the first element of an array, the expression p+i yields a pointer to the i-th element of the array. By dereferencing that pointer afterwards, one can access individual elements:

std::cout << *(x+3) << ", " << *(x+7) << std::endl;

If x denotes an array, then array-to-pointer decay will kick in, because adding an array and an integer is meaningless (there is no plus operation on arrays), but adding a pointer and an integer makes sense:

   +---+---+---+---+---+---+---+---+
x: |   |   |   |   |   |   |   |   |   int[8]
   +---+---+---+---+---+---+---+---+
     ^           ^               ^
     |           |               |
     |           |               |
     |           |               |
x+0  |      x+3  |          x+7  |     int*

(Note that the implicitly generated pointer has no name, so I wrote x+0 in order to identify it.)

If, on the other hand, x denotes a pointer to the first (or any other) element of an array, then array-to-pointer decay is not necessary, because the pointer on which i is going to be added already exists:

   +---+---+---+---+---+---+---+---+
   |   |   |   |   |   |   |   |   |   int[8]
   +---+---+---+---+---+---+---+---+
     ^           ^               ^
     |           |               |
     |           |               |
   +-|-+         |               |
x: | | |    x+3  |          x+7  |     int*
   +---+

Note that in the depicted case, x is a pointer variable (discernible by the small box next to x), but it could just as well be the result of a function returning a pointer (or any other expression of type T*).

Indexing operator

Since the syntax *(x+i) is a bit clumsy, C++ provides the alternative syntax x[i]:

std::cout << x[3] << ", " << x[7] << std::endl;

Due to the fact that addition is commutative, the following code does exactly the same:

std::cout << 3[x] << ", " << 7[x] << std::endl;

The definition of the indexing operator leads to the following interesting equivalence:

&x[i]  ==  &*(x+i)  ==  x+i

However, &x[0] is generally not equivalent to x. The former is a pointer, the latter an array. Only when the context triggers array-to-pointer decay can x and &x[0] be used interchangeably. For example:

T* p = &array[0];  // rewritten as &*(array+0), decay happens due to the addition
T* q = array;      // decay happens due to the assignment

On the first line, the compiler detects an assignment from a pointer to a pointer, which trivially succeeds. On the second line, it detects an assignment from an array to a pointer. Since this is meaningless (but pointer to pointer assignment makes sense), array-to-pointer decay kicks in as usual.
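The equivalences in this section can be verified mechanically. The following sketch assumes an int[8] filled with arbitrary values; the function name indexing_forms_agree is hypothetical:

```cpp
#include <cassert>

// Hypothetical self-check: returns true if all spellings of element
// access and element address agree for every valid index.
bool indexing_forms_agree()
{
    int x[8] = {2, 3, 5, 7, 11, 13, 17, 19};
    for (int i = 0; i < 8; ++i)
    {
        if (x[i] != *(x + i)) return false;   // x[i] is defined as *(x+i)
        if (x[i] != i[x])     return false;   // addition is commutative
        if (&x[i] != x + i)   return false;   // &*(x+i) collapses to x+i
    }

    int* p = &x[0];   // already a pointer
    int* q = x;       // decays to a pointer
    return p == q;
}
```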

Ranges

An array of type T[n] has n elements, indexed from 0 to n-1; there is no element n. And yet, to support half-open ranges (where the beginning is inclusive and the end is exclusive), C++ allows the computation of a pointer to the (non-existent) n-th element, but it is illegal to dereference that pointer:

   +---+---+---+---+---+---+---+---+....
x: |   |   |   |   |   |   |   |   |   .   int[8]
   +---+---+---+---+---+---+---+---+....
     ^                               ^
     |                               |
     |                               |
     |                               |
x+0  |                          x+8  |     int*

For example, if you want to sort an array, both of the following would work equally well:

std::sort(x + 0, x + n);
std::sort(&x[0], &x[0] + n);

Note that it is illegal to provide &x[n] as the second argument since this is equivalent to &*(x+n), and the sub-expression *(x+n) technically invokes undefined behavior in C++ (but not in C99).

Also note that you could simply provide x as the first argument. That is a little too terse for my taste, and it also makes template argument deduction a bit harder for the compiler, because in that case the first argument is an array but the second argument is a pointer. (Again, array-to-pointer decay kicks in.)
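As a complete illustration of the half-open range, here is a minimal sketch. The names sort_ints and check_sort_ints are hypothetical, and std::is_sorted requires C++11:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Sorts n ints in place using the half-open range [x, x + n).
void sort_ints(int* x, std::size_t n)
{
    // x + n points one past the last element; it is computed and compared
    // by std::sort, but never dereferenced.
    std::sort(x + 0, x + n);
}

// Hypothetical self-check for sort_ints.
void check_sort_ints()
{
    int x[8] = {19, 2, 17, 3, 13, 5, 11, 7};
    sort_ints(x, 8);
    assert(std::is_sorted(x + 0, x + 8));
    assert(x[0] == 2 && x[7] == 19);
}
```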

烟花肆意 2024-10-21 14:50:48

Programmers often confuse multidimensional arrays with arrays of pointers.

Multidimensional arrays

Most programmers are familiar with named multidimensional arrays, but many are unaware of the fact that multidimensional arrays can also be created anonymously. Multidimensional arrays are often referred to as "arrays of arrays" or "true multidimensional arrays".

Named multidimensional arrays

When using named multidimensional arrays, all dimensions must be known at compile time:

int H = read_int();
int W = read_int();

int connect_four[6][7];   // okay

int connect_four[H][7];   // ISO C++ forbids variable length array
int connect_four[6][W];   // ISO C++ forbids variable length array
int connect_four[H][W];   // ISO C++ forbids variable length array

This is what a named multidimensional array looks like in memory:

              +---+---+---+---+---+---+---+
connect_four: |   |   |   |   |   |   |   |
              +---+---+---+---+---+---+---+
              |   |   |   |   |   |   |   |
              +---+---+---+---+---+---+---+
              |   |   |   |   |   |   |   |
              +---+---+---+---+---+---+---+
              |   |   |   |   |   |   |   |
              +---+---+---+---+---+---+---+
              |   |   |   |   |   |   |   |
              +---+---+---+---+---+---+---+
              |   |   |   |   |   |   |   |
              +---+---+---+---+---+---+---+

Note that 2D grids such as the above are merely helpful visualizations. From the point of view of C++, memory is a "flat" sequence of bytes. The elements of a multidimensional array are stored in row-major order. That is, connect_four[0][6] and connect_four[1][0] are neighbors in memory. In fact, connect_four[0][7] and connect_four[1][0] denote the same element! This means that you can take multi-dimensional arrays and treat them as large, one-dimensional arrays:

int* p = &connect_four[0][0];
int* q = p + 42;
some_int_sequence_algorithm(p, q);
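The row-major layout can be observed by measuring byte offsets. The sketch below assumes a static 6x7 array; the function names byte_offset and check_row_major are hypothetical. It only inspects addresses and does not index past the end of a row, which sidesteps the question of whether such indexing is strictly conforming:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: byte offset of element [row][col] from the start
// of a 6x7 int array.
std::size_t byte_offset(int row, int col)
{
    static int connect_four[6][7];
    const char* base = reinterpret_cast<const char*>(&connect_four[0][0]);
    const char* elem = reinterpret_cast<const char*>(&connect_four[row][col]);
    return static_cast<std::size_t>(elem - base);
}

// Hypothetical self-check of the row-major layout.
void check_row_major()
{
    // Element [row][col] lives at flat index row * 7 + col.
    assert(byte_offset(1, 0) == 7 * sizeof(int));
    // The last element of row 0 and the first element of row 1 are neighbors.
    assert(byte_offset(0, 6) + sizeof(int) == byte_offset(1, 0));
    // The last element of the grid is flat index 41 out of 42.
    assert(byte_offset(5, 6) == 41 * sizeof(int));
}
```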

Anonymous multidimensional arrays

With anonymous multidimensional arrays, all dimensions except the first must be known at compile time:

int (*p)[7] = new int[6][7];   // okay
int (*p)[7] = new int[H][7];   // okay

int (*p)[W] = new int[6][W];   // ISO C++ forbids variable length array
int (*p)[W] = new int[H][W];   // ISO C++ forbids variable length array

This is what an anonymous multidimensional array looks like in memory:

              +---+---+---+---+---+---+---+
        +---> |   |   |   |   |   |   |   |
        |     +---+---+---+---+---+---+---+
        |     |   |   |   |   |   |   |   |
        |     +---+---+---+---+---+---+---+
        |     |   |   |   |   |   |   |   |
        |     +---+---+---+---+---+---+---+
        |     |   |   |   |   |   |   |   |
        |     +---+---+---+---+---+---+---+
        |     |   |   |   |   |   |   |   |
        |     +---+---+---+---+---+---+---+
        |     |   |   |   |   |   |   |   |
        |     +---+---+---+---+---+---+---+
        |
      +-|-+
   p: | | |
      +---+

Note that the array itself is still allocated as a single block in memory.
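A sketch of the full life cycle of such an array, with the height chosen at run time, might look as follows. The function name sum_of_grid and the fill pattern are assumptions made for the example:

```cpp
#include <cassert>

// Hypothetical helper: allocates an h x 7 grid as one block, fills it
// with 0, 1, 2, ... and returns the sum of all elements.
int sum_of_grid(int h)
{
    int (*p)[7] = new int[h][7];   // only the first dimension may vary at run time

    for (int row = 0; row < h; ++row)
        for (int col = 0; col < 7; ++col)
            p[row][col] = row * 7 + col;

    int total = 0;
    for (int row = 0; row < h; ++row)
        for (int col = 0; col < 7; ++col)
            total += p[row][col];

    delete[] p;   // one delete[] releases the single block
    return total;
}
```

For h == 6 the grid holds the values 0 to 41, so the function returns 861. Like the other examples in this FAQ, the manual new and delete[] are not exception-safe.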

Arrays of pointers

You can overcome the restriction of fixed width by introducing another level of indirection.

Named arrays of pointers

Here is a named array of five pointers which are initialized with anonymous arrays of different lengths:

int* triangle[5];
for (int i = 0; i < 5; ++i)
{
    triangle[i] = new int[5 - i];
}

// ...

for (int i = 0; i < 5; ++i)
{
    delete[] triangle[i];
}

And here is what it looks like in memory:

          +---+---+---+---+---+
          |   |   |   |   |   |
          +---+---+---+---+---+
            ^
            | +---+---+---+---+
            | |   |   |   |   |
            | +---+---+---+---+
            |   ^
            |   | +---+---+---+
            |   | |   |   |   |
            |   | +---+---+---+
            |   |   ^
            |   |   | +---+---+
            |   |   | |   |   |
            |   |   | +---+---+
            |   |   |   ^
            |   |   |   | +---+
            |   |   |   | |   |
            |   |   |   | +---+
            |   |   |   |   ^
            |   |   |   |   |
            |   |   |   |   |
          +-|-+-|-+-|-+-|-+-|-+
triangle: | | | | | | | | | | |
          +---+---+---+---+---+

Since each line is allocated individually now, viewing 2D arrays as 1D arrays does not work anymore.

Anonymous arrays of pointers

Here is an anonymous array of 5 (or any other number of) pointers which are initialized with anonymous arrays of different lengths:

int n = calculate_five();   // or any other number
int** p = new int*[n];
for (int i = 0; i < n; ++i)
{
    p[i] = new int[n - i];
}

// ...

for (int i = 0; i < n; ++i)
{
    delete[] p[i];
}
delete[] p;   // note the extra delete[] !

And here is what it looks like in memory:

          +---+---+---+---+---+
          |   |   |   |   |   |
          +---+---+---+---+---+
            ^
            | +---+---+---+---+
            | |   |   |   |   |
            | +---+---+---+---+
            |   ^
            |   | +---+---+---+
            |   | |   |   |   |
            |   | +---+---+---+
            |   |   ^
            |   |   | +---+---+
            |   |   | |   |   |
            |   |   | +---+---+
            |   |   |   ^
            |   |   |   | +---+
            |   |   |   | |   |
            |   |   |   | +---+
            |   |   |   |   ^
            |   |   |   |   |
            |   |   |   |   |
          +-|-+-|-+-|-+-|-+-|-+
          | | | | | | | | | | |
          +---+---+---+---+---+
            ^
            |
            |
          +-|-+
       p: | | |
          +---+

Conversions

Array-to-pointer decay naturally extends to arrays of arrays and arrays of pointers:

int array_of_arrays[6][7];
int (*pointer_to_array)[7] = array_of_arrays;

int* array_of_pointers[6];
int** pointer_to_pointer = array_of_pointers;
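The two decayed types can also be confirmed at compile time. This sketch uses std::decay from <type_traits> (C++11), which for array types performs the array-to-pointer conversion:

```cpp
#include <type_traits>

// An array of arrays decays to a pointer to its first row.
static_assert(std::is_same<std::decay<int[6][7]>::type, int(*)[7]>::value,
              "int[6][7] decays to int(*)[7]");

// An array of pointers decays to a pointer to its first pointer.
static_assert(std::is_same<std::decay<int*[6]>::type, int**>::value,
              "int*[6] decays to int**");
```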

However, there is no implicit conversion from T[h][w] to T**. If such an implicit conversion did exist, the result would be a pointer to the first element of an array of h pointers to T (each pointing to the first element of a line in the original 2D array), but that pointer array does not exist anywhere in memory yet. If you want such a conversion, you must create and fill the required pointer array manually:

int connect_four[6][7];

int** p = new int*[6];
for (int i = 0; i < 6; ++i)
{
    p[i] = connect_four[i];
}

// ...

delete[] p;

Note that this generates a view of the original multidimensional array. If you need a copy instead, you must create extra arrays and copy the data yourself:

int connect_four[6][7];

int** p = new int*[6];
for (int i = 0; i < 6; ++i)
{
    p[i] = new int[7];
    std::copy(connect_four[i], connect_four[i + 1], p[i]);
}

// ...

for (int i = 0; i < 6; ++i)
{
    delete[] p[i];
}
delete[] p;

小镇女孩 2024-10-21 14:50:48

Assignment

For no particular reason, arrays cannot be assigned to one another. Use std::copy instead:

#include <algorithm>

// ...

int a[8] = {2, 3, 5, 7, 11, 13, 17, 19};
int b[8];
std::copy(a + 0, a + 8, b);

This is more flexible than what true array assignment could provide because it is possible to copy slices of larger arrays into smaller arrays.
std::copy is usually specialized for primitive types to give maximum performance. It is unlikely that std::memcpy performs better. If in doubt, measure.

Although you cannot assign arrays directly, you can assign structs and classes which contain array members. That is because array members are copied memberwise by the assignment operator which is provided as a default by the compiler. If you define the assignment operator manually for your own struct or class types, you must fall back to manual copying for the array members.
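A minimal sketch of assignment through a wrapping struct follows. The struct Primes and the function struct_assignment_copies_array are hypothetical names chosen for this example:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical wrapper: a struct whose only member is an array.
struct Primes
{
    int values[8];
};

// Assigns a to b through the compiler-generated assignment operator
// and reports whether every element was copied.
bool struct_assignment_copies_array()
{
    Primes a = {{2, 3, 5, 7, 11, 13, 17, 19}};
    Primes b = {};

    b = a;   // memberwise copy, which includes the array member

    return std::equal(a.values + 0, a.values + 8, b.values);
}
```

Writing b.values = a.values instead would not compile, because the arrays themselves still cannot be assigned.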

Parameter passing

Arrays cannot be passed by value. You can either pass them by pointer or by reference.

Pass by pointer

Since arrays themselves cannot be passed by value, usually a pointer to their first element is passed by value instead. This is often called "pass by pointer". Since the size of the array is not retrievable via that pointer, you have to pass a second parameter indicating the size of the array (the classic C solution) or a second pointer pointing after the last element of the array (the C++ iterator solution):

#include <numeric>
#include <cstddef>

int sum(const int* p, std::size_t n)
{
    return std::accumulate(p, p + n, 0);
}

int sum(const int* p, const int* q)
{
    return std::accumulate(p, q, 0);
}

As a syntactic alternative, you can also declare parameters as T p[], and it means the exact same thing as T* p in the context of parameter lists only:

int sum(const int p[], std::size_t n)
{
    return std::accumulate(p, p + n, 0);
}

You can think of the compiler as rewriting T p[] to T *p in the context of parameter lists only. This special rule is partly responsible for the whole confusion about arrays and pointers. In every other context, declaring something as an array or as a pointer makes a huge difference.

Unfortunately, you can also provide a size in an array parameter which is silently ignored by the compiler. That is, the following three signatures are exactly equivalent, as indicated by the compiler errors:

int sum(const int* p, std::size_t n)

// error: redefinition of 'int sum(const int*, size_t)'
int sum(const int p[], std::size_t n)

// error: redefinition of 'int sum(const int*, size_t)'
int sum(const int p[8], std::size_t n)   // the 8 has no meaning here
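
The same equivalence can be stated positively, without provoking redefinition errors, by comparing the function types. This is a sketch using std::is_same from <type_traits> (C++11):

```cpp
#include <cstddef>
#include <type_traits>

// In a parameter list, an array declarator is adjusted to a pointer ...
static_assert(std::is_same<int(const int*, std::size_t),
                           int(const int[], std::size_t)>::value,
              "const int p[] is adjusted to const int* p");

// ... and a size written between the brackets is ignored.
static_assert(std::is_same<int(const int*, std::size_t),
                           int(const int[8], std::size_t)>::value,
              "the 8 has no meaning here");
```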

Pass by reference

Arrays can also be passed by reference:

int sum(const int (&a)[8])
{
    return std::accumulate(a + 0, a + 8, 0);
}

In this case, the array size is significant. Since writing a function that only accepts arrays of exactly 8 elements is of little use, programmers usually write such functions as templates:

template <std::size_t n>
int sum(const int (&a)[n])
{
    return std::accumulate(a + 0, a + n, 0);
}

Note that you can only call such a function template with an actual array of integers, not with a pointer to an integer. The size of the array is automatically inferred, and for every size n, a different function is instantiated from the template. You can also write quite useful function templates that abstract from both the element type and from the size.
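
Two such function templates are sketched below. The names element_count, sum_of and check_generic_templates are hypothetical; the original text only states that templates of this kind are possible:

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>

// Number of elements of any array; both T and n are deduced from the argument.
template <typename T, std::size_t n>
std::size_t element_count(const T (&)[n])
{
    return n;
}

// Sum of all elements of any array whose element type supports operator+.
template <typename T, std::size_t n>
T sum_of(const T (&a)[n])
{
    return std::accumulate(a + 0, a + n, T());
}

// Hypothetical self-check for both templates.
void check_generic_templates()
{
    int primes[8] = {2, 3, 5, 7, 11, 13, 17, 19};
    double halves[3] = {0.5, 1.5, 2.0};

    assert(element_count(primes) == 8);
    assert(element_count(halves) == 3);
    assert(sum_of(primes) == 77);
    assert(sum_of(halves) == 4.0);
}
```

Passing a pointer such as &primes[0] to either template fails to compile, because a pointer cannot bind to a reference to an array.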

寂寞花火° 2024-10-21 14:50:48

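5. Common pitfalls when using arrays.

5.1 Pitfall: Trusting type-unsafe linking.

OK, you have been told, or have found out yourself, that globals (namespace
scope variables that can be accessed outside the translation unit) are
Evil™. But did you know how truly Evil™ they are? Consider the
program below, consisting of two files [main.cpp] and [numbers.cpp]: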

// [main.cpp]
#include <iostream>

extern int* numbers;

int main()
{
    using namespace std;
    for( int i = 0;  i < 42;  ++i )
    {
        cout << (i > 0? ", " : "") << numbers[i];
    }
    cout << endl;
}
// [numbers.cpp]
int numbers[42] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

在 Windows 7 中,该程序可以与 MinGW g++ 4.4.1 和
视觉C++10.0。

由于类型不匹配,程序在运行时会崩溃。

Windows 7 崩溃对话框

正式解释:该程序具有未定义行为 (UB),而是
因此,崩溃时它可能会挂起,或者什么都不做,或者它
可以向美国、俄罗斯、印度总统发送威胁电子邮件,
中国和瑞士,让鼻恶魔从你的鼻子里飞出来。

实践说明:在main.cpp中,数组被视为指针,放置在
与数组位于同一地址。对于 32 位可执行文件,这意味着第一个
数组中的int值被视为指针。即,在 main.cpp
numbers 变量包含或似乎包含 (int*)1。这导致
程序访问地址空间最底部的内存,即
传统上保留并导致陷阱。结果:你会崩溃。

编译器完全有权不诊断此错误,
因为 C++11 §3.5/10 说,关于兼容类型的要求
对于声明,

[N3290 §3.5/10]
违反此类型标识规则不需要诊断。

同一段详细说明了允许的变化:

…数组对象的声明可以指定数组类型
因是否存在主数组边界而有所不同 (8.3.4)。

这种允许的变化不包括将名称声明为一个数组
翻译单元,并作为另一个翻译单元中的指针。

5.2 陷阱:过早优化(memset 和朋友)。

尚未编写

5.3 陷阱:使用 C 惯用法获取元素数量。

凭借深厚的 C 经验,很自然地可以这样写……

#define N_ITEMS( array )   (sizeof( array )/sizeof( array[0] ))

由于数组会在需要时衰减为指向第一个元素的指针,因此
表达式 sizeof(a)/sizeof(a[0]) 也可以写成
sizeof(a)/sizeof(*a)。不管怎样,意思都是一样的
它是用于查找数组元素的C 惯用法

主要陷阱:C 习惯用法不是类型安全的。例如,代码
...

#include <stdio.h>

#define N_ITEMS( array ) (sizeof( array )/sizeof( *array ))

void display( int const a[7] )
{
    int const   n = N_ITEMS( a );          // Oops.
    printf( "%d elements.\n", n );
}

int main()
{
    int const   moohaha[]   = {1, 2, 3, 4, 5, 6, 7};

    printf( "%d elements, calling display...\n", N_ITEMS( moohaha ) );
    display( moohaha );
}

传递一个指向N_ITEMS的指针,因此很可能会产生错误
结果。在 Windows 7 中编译为 32 位可执行文件,它会生成……

7个元素,调用显示...
1 个元素。

  1. 编译器将 int const a[7] 重写为 int const a[]
  2. 编译器将 int const a[] 重写为 int const* a
  3. 因此,N_ITEMS 是通过指针调用的。
  4. 对于 32 位可执行文件,sizeof(array)(指针的大小)为 4。
  5. sizeof(*array) 相当于 sizeof(int)< /code>,对于 32 位可执行文件也是 4。

为了在运行时检测此错误,您可以执行以下操作:

#include <assert.h>
#include <typeinfo>

#define N_ITEMS( array )       (                               \
    assert((                                                    \
        "N_ITEMS requires an actual array as argument",        \
        typeid( array ) != typeid( &*array )                    \
        )),                                                     \
    sizeof( array )/sizeof( *array )                            \
    )

7个元素,调用显示...
断言失败:(“N_ITEMS 需要一个实际数组作为参数”,typeid( a ) != typeid( &*a ) ),文件 runtime_detect
ion.cpp,第 16 行

此应用程序已请求运行时以异常方式终止它。
请联系应用程序的支持团队以获取更多信息。

运行时错误检测比不检测好,但是有点浪费
处理器时间,也许还有更多的程序员时间。更好地检测
编译时间!如果您很高兴不支持 C++98 的本地类型数组,
那么你可以这样做:

#include <stddef.h>

typedef ptrdiff_t   Size;

template< class Type, Size n >
Size n_items( Type (&)[n] ) { return n; }

#define N_ITEMS( array )       n_items( array )

用 g++ 编译这个定义并代入第一个完整的程序,
我得到了……

M:\count>; g++ 编译时检测.cpp
compile_time_detection.cpp:在函数“void display(const int*)”中:
compile_time_detection.cpp:14:错误:没有匹配的函数可用于调用“n_items(const int *&)”

M:\count>; _

工作原理:数组通过引用传递给n_items,所以它确实如此
不会衰减到指向第一个元素的指针,并且该函数可以只返回
类型指定的元素数量。

使用 C++11,您也可以将其用于本地类型的数组,并且它是类型安全的
用于查找数组元素数量的C++ 习惯用法

5.4 C++11 - C++20 陷阱:使用 constexpr 数组大小函数。

使用 C++11 及更高版本,很自然地实现数组大小函数,如下所示:

// Similar in C++03, but not constexpr.
template< class Type, std::size_t N > 
constexpr std::size_t size( Type (&)[N] ) { return N; }

这会生成数组中的元素数量作为编译时间常量。该函数甚至被标准化为 std::size

例如,size() 可用于声明与另一个数组大小相同的数组:

// Example 1
void foo()
{
    int const x[] = {3, 1, 4, 1, 5, 9, 2, 6, 5, 4};
    int y[ size(x) ] = {};
}

但请考虑使用 constexpr 版本的代码:

// Example 2
template< class Collection >
void foo( Collection const& c )
{
    constexpr int n = size( c ); // error prior to C++23
    // ...
}

int main()
{
    int x[42];
    foo( x );
}

陷阱:直到 C++ 23 不允许使用引用c na常量表达式,并且所有主要编译器都拒绝此代码。来自 C++20 标准,[expr.const] p5.12

表达式E核心常量表达式,除非E的计算遵循抽象机的规则,将计算以下之一以下内容:

  • [...]
  • 引用引用类型的变量或数据成员的id-表达式,除非该引用具有先前的初始化并且
    • 它可用于常量表达式或
    • 它的生命周期开始于 E 的评估;

c 既不能在常量表达式中使用,其生命周期也不在 constexpr int n = ... 内开始,因此计算 c 不是核心常数表达式。对于 C++23,这些限制已由 P2280:在常量表达式中使用未知的指针和引用c 被视为对未指定对象的引用绑定 ([ expr.const] p8)。

5.4.1 解决方法:C++20 兼容的 constexpr 大小函数

std::extentstd::extentconstexpr decltype( c ) >::value; 不是一个可行的解决方法,因为如果 Collection 不是数组,它就会失败。

为了处理可以是非数组的集合,需要一个
size 函数,而且,对于编译时使用,需要编译时
数组大小的表示。以及经典的 C++03 解决方案,效果很好
同样在 C++11 和 C++14 中,是让函数报告其结果而不是值
但通过其函数结果类型。例如这样:

// Example 3 - OK (not ideal, but portable and safe)

#include <array>
#include <cstddef>

// No implementation, these functions are never evaluated.
template< class Type, std::size_t N >
auto static_n_items( Type (&)[N] )
  -> char(&)[N]; // return a reference to an array of N chars

template< class Type, std::size_t N >
auto static_n_items( std::array<Type, N> const& )
  -> char(&)[N];

#define STATIC_N_ITEMS( c ) ( sizeof( static_n_items( c )) )

template< class Collection >
void foo( Collection const& c )
{
    constexpr std::size_t n = STATIC_N_ITEMS( c );
    // ...
}

int main()
{
    int x[42];
    std::array<int, 43> y;
    foo( x );
    foo( y );
}

关于 static_n_items 返回类型的选择:此代码不使用 std::integral_constant
因为使用 std::integral_constant 表示结果
直接作为 constexpr 值,重新引入原始问题。

关于命名:此解决方案的一部分是 constexpr-invalid-due-to-reference
问题是要明确选择编译时间常数。

在 C++23 之前,像上面的 STATIC_N_ITEMS 这样的宏会产生可移植性,
例如,对于 clang 和 Visual C++ 编译器,保留类型安全。

相关:宏不尊重范围,因此为了避免名称冲突,它可以是
使用名称前缀是个好主意,例如MYLIB_STATIC_N_ITEMS

5. Common pitfalls when using arrays.

5.1 Pitfall: Trusting type-unsafe linking.

OK, you’ve been told, or have found out yourself, that globals (namespace
scope variables that can be accessed outside the translation unit) are
Evil™. But did you know how truly Evil™ they are? Consider the
program below, consisting of two files [main.cpp] and [numbers.cpp]:

// [main.cpp]
#include <iostream>

extern int* numbers;

int main()
{
    using namespace std;
    for( int i = 0;  i < 42;  ++i )
    {
        cout << (i > 0? ", " : "") << numbers[i];
    }
    cout << endl;
}
// [numbers.cpp]
int numbers[42] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

In Windows 7 this compiles and links fine with both MinGW g++ 4.4.1 and
Visual C++ 10.0.

Since the types don't match, the program crashes when you run it.

The Windows 7 crash dialog

The formal explanation: the program has Undefined Behavior (UB), so instead
of crashing it can just hang, or perhaps do nothing, or it
can send threatening e-mails to the presidents of the USA, Russia, India,
China and Switzerland, and make Nasal Daemons fly out of your nose.

The in-practice explanation: in main.cpp the array is treated as a pointer, placed
at the same address as the array. For a 32-bit executable this means that the first
int value in the array is treated as a pointer. I.e., in main.cpp the
numbers variable contains, or appears to contain, (int*)1. This causes the
program to access memory down at the very bottom of the address space, which is
conventionally reserved and trap-causing. Result: you get a crash.

The compilers are fully within their rights to not diagnose this error,
because C++11 §3.5/10 says, about the requirement of compatible types
for the declarations,

[N3290 §3.5/10]
A violation of this rule on type identity does not require a diagnostic.

The same paragraph details the variation that is allowed:

… declarations for an array object can specify array types that
differ by the presence or absence of a major array bound (8.3.4).

This allowed variation does not include declaring a name as an array in one
translation unit, and as a pointer in another translation unit.
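
A common remedy, which the text above does not spell out, is to put a single declaration in a header that both files include, so that the compiler can compare declaration and definition. The sketch below simulates this in one file; the header name numbers.h and the function sum_of_numbers are assumptions made for illustration.

```cpp
// [numbers.h] (hypothetical shared header, included by BOTH .cpp files)
extern int numbers[42];     // declared as an array, not as a pointer

// [numbers.cpp]
// #include "numbers.h"     // a definition with a different type is now a compile error
int numbers[42] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

// [main.cpp]
// #include "numbers.h"
int sum_of_numbers()
{
    int total = 0;
    for( int i = 0;  i < 42;  ++i )
    {
        total += numbers[i];    // indexes the real array, no bogus pointer involved
    }
    return total;
}
```

The linker still does not check types, but with a shared header the mismatch can no longer go unnoticed in the file that defines the array.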

5.2 Pitfall: Doing premature optimization (memset & friends).

Not written yet

5.3 Pitfall: Using the C idiom to get number of elements.

With deep C experience it’s natural to write …

#define N_ITEMS( array )   (sizeof( array )/sizeof( array[0] ))

Since an array decays to a pointer to its first element where needed, the
expression sizeof(a)/sizeof(a[0]) can also be written as
sizeof(a)/sizeof(*a). It means the same, and no matter how it’s
written it is the C idiom for finding the number of elements of an array.

Main pitfall: the C idiom is not typesafe. For example, the code

#include <stdio.h>

#define N_ITEMS( array ) (sizeof( array )/sizeof( *array ))

void display( int const a[7] )
{
    int const   n = N_ITEMS( a );          // Oops.
    printf( "%d elements.\n", n );
}

int main()
{
    int const   moohaha[]   = {1, 2, 3, 4, 5, 6, 7};

    printf( "%d elements, calling display...\n", (int) N_ITEMS( moohaha ) );
    display( moohaha );
}

passes a pointer to N_ITEMS, and therefore most likely produces a wrong
result. Compiled as a 32-bit executable in Windows 7 it produces …

7 elements, calling display...
1 elements.

  1. The compiler rewrites int const a[7] to just int const a[].
  2. The compiler rewrites int const a[] to int const* a.
  3. N_ITEMS is therefore invoked with a pointer.
  4. For a 32-bit executable sizeof(array) (size of a pointer) is then 4.
  5. sizeof(*array) is equivalent to sizeof(int), which for a 32-bit executable is also 4.

In order to detect this error at run time you can do …

#include <assert.h>
#include <typeinfo>

#define N_ITEMS( array )       (                               \
    assert((                                                    \
        "N_ITEMS requires an actual array as argument",        \
        typeid( array ) != typeid( &*array )                    \
        )),                                                     \
    sizeof( array )/sizeof( *array )                            \
    )

7 elements, calling display...
Assertion failed: ( "N_ITEMS requires an actual array as argument", typeid( a ) != typeid( &*a ) ), file runtime_detection.cpp, line 16

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

The runtime error detection is better than no detection, but it wastes a little
processor time, and perhaps much more programmer time. Detection at
compile time is better! And if you're happy to not support arrays of local types with C++98,
then you can do that:

#include <stddef.h>

typedef ptrdiff_t   Size;

template< class Type, Size n >
Size n_items( Type (&)[n] ) { return n; }

#define N_ITEMS( array )       n_items( array )

Compiling this definition substituted into the first complete program, with g++,
I got …

M:\count> g++ compile_time_detection.cpp
compile_time_detection.cpp: In function 'void display(const int*)':
compile_time_detection.cpp:14: error: no matching function for call to 'n_items(const int*&)'

M:\count> _

How it works: the array is passed by reference to n_items, and so it does
not decay to a pointer to its first element, and the function can just return the
number of elements specified by the type.

With C++11 you can use this also for arrays of local type, and it's the type-safe
C++ idiom for finding the number of elements of an array.
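
To illustrate the remark about local types, here is a small sketch that reuses the n_items definition from above. The function count_local_points and the type Point are hypothetical names introduced only for this example.

```cpp
#include <stddef.h>

typedef ptrdiff_t   Size;

template< class Type, Size n >
Size n_items( Type (&)[n] ) { return n; }

// Hypothetical example: counts the elements of an array of a local type.
Size count_local_points()
{
    struct Point { int x, y; };                         // local type
    Point const points[] = { {0, 0}, {1, 1}, {2, 4} };
    return n_items( points );                           // accepted since C++11
}
```

In C++98 the call would be rejected, because a local type could not be used as a template argument there.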

5.4 C++11 - C++20 pitfall: Using a constexpr array size function.

With C++11 and later, it's natural to implement an array size function as follows:

// Similar in C++03, but not constexpr.
template< class Type, std::size_t N > 
constexpr std::size_t size( Type (&)[N] ) { return N; }

This yields the number of elements in an array as a compile-time constant. This function has even been standardized as std::size in C++17.
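
A short sketch of the standardized function, assuming a C++17 compiler. The two wrapper functions are hypothetical and exist only to show that std::size accepts containers as well as arrays.

```cpp
#include <cstddef>
#include <iterator>   // std::size (C++17)
#include <vector>

// Hypothetical helper: std::size applied to a plain array.
std::size_t count_array_items()
{
    int const x[] = {3, 1, 4, 1, 5};
    return std::size( x );          // the size comes from the array type
}

// Hypothetical helper: std::size applied to a container.
std::size_t count_vector_items()
{
    std::vector<int> const v( 7 );
    return std::size( v );          // forwards to v.size()
}
```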

For example, size() can be used to declare an array of the same size as another:

// Example 1
void foo()
{
    int const x[] = {3, 1, 4, 1, 5, 9, 2, 6, 5, 4};
    int y[ size(x) ] = {};
}

But consider this code using the constexpr version:

// Example 2
template< class Collection >
void foo( Collection const& c )
{
    constexpr int n = size( c ); // error prior to C++23
    // ...
}

int main()
{
    int x[42];
    foo( x );
}

The pitfall: until C++23, using the reference c in a constant expression is not allowed, and all major compilers reject this code. From the C++20 standard, [expr.const] p5.12:

An expression E is a core constant expression unless the evaluation of E, following the rules of the abstract machine, would evaluate one of the following:

  • [...]
  • an id-expression that refers to a variable or data member of reference type unless the reference has a preceding initialization and either
    • it is usable in constant expressions or
    • its lifetime began within the evaluation of E;

c is neither usable in constant expressions nor did its lifetime begin within the evaluation of constexpr int n = ..., so evaluating c is not a core constant expression. These restrictions were lifted for C++23 by P2280: Using unknown pointers and references in constant expressions. c is now treated as a reference bound to an unspecified object ([expr.const] p8).
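
When the function only ever needs to accept real arrays, one way around the rule is to let the size be deduced as a template parameter, so that c is never evaluated at all. This is a sketch under that assumption, not a general replacement for Example 2, and the name scratch_size is hypothetical.

```cpp
#include <cstddef>

// Hypothetical array-only variant: N is deduced from the argument's type.
template< class Type, std::size_t N >
std::size_t scratch_size( Type const (&c)[N] )
{
    constexpr std::size_t n = N;    // OK before C++23: c itself is never evaluated
    int scratch[ n ] = {};          // n is usable as an array bound
    (void) c;                       // silence unused-parameter warnings
    return sizeof( scratch ) / sizeof( scratch[0] );
}
```

This no longer compiles for non-array collections, which is the limitation the following workaround addresses.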

5.4.1 Workaround: C++20-compatible constexpr size function

std::extent< decltype( c ) >::value is not a viable workaround because it would fail if Collection were not an array.

To deal with collections that can be non-arrays one needs the overloadability of a
size function, but also, for compile time use, one needs a compile time
representation of the array size. The classic C++03 solution, which also works fine
in C++11 and C++14, is to let the function report its result not as a value
but via its function result type. For example like this:

// Example 3 - OK (not ideal, but portable and safe)

#include <array>
#include <cstddef>

// No implementation, these functions are never evaluated.
template< class Type, std::size_t N >
auto static_n_items( Type (&)[N] )
  -> char(&)[N]; // return a reference to an array of N chars

template< class Type, std::size_t N >
auto static_n_items( std::array<Type, N> const& )
  -> char(&)[N];

#define STATIC_N_ITEMS( c ) ( sizeof( static_n_items( c )) )

template< class Collection >
void foo( Collection const& c )
{
    constexpr std::size_t n = STATIC_N_ITEMS( c );
    // ...
}

int main()
{
    int x[42];
    std::array<int, 43> y;
    foo( x );
    foo( y );
}

About the choice of return type for static_n_items: this code doesn't use std::integral_constant
because with std::integral_constant the result is represented
directly as a constexpr value, reintroducing the original problem.

About the naming: part of this solution to the constexpr-invalid-due-to-reference
problem is to make the choice of compile time constant explicit.

Until C++23, a macro like the STATIC_N_ITEMS above yields portability,
e.g. to the clang and Visual C++ compilers, retaining type safety.

Related: macros do not respect scopes, so to avoid name collisions it can be a
good idea to use a name prefix, e.g. MYLIB_STATIC_N_ITEMS.

七月上 2024-10-21 14:50:48

Array creation and initialization

As with any other kind of C++ object, arrays can be stored either directly in named variables (then the size must be a compile-time constant; C++ does not support VLAs), or they can be stored anonymously on the heap and accessed indirectly via pointers (only then can the size be computed at runtime).

Automatic arrays

Automatic arrays (arrays living "on the stack") are created each time the flow of control passes through the definition of a non-static local array variable:

void foo()
{
    int automatic_array[8];
}

Initialization is performed in ascending order. Note that the initial values depend on the element type T:

  • If T is a POD (like int in the above example), no initialization takes place.
  • Otherwise, the default-constructor of T initializes all the elements.
  • If T provides no accessible default-constructor, the program does not compile.
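
The second rule and the ascending order can be observed with a small sketch. The type Tracked and the function constructed_in_ascending_order are hypothetical and serve only to record the order of the constructor calls.

```cpp
// Hypothetical non-POD element type that records its construction order.
struct Tracked
{
    static int next_id;                 // number of objects constructed so far
    int id;
    Tracked() : id( next_id++ ) {}
};
int Tracked::next_id = 0;

bool constructed_in_ascending_order()
{
    Tracked::next_id = 0;
    Tracked items[4];                   // four default-constructor calls
    for( int i = 0;  i < 4;  ++i )
    {
        if( items[i].id != i ) { return false; }
    }
    return true;
}
```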

Alternatively, the initial values can be explicitly specified in the array initializer, a comma-separated list surrounded by curly brackets:

    int primes[8] = {2, 3, 5, 7, 11, 13, 17, 19};

Since in this case the number of elements in the array initializer is equal to the size of the array, specifying the size manually is redundant. It can automatically be deduced by the compiler:

    int primes[] = {2, 3, 5, 7, 11, 13, 17, 19};   // size 8 is deduced

It is also possible to specify the size and provide a shorter array initializer:

    int fibonacci[50] = {0, 1, 1};   // 47 trailing zeros are deduced

In that case, the remaining elements are zero-initialized. Note that C++ allows an empty array initializer (all elements are zero-initialized), whereas C89 does not (at least one value is required). Also note that array initializers can only be used to initialize arrays; they cannot later be used in assignments.
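
Both cases can be checked with a few lines. This is just an illustrative sketch, and the two function names are made up.

```cpp
// Hypothetical check: a short initializer leaves the remaining elements at zero.
bool remaining_elements_are_zero()
{
    int fibonacci[50] = {0, 1, 1};
    for( int i = 3;  i < 50;  ++i )
    {
        if( fibonacci[i] != 0 ) { return false; }
    }
    return fibonacci[1] == 1 && fibonacci[2] == 1;
}

// Hypothetical check: an empty initializer zero-initializes every element.
bool empty_initializer_zeroes_everything()
{
    int zeros[8] = {};
    for( int i = 0;  i < 8;  ++i )
    {
        if( zeros[i] != 0 ) { return false; }
    }
    return true;
}
```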

Static arrays

Static arrays (arrays living "in the data segment") are local array variables defined with the static keyword and array variables at namespace scope ("global variables"):

int global_static_array[8];

void foo()
{
    static int local_static_array[8];
}

(Note that variables at namespace scope are implicitly static. Adding the static keyword to their definition has a completely different, deprecated meaning.)

Here is how static arrays behave differently from automatic arrays:

  • Static arrays without an array initializer are zero-initialized prior to any further potential initialization.
  • Static POD arrays are initialized exactly once, and the initial values are typically baked into the executable, in which case there is no initialization cost at runtime. This is not always the most space-efficient solution, however, and it is not required by the standard.
  • Static non-POD arrays are initialized the first time the flow of control passes through their definition. In the case of local static arrays, that may never happen if the function is never called.

(None of the above is specific to arrays. These rules apply equally well to other kinds of static objects.)
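
A minimal sketch of the first two points: a local static array starts out zeroed and keeps its contents between calls. The function record_hit is a hypothetical example, and it assumes that bucket is in the range 0 to 7.

```cpp
// Hypothetical counter: returns how often the given bucket has been hit so far.
int record_hit( int bucket )
{
    static int hits[8];         // zero-initialized once, not on every call
    return ++hits[bucket];      // assumes 0 <= bucket < 8
}
```

An automatic array in the same place would be created anew, with indeterminate values, on every call.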

Array data members

Array data members are created when their owning object is created. Unfortunately, C++03 provides no means to initialize arrays in the member initializer list, so initialization must be faked with assignments:

class Foo
{
    int primes[8];

public:

    Foo()
    {
        primes[0] = 2;
        primes[1] = 3;
        primes[2] = 5;
        // ...
    }
};

Alternatively, you can define an automatic array in the constructor body and copy the elements over:

class Foo
{
    int primes[8];

public:

    Foo()
    {
        int local_array[] = {2, 3, 5, 7, 11, 13, 17, 19};
        std::copy(local_array + 0, local_array + 8, primes + 0);
    }
};

In C++11 (known as C++0x while it was being drafted), arrays can be initialized in the member initializer list thanks to uniform initialization:

class Foo
{
    int primes[8];

public:

    Foo() : primes { 2, 3, 5, 7, 11, 13, 17, 19 }
    {
    }
};

This is the only solution that works with element types that have no default constructor.
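
To make that concrete, here is a sketch with an element type that has no default constructor. The names Id and Registry are hypothetical.

```cpp
// Hypothetical element type without a default constructor.
struct Id
{
    int value;
    explicit Id( int v ) : value( v ) {}
};

class Registry
{
    Id ids[3];

public:

    // Each element is initialized directly; assignment in the body would not
    // compile, because the elements could not be default-constructed first.
    Registry() : ids { Id( 1 ), Id( 2 ), Id( 3 ) }
    {
    }

    int sum() const { return ids[0].value + ids[1].value + ids[2].value; }
};
```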

Dynamic arrays

Dynamic arrays have no names, hence the only means of accessing them is via pointers. Because they have no names, I will refer to them as "anonymous arrays" from now on.

In C, anonymous arrays are created via malloc and friends. In C++, anonymous arrays are created using the new T[size] syntax which returns a pointer to the first element of an anonymous array:

std::size_t size = compute_size_at_runtime();
int* p = new int[size];

The following ASCII art depicts the memory layout if the size is computed as 8 at runtime:

             +---+---+---+---+---+---+---+---+
(anonymous)  |   |   |   |   |   |   |   |   |
             +---+---+---+---+---+---+---+---+
               ^
               |
               |
             +-|-+
          p: | | |                               int*
             +---+

Obviously, anonymous arrays require more memory than named arrays due to the extra pointer that must be stored separately. (There is also some additional overhead on the free store.)

Note that there is no array-to-pointer decay going on here. Although evaluating new int[size] does in fact create an array of integers, the result of the expression new int[size] is already a pointer to a single integer (the first element), not an array of integers or a pointer to an array of integers of unknown size. That would be impossible, because the static type system requires array sizes to be compile-time constants. (Hence, I did not annotate the anonymous array with static type information in the picture.)
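
The static types can be inspected without allocating anything, because decltype does not evaluate its operand. The function below is a hypothetical illustration of that.

```cpp
#include <cstddef>
#include <type_traits>

bool new_expression_yields_pointer_to_element( std::size_t size )
{
    int named[8];                       // a named array keeps its full array type
    (void) named;                       // only its type is inspected below

    // Nothing is allocated here: both operands of decltype are unevaluated.
    return std::is_same< decltype( named ), int[8] >::value
        && std::is_same< decltype( new int[size] ), int* >::value;
}
```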

Concerning default values for elements, anonymous arrays behave similarly to automatic arrays.
Normally, anonymous POD arrays are not initialized, but there is a special syntax that triggers value-initialization:

int* p = new int[some_computed_size]();

(Note the trailing pair of parentheses right before the semicolon.) Again, C++11 simplifies the rules and allows specifying initial values for anonymous arrays thanks to uniform initialization:

int* p = new int[8] { 2, 3, 5, 7, 11, 13, 17, 19 };

When you are done using an anonymous array, you have to release it back to the system:

delete[] p;

You must release each anonymous array exactly once and then never touch it again afterwards. Not releasing it at all results in a memory leak (or more generally, depending on the element type, a resource leak), and trying to release it multiple times results in undefined behavior. Using the non-array form delete (or free) instead of delete[] to release the array is also undefined behavior.
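
Putting the pieces together, a complete life cycle of an anonymous array might look like the sketch below. The function zeroed_array_roundtrip is hypothetical, and resetting the pointer to nullptr afterwards is merely one habit for avoiding accidental reuse, not something the rules above require.

```cpp
#include <cstddef>

bool zeroed_array_roundtrip( std::size_t size )
{
    int* p = new int[size]();           // value-initialization: every element is 0

    bool all_zero = true;
    for( std::size_t i = 0;  i < size;  ++i )
    {
        if( p[i] != 0 ) { all_zero = false; }
    }

    delete[] p;                         // array form, exactly once
    p = nullptr;                        // the old address must not be used again
    return all_zero;
}
```

If anything between new and delete[] could throw, this code would leak, which is one reason why abstractions such as std::vector are usually the safer choice.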
