Taking the address of a one-past-the-end array element via subscript: legal by the C++ Standard or not?

Posted 2024-07-23 08:47:36

I have seen it asserted several times now that the following code is not allowed by the C++ Standard:

int array[5];
int *array_begin = &array[0];
int *array_end = &array[5];

Is &array[5] legal C++ code in this context?

I would like an answer with a reference to the Standard if possible.

It would also be interesting to know if it meets the C standard. And if it isn't standard C++, why was the decision made to treat it differently from array + 5 or &array[4] + 1?

Comments (12)

小傻瓜 2024-07-30 08:47:36

Yes, it's legal. From the C99 draft standard:

§6.5.2.1, paragraph 2:

A postfix expression followed by an expression in square brackets [] is a subscripted
designation of an element of an array object. The definition of the subscript operator []
is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that
apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the
initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th
element of E1 (counting from zero).

§6.5.3.2, paragraph 3 (emphasis mine):

The unary & operator yields the address of its operand. If the operand has type ‘‘type’’,
the result has type ‘‘pointer to type’’. If the operand is the result of a unary * operator,
neither that operator nor the & operator is evaluated and the result is as if both were
omitted, except that the constraints on the operators still apply and the result is not an
lvalue. Similarly, if the operand is the result of a [] operator, neither the & operator nor
the unary * that is implied by the [] is evaluated and the result is as if the & operator
were removed and the [] operator were changed to a + operator. Otherwise, the result is
a pointer to the object or function designated by its operand.

§6.5.6, paragraph 8:

When an expression that has integer type is added to or subtracted from a pointer, the
result has the type of the pointer operand. If the pointer operand points to an element of
an array object, and the array is large enough, the result points to an element offset from
the original element such that the difference of the subscripts of the resulting and original
array elements equals the integer expression. In other words, if the expression P points to
the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and
(P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of
the array object, provided they exist. Moreover, if the expression P points to the last
element of an array object, the expression (P)+1 points one past the last element of the
array object, and if the expression Q points one past the last element of an array object,
the expression (Q)-1 points to the last element of the array object. If both the pointer
operand and the result point to elements of the same array object, or one past the last
element of the array object, the evaluation shall not produce an overflow; otherwise, the
behavior is undefined. If the result points one past the last element of the array object, it
shall not be used as the operand of a unary * operator that is evaluated.

Note that the standard explicitly allows pointers to point one element past the end of the array, provided that they are not dereferenced. By 6.5.2.1 and 6.5.3.2, the expression &array[5] is equivalent to &*(array + 5), which is equivalent to (array+5), which points one past the end of the array. This does not result in a dereference (by 6.5.3.2), so it is legal.
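As an illustration of the "formed but never dereferenced" rule quoted from 6.5.6, here is a minimal C++ sketch. The helper names sum_range and sum_five are hypothetical and not from the answer; the end pointer is obtained with plain pointer arithmetic, which is the spelling nobody in this thread disputes.

```cpp
#include <cassert>

// Sums the half-open range [first, last). `last` is only compared against.
int sum_range(const int* first, const int* last) {
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;  // p != last here, so p points at a real element
    }
    return total;
}

// Hypothetical helper for illustration.
int sum_five() {
    int array[5] = {1, 2, 3, 4, 5};
    const int* array_end = array + 5;  // one past the end: formed, never dereferenced
    assert(array_end - array == 5);
    return sum_range(array, array_end);
}
```

The loop condition compares against the end pointer and stops before reaching it, so the one-past-the-end address is never the operand of an evaluated unary *.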

っ左 2024-07-30 08:47:36

Your example is legal, but only because you're not actually using an out of bounds pointer.

Let's deal with out of bounds pointers first (because that's how I originally interpreted your question, before I noticed that the example uses a one-past-the-end pointer instead):

In general, you're not even allowed to create an out-of-bounds pointer. A pointer must point to an element within the array, or one past the end. Nowhere else.

The pointer is not even allowed to exist, which means you're obviously not allowed to dereference it either.

Here's what the standard has to say on the subject:

5.7:5:

When an expression that has integral
type is added to or subtracted from a
pointer, the result has the type of
the pointer operand. If the pointer
operand points to an element of an
array object, and the array is large
enough, the result points to an
element offset from the original
element such that the difference of
the subscripts of the resulting and
original array elements equals the
integral expression. In other words,
if the expression P points to the i-th
element of an array object, the
expressions (P)+N (equivalently,
N+(P)) and (P)-N (where N has the
value n) point to, respectively, the
i+n-th and i−n-th elements of the
array object, provided they exist.
Moreover, if the expression P points
to the last element of an array
object, the expression (P)+1 points
one past the last element of the array
object, and if the expression Q points
one past the last element of an array
object, the expression (Q)-1 points to
the last element of the array object.
If both the pointer operand and the
result point to elements of the same
array object, or one past the last
element of the array object, the
evaluation shall not produce an
overflow; otherwise, the behavior is
undefined.

(emphasis mine)

Of course, this is for operator+. So just to be sure, here's what the standard says about array subscripting:

5.2.1:1:

The expression E1[E2] is identical (by definition) to *((E1)+(E2))

Of course, there's an obvious caveat: Your example doesn't actually show an out-of-bounds pointer. It uses a "one past the end" pointer, which is different. The pointer is allowed to exist (as the above says), but the standard, as far as I can see, says nothing about dereferencing it. The closest I can find is 3.9.2:3:

[Note: for instance, the address one past the end of an array (5.7) would be considered to
point to an unrelated object of the array’s element type that might be located at that address. —end note ]

Which seems to me to imply that yes, you can legally dereference it, but the result of reading or writing to the location is unspecified.

Thanks to ilproxyil for correcting the last bit here, answering the last part of your question:

  • array + 5 doesn't actually
    dereference anything, it simply
    creates a pointer to one past the end
    of array.
  • &array[4] + 1 dereferences
    array+4 (which is perfectly safe),
    takes the address of that lvalue, and
    adds one to that address, which
    results in a one-past-the-end pointer
    (but that pointer never gets
    dereferenced).
  • &array[5] dereferences array+5
    (which as far as I can see is legal,
    and results in "an unrelated object
    of the array’s element type", as the
    above said), and then takes the
    address of that element, which also
    seems legal enough.

So they don't do quite the same thing, although in this case, the end result is the same.
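A small sketch of the first two spellings from the list, assuming a local int array[5]. The function name end_pointers_agree is hypothetical. The third spelling, &array[5], is deliberately left out because its status is exactly what this thread disagrees about.

```cpp
#include <cassert>

// Returns true when the two uncontested spellings yield the same pointer.
bool end_pointers_agree() {
    int array[5] = {};
    int* via_addition = array + 5;      // pointer arithmetic only
    int* via_last     = &array[4] + 1;  // address of a real element, then + 1
    assert(via_addition - array == 5);
    return via_addition == via_last;
}
```

Both expressions only ever apply indirection to an element that exists, so the comparison does not depend on how the debate about &array[5] is resolved.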

等往事风中吹 2024-07-30 08:47:36

It is legal.

According to the gcc documentation for C++, &array[5] is legal. In both C++ and in C you may safely address the element one past the end of an array - you will get a valid pointer. So &array[5] as an expression is legal.

However, it is still undefined behavior to attempt to dereference pointers to unallocated memory, even if the pointer points to a valid address. So attempting to dereference the pointer generated by that expression is still undefined behavior (i.e. illegal) even though the pointer itself is valid.

In practice, I imagine it would usually not cause a crash, though.

Edit: By the way, this is generally how the end() iterator for STL containers is implemented (as a pointer to one-past-the-end), so that's a pretty good testament to the practice being legal.
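To make the end() remark concrete, here is a hedged sketch using std::begin and std::end on a built-in array. The function name sum_with_std_end is hypothetical. For a built-in array of five elements, std::end returns the same pointer as array + 5.

```cpp
#include <iterator>
#include <numeric>

// Hypothetical helper: returns the sum, or -1 if std::end disagreed with array + 5.
int sum_with_std_end() {
    int array[5] = {1, 2, 3, 4, 5};
    int* last = std::end(array);      // one-past-the-end pointer for a built-in array
    bool same = (last == array + 5);  // same address as plain pointer arithmetic
    return same ? std::accumulate(std::begin(array), last, 0) : -1;
}
```

This shows that holding and comparing a one-past-the-end pointer is routine. It says nothing about whether &array[5] is an acceptable way to obtain it.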

Edit: Oh, now I see you're not really asking if holding a pointer to that address is legal, but if that exact way of obtaining the pointer is legal. I'll defer to the other answerers on that.

呆° 2024-07-30 08:47:36

I believe that this is legal, and it depends on the 'lvalue to rvalue' conversion taking place. The last line of Core issue 232 has the following:

We agreed that the approach in the standard seems okay: p = 0; *p; is not inherently an error. An lvalue-to-rvalue conversion would give it undefined behavior

Although this is a slightly different example, what it does show is that the '*' does not result in an lvalue-to-rvalue conversion, and so, given that the expression is the immediate operand of '&', which expects an lvalue, the behaviour is defined.

梦里寻她 2024-07-30 08:47:36

I don't believe that it is illegal, but I do believe that the behaviour of &array[5] is undefined.

  • 5.2.1 [expr.sub] E1[E2] is identical (by definition) to *((E1)+(E2))

  • 5.3.1 [expr.unary.op] unary * operator ... the result is an lvalue referring to the object or function to which the expression points.

At this point you have undefined behaviour because the expression ((E1)+(E2)) didn't actually point to an object, and the standard only says what the result should be when it does.

  • 1.3.12 [defns.undefined] Undefined behaviour may also be expected when this International Standard omits the description of any explicit definition of behaviour.

As noted elsewhere, array + 5 and &array[0] + 5 are valid and well defined ways of obtaining a pointer one beyond the end of array.

木格 2024-07-30 08:47:36

In addition to the above answers, I'll point out operator& can be overridden for classes. So even if it was valid for PODs, it probably isn't a good idea to do for an object you know isn't valid (much like overriding operator&() in the first place).
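A minimal sketch of that point, assuming a hypothetical class Odd whose operator& is overloaded to return a null pointer. std::addressof from <memory> obtains the real address regardless of the overload.

```cpp
#include <memory>

// Hypothetical type for illustration: its operator& does not return the object's address.
struct Odd {
    int value = 7;
    Odd* operator&() { return nullptr; }
};

bool addressof_bypasses_overload() {
    Odd o;
    Odd* via_operator  = &o;                  // calls Odd::operator&, yields nullptr
    Odd* via_addressof = std::addressof(o);   // the object's actual address
    return via_operator == nullptr
        && via_addressof != nullptr
        && via_addressof->value == 7;
}
```

For class types like this, an expression of the form &array[i] runs user code, so reasoning that applies to the built-in & on an int does not carry over automatically.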

无力看清 2024-07-30 08:47:36

This is legal:

int array[5];
int *array_begin = &array[0];
int *array_end = &array[5];

Section 5.2.1 Subscripting The expression E1[E2] is identical (by definition) to *((E1)+(E2))

So by this we can say that array_end is equivalent to:

int *array_end = &(*((array) + 5)); // or &(*(array + 5))

Section 5.3.1.1 Unary operator '*': The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or
a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points.
If the type of the expression is “pointer to T,” the type of the result is “T.” [ Note: a pointer to an incomplete type (other
than cv void) can be dereferenced. The lvalue thus obtained can be used in limited ways (to initialize a reference, for
example); this lvalue must not be converted to an rvalue, see 4.1. — end note ]

The important part of the above:

'the result is an lvalue referring to the object or function'.

The unary operator '*' is returning an lvalue referring to the int (no de-reference). The unary operator '&' then gets the address of the lvalue.

As long as there is no de-referencing of an out of bounds pointer then the operation is fully covered by the standard and all behavior is defined. So by my reading the above is completely legal.

The fact that a lot of the STL algorithms depend on the behavior being well defined is a sort of hint that the standards committee has already thought of this, and I am sure there is something that covers this explicitly.

The comment section below presents two arguments:

(please read: but it is long and both of us end up trollish)

Argument 1

this is illegal because of section 5.7 paragraph 5

When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i + n-th and i − n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

And though the section is relevant; it does not show undefined behavior. All the elements in the array we are talking about are either within the array or one past the end (which is well defined by the above paragraph).

Argument 2:

The second argument presented below is: * is the de-reference operator.
And though this is a common term used to describe the '*' operator; this term is deliberately avoided in the standard as the term 'de-reference' is not well defined in terms of the language and what that means to the underlying hardware.

Accessing the memory one beyond the end of the array is definitely undefined behavior. But I am not convinced the unary * operator accesses the memory (reads/writes to memory) in this context (not in a way the standard defines). In this context (as defined by the standard (see 5.3.1.1)) the unary * operator returns an lvalue referring to the object. In my understanding of the language this is not access to the underlying memory. The result of this expression is then immediately used by the unary & operator, which returns the address of the object referred to by the lvalue.

Many other references to Wikipedia and non canonical sources are presented. All of which I find irrelevant. C++ is defined by the standard.

Conclusion:

I am willing to concede there are many parts of the standard that I may not have considered and that may prove my above arguments wrong. None are provided below. If you show me a standard reference that shows this is UB, I will

  1. Leave the answer.
  2. Put in all caps this is stupid and I am wrong for all to read.

This is not an argument:

Not everything in the entire world is defined by the C++ standard. Open your mind.

人生戏 2024-07-30 08:47:36

Working draft (n2798):

"The result of the unary & operator is
a pointer to its operand. The operand
shall be an lvalue or a qualified-id.
In the first case, if the type of the
expression is “T,” the type of the
result is “pointer to T.”" (p. 103)

array[5] is not a qualified-id as best I can tell (the list is on p. 87); the closest would seem to be identifier, but while array is an identifier, array[5] is not. It is not an lvalue because "An lvalue refers to an object or function. " (p. 76). array[5] is obviously not a function, and is not guaranteed to refer to a valid object (because array + 5 is after the last allocated array element).

Obviously, it may work in certain cases, but it's not valid C++ or safe.

Note: It is legal to add to get one past the array (p. 113):

"if the expression P [a pointer]
points to the last element of an array
object, the expression (P)+1 points
one past the last element of the array
object, and if the expression Q points
one past the last element of an array
object, the expression (Q)-1 points to
the last element of the array object.
If both the pointer operand and the
result point to elements of the same
array object, or one past the last
element of the array object, the
evaluation shall not produce an
overflow"

But it is not legal to do so using &.

忘年祭陌 2024-07-30 08:47:36

Even if it is legal, why depart from convention? array + 5 is shorter anyway, and in my opinion, more readable.

Edit: If you want it to be symmetric you can write

int* array_begin = array; 
int* array_end = array + 5;

草莓酥 2024-07-30 08:47:36

It should be undefined behaviour, for the following reasons:

  1. Trying to access out-of-bounds elements results in undefined behaviour. Hence the standard does not forbid an implementation throwing an exception in that case (i.e. an implementation checking bounds before an element is accessed). If & (array[size]) were defined to be begin (array) + size, an implementation throwing an exception in case of out-of-bound access would not conform to the standard anymore.

  2. It's impossible to make this yield end (array) if array is not an array but rather an arbitrary collection type.
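For the collection-type case in point 2, one hedged sketch, assuming std::vector<int> as the collection and a hypothetical helper sum_vector. The end address is formed with pointer arithmetic on data() instead of subscripting with size().

```cpp
#include <vector>

// Hypothetical helper: sums a vector through raw pointers.
int sum_vector(const std::vector<int>& v) {
    const int* first = v.data();
    const int* last  = v.data() + v.size();  // one past the end, never dereferenced
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;
    }
    return total;
}
```

Writing &v[v.size()] instead would call std::vector's operator[] with an out-of-range index, which a bounds-checking implementation is free to reject, in line with point 1.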

一江春梦 2024-07-30 08:47:36

Preamble

Quite a few of the answers here are fairly old, and quote relatively old versions of the C++ standard (or drafts thereof). Others are based on the C standard; C99 was revised specifically to make this legal, with defined behavior, but that doesn't mean a matching change was made in C++. It looks like the text in the C++ standard has changed somewhat over time, so it may be unclear how meaningful some of the older citations are for C++ as currently defined.

Since the wording has changed over time, I'm going to cite a couple of specific drafts of the C++ standard. If later drafts revise the wording again (which wouldn't surprise me) the issue would have to be analyzed again with respect to the revised wording.

N4835

A postfix expression followed by an expression in square brackets is a postfix expression. One of the expressions
shall be a glvalue of type “array of T” or a prvalue of type “pointer to T” and the other shall be a prvalue of
unscoped enumeration or integral type. The result is of type “T”. The type “T” shall be a completely-defined
object type. The expression E1[E2] is identical (by definition) to *((E1)+(E2)), except that in the case of
an array operand, the result is an lvalue if that operand is an lvalue and an xvalue otherwise. The expression
E1 is sequenced before the expression E2.

So, array[5] is equivalent to *(array + 5).
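The identity can be checked for an index that is in bounds, where nothing is contested. This is only a sketch; the function name subscript_identity_holds is hypothetical.

```cpp
// Checks E1[E2] against *((E1)+(E2)) for an in-bounds index.
bool subscript_identity_holds() {
    int array[5] = {10, 20, 30, 40, 50};
    bool same_value   = (array[3] == *(array + 3));  // same element value
    bool same_address = (&array[3] == array + 3);    // same element address
    bool commutes     = (3[array] == array[3]);      // follows from (E1)+(E2) == (E2)+(E1)
    return same_value && same_address && commutes;
}
```

The index 3 is used so that every expression refers to an element that exists. Whether the same rewriting is acceptable for index 5 is the question the rest of this answer addresses.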

We then attempt to take the address of that expression using the & operator. This is defined as follows (§[expr.unary.op]/3):

The result of the unary & operator is a pointer to its operand.

  • If the operand is a qualified-id naming a non-static or variant member m of some class C with type T,
    the result has type “pointer to member of class C of type T” and is a prvalue designating C::m.
  • Otherwise, if the operand is an lvalue of type T, the resulting expression is a prvalue of type “pointer to
    T” whose result is a pointer to the designated object (6.7.1) or function. [Note: In particular, taking
    the address of a variable of type “cv T” yields a pointer of type “pointer to cv T”. —end note] For
    purposes of pointer arithmetic (7.6.6) and comparison (7.6.9, 7.6.10), an object that is not an array
    element whose address is taken in this way is considered to belong to an array with one element of type
    T.
  • Otherwise, the program is ill-formed.

The first of these three possibilities applies to class members, so it's irrelevant here.

The second applies to an lvalue. So the question is whether *(array + 5) is an lvalue or not. According to §[basic.lval]/1.1:

  • A glvalue is an expression whose evaluation determines the identity of an object, bit-field, or function.
    [...]
  • An xvalue is a glvalue that denotes an object whose resources can be reused (usually because it is near
    the end of its lifetime).
    [...]
  • An lvalue is a glvalue that is not an xvalue.

While we can form an address one past the end of an array, that address does not determine the identity of an object, bit-field or function. The relevant option would be "object", but there is no object there whose identity it can determine [1]. As such, when array has been defined with N elements, *(array + N) is not an lvalue.

That leaves only the third option: the program is ill-formed.

N4944

N4944 has identical wording for §[expr.sub]/1 as N4835, so I won't quote it again here.

In N4944 the wording with respect to the & operator has changed slightly. It starts with (§[expr.unary.op]/3):

The operand of the unary & operator shall be an lvalue of some type T.

N4944 retains the same definition of an lvalue though:

  • A glvalue is an expression whose evaluation determines the identity of an object, bit-field, or function.
    [...]
  • An xvalue is a glvalue that denotes an object whose resources can be reused (usually because it is near
    the end of its lifetime).
    [...]
  • An lvalue is a glvalue that is not an xvalue.

As such, again, indirection through a pointer to one past the end of an array does not yield an lvalue, so code that attempts to apply the & operator to it is ill-formed.

Conclusion

In recent versions of the C++ standard, code like:

int array[5];
int *foo = &array[5];

...is ill-formed.


1. Well, it could happen that there's some object at that address, but if so it's an accidental coincidence. Nothing in the standard requires there to be an object at that address.

治碍 2024-07-30 08:47:36

C++ 标准,5.19,第 4 段:

地址常量表达式是指向左值的指针...该指针应使用一元 & 显式创建。 运算符...或使用数组 (4.2)...类型的表达式。 下标运算符 []...可用于创建地址常量表达式,但不能使用这些运算符来访问对象的值。 如果使用下标运算符,则其操作数之一应为整型常量表达式。

在我看来, &array[5] 是合法的 C++,是一个地址常量表达式。

C++ standard, 5.19, paragraph 4:

An address constant expression is a pointer to an lvalue....The pointer shall be created explicitly, using the unary & operator...or using an expression of array (4.2)...type. The subscripting operator []...can be used in the creation of an address constant expression, but the value of an object shall not be accessed by the use of these operators. If the subscripting operator is used, one of its operands shall be an integral constant expression.

Looks to me like &array[5] is legal C++, being an address constant expression.
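The quoted paragraph is from an older edition of the standard. As a loosely related sketch in C++11 or later, and assuming an array with static storage duration (the names table, table_end, and table_length are hypothetical), a one-past-the-end address formed with pointer arithmetic can be used in a constant expression:

```cpp
#include <cstddef>

static int table[5] = {};

// One-past-the-end address of a static array, formed at compile time.
constexpr int* table_end = table + 5;
constexpr std::ptrdiff_t table_len = table_end - table;
static_assert(table_len == 5, "end pointer is five elements past the start");

// Hypothetical accessor so the value can also be inspected at run time.
std::ptrdiff_t table_length() { return table_len; }
```

This uses array + 5, not &array[5], so it illustrates only that the address itself can be a compile-time constant. It does not settle the question about the subscript spelling.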
