Where exactly does the C++ standard say that dereferencing an uninitialized pointer is undefined behavior?
So far I can't find how to deduce that the following:
int* ptr;
*ptr = 0;
is undefined behavior.
First of all, there's 5.3.1/1, which states that * means indirection, which converts T* to T. But this doesn't say anything about UB.
Then there's the often-quoted 3.7.3.2/4, which says that using a deallocation function on a non-null pointer renders the pointer invalid and that later use of the invalid pointer is UB. But in the code above there's nothing about deallocation.
How can UB be deduced in the code above?
Comments (7)
Section 4.1 looks like a candidate (emphasis mine):
I'm sure just searching on "uninitial" in the spec can find you more candidates.
The OP's question is nonsense. There is no requirement that the Standard say certain behaviours are undefined, and indeed I would argue that all such wording be removed from the Standard because it confuses people and makes the Standard more verbose than necessary.
The Standard defines certain behaviour. The question is, does it specify any behaviour in this case? If it does not, the behaviour is undefined whether or not it says so explicitly.
In fact the specification that some things are undefined is left in the Standard primarily as a debugging aid for the Standards writers, the idea being to generate a contradiction if there is a requirement in one place which conflicts with an explicit statement of undefined behaviour in another: that's a way to prove a defect in the Standard. Without the explicit statement of undefined behaviour, the other clause prescribing behaviour would be normative and unchallenged.
I found that the answer to this question is in an unexpected corner of the C++ draft standard: section 24.2 Iterator requirements, specifically section 24.2.1 In general, paragraphs 5 and 10, which respectively say (emphasis mine):
and:
and footnote 268 says:
It does look like there is some controversy over whether a null pointer is singular or not, and it looks like the term singular value needs to be properly defined in a more general manner.
The intent of singular seems to be summed up well in defect report 278, "What does iterator validity mean?", under the rationale section, which says:
So invalidation and being uninitialized may create a value that is singular, but since we cannot prove they are nonsingular we must assume they are singular.
Update
An alternative common-sense approach would be to note that draft standard section 5.3.1 Unary operators, paragraph 1, says (emphasis mine):
and if we then go to section 3.10 Lvalues and rvalues, paragraph 1 says (emphasis mine):
but ptr will not, except by chance, point to a valid object.
Evaluating an uninitialized pointer causes undefined behaviour. Since dereferencing the pointer first requires evaluating it, this implies that dereferencing also causes undefined behaviour.
This was true in both C++11 and C++14, although the wording changed.
In C++14 it is fully covered by [dcl.init]/12:
where the "following cases" are particular operations on unsigned char.
In C++11, [conv.lval]/2 covered this under the lvalue-to-rvalue conversion procedure (i.e. retrieving the pointer value from the storage area denoted by ptr):
The bolded part was removed for C++14 and replaced with the extra text in [dcl.init]/12.
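As a rough illustration of the initialization forms involved, the sketch below contrasts them. The names global_ptr and value_initialized_pointer are hypothetical, chosen only for this example. It assumes the usual rules: an object with static storage duration is zero-initialized, and empty braces value-initialize a pointer to null. The default-initialized automatic variable from the question appears only in a comment, because reading it is the undefined behaviour under discussion.

```cpp
#include <cassert>

// Static storage duration: zero-initialized before any other
// initialization, so this pointer starts out as a null pointer.
int* global_ptr;

// Hypothetical helper: empty braces value-initialize the pointer,
// which for a scalar type means zero-initialization (a null pointer).
int* value_initialized_pointer() {
    int* p{};
    return p;
}

// By contrast, the form from the question has an indeterminate value:
//     int* ptr;   // default-initialized, automatic storage duration
//     *ptr = 0;   // evaluates ptr first: undefined behaviour
```

A null pointer still must not be dereferenced, but it has a determinate value, so it can at least be read and compared safely.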
I'm not going to pretend I know a lot about this, but some compilers would initialize the pointer to NULL, and dereferencing a NULL pointer is UB.
Also, considering that an uninitialized pointer could point to anything (this includes NULL), you could conclude that it's UB when you dereference it.
A note in section 8.3.2 [dcl.ref]
—ISO/IEC 14882:1998(E), the ISO C++ standard, in section 8.3.2 [dcl.ref]
I think I should have written this as comment instead, I'm not really that sure.
To dereference the pointer, you need to read from the pointer variable (not talking about the object it points to). Reading from an uninitialized variable is undefined behaviour.
What you do with the pointer's value after you have read it no longer matters at that point, whether you write to the object it points to (as in your example) or read from it.
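To make the two steps visible, here is a minimal sketch that splits the assignment into the read of the pointer variable and the write through it. The function name store_through and the local name address are hypothetical, used only for illustration.

```cpp
#include <cassert>

// Hypothetical helper spelling out what "*ptr = value" does.
int store_through(int* ptr, int value) {
    int* address = ptr;  // step 1: read the pointer variable itself
    *address = value;    // step 2: write to the object it points to
    return *address;     // read the stored value back
}
```

With a properly initialized pointer, for example int x = 5; store_through(&x, 0);, both steps are well defined and x ends up 0. With the question's uninitialized ptr, the program is already in trouble at step 1, and in fact even earlier, since passing ptr as an argument reads it.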
范围内的值。由于以这种方式使用无符号整数时也可能会导致意外行为,因此没有理由期望指针不会以更糟糕的方式表现。Even if the normal storage of something in memory would have no "room" for any trap bits or trap representations, implementations are not required to store automatic variables the same way as static-duration variables except when there is a possibility that user code might hold a pointer to them somewhere. This behavior is most visible with integer types. On a typical 32-bit system, given the code:
it would not be particularly surprising for
test
to yield 65540 even though that value is outside the representable range ofuint16_t
, a type which has no trap representations. If a local variable of typeuint16_t
holds Indeterminate Value, there is no requirement that reading it yield a value within the range ofuint16_t
. Since unexpected behaviors could result when using even unsigned integers in such fashion, there's no reason to expect that pointers couldn't behave in even worse fashion.