Does the offsetof macro invoke undefined behaviour?

Posted on 2024-11-17 01:31:34

Example from MSVC's implementation:

#define offsetof(s,m) \
    (size_t)&reinterpret_cast<const volatile char&>((((s *)0)->m))
//                                                   ^^^^^^^^^^^

As can be seen, it dereferences a null pointer, which normally invokes undefined behaviour. Is this an exception to the rule or what is going on?

Comments (6)

べ繥欢鉨o。 2024-11-24 01:31:34

Where the language standard says "undefined behavior", any given compiler can define the behavior. Implementation code in the standard library typically relies on that. So there are two questions:

(1) Is the code UB with respect to the C++ standard?

That's a really hard question, because it's a well-known almost-defect that the C++98/03 standard never says outright in normative text that, in general, it's UB to dereference a null pointer. It is implied by the exception for typeid, where it's not UB.

What you can say decidedly is that it's UB to use offsetof with a non-POD type.
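As an illustration of the point about non-POD types, here is a minimal sketch of using the standard offsetof from <cstddef> while guarding the type with a compile-time check. The Packet struct and the helper function names are hypothetical and not from the original post. It assumes C++11 or later, where the requirement is phrased in terms of standard-layout types rather than POD.

```cpp
#include <cassert>
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout

// Hypothetical example type, used only for this illustration.
struct Packet {
    char   tag;
    int    length;
    double payload;
};

// Reject types for which offsetof is not guaranteed to be well-defined.
static_assert(std::is_standard_layout<Packet>::value,
              "offsetof is only guaranteed for standard-layout types");

// Hypothetical helpers that expose the offsets computed by the library macro.
std::size_t tag_offset()     { return offsetof(Packet, tag); }
std::size_t length_offset()  { return offsetof(Packet, length); }
std::size_t payload_offset() { return offsetof(Packet, payload); }
```

The exact values of the second and third offsets depend on the platform's padding rules, so the sketch deliberately does not state them; only the first member's offset of 0 and the ordering of the members are guaranteed.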

(2) Is the code UB with respect to the compiler that it's written for?

No, of course not.

A compiler vendor's code for a given compiler can use any feature of that compiler.

Cheers & hth.,

苦妄 2024-11-24 01:31:34

The notion of "undefined behavior" is not applicable to the implementation of the Standard Library, regardless of whether it is a macro, a function or anything else.

In general, the Standard Library should not be seen as implemented in the C++ (or C) language. That applies to standard header files as well. The Standard Library should conform to its external specification, but everything else is an implementation detail, exempt from any and all other requirements of the language. The Standard Library should always be thought of as implemented in some "internal" language, which might closely resemble C++ or C, but still is not C++ or C.

In other words, the macro you quoted does not produce undefined behavior, as long as it is specifically the offsetof macro defined in the Standard Library. But if you do exactly the same thing in your code (like define your own macro in the very same way), it will indeed result in undefined behavior. "Quod licet Jovi, non licet bovi".

一曲爱恨情仇 2024-11-24 01:31:34

When the C Standard specifies that certain actions invoke Undefined Behavior, that has not generally meant that such actions were forbidden, but rather that implementations were free to specify the consequent behaviors, or not, as they see fit. Consequently, implementations would be free to perform such actions in cases where the Standard requires defined behavior, if and only if the implementations can guarantee that the behaviors for those actions will be consistent with what the Standard requires. Consider, for example, the following implementation of strcpy:

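#include <stddef.h>  /* ptrdiff_t */

char *strcpy(char *dest, char const *src)
{
  /* Undefined by the Standard if dest and src are unrelated pointers;
     this relies on a platform that defines the result. */
  ptrdiff_t diff = dest-src-1;
  int ch;
  do
  {
    ch = *src++;
    ((char *)src)[diff] = (char)ch;  /* stores to dest[i], including the final 0 */
  } while (ch != 0);
  return dest;
}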

If src and dest are unrelated pointers, the computation of dest-src would yield Undefined Behavior. On some platforms, however, the relation between char* and ptrdiff_t is such that, given any char* p1, p2, the computation p1 + (p2-p1) will always equal p2. On platforms which make that guarantee, the above implementation of strcpy would be legitimate (and on some such platforms might be faster than any plausible alternative). On some other platforms, however, such a function might always fail except when both strings are part of the same allocated object.
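For contrast, the case that the Standard itself does define can be sketched as follows: when both pointers refer to the same array (or one past its end), p1 + (p2-p1) is guaranteed to equal p2 on every conforming implementation. The names round_trip and same_array_round_trip are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>  // std::ptrdiff_t

// Hypothetical helper: computes p1 + (p2 - p1).
// The subtraction is defined only when p1 and p2 point into the same
// array object (or one past its last element).
inline const char* round_trip(const char* p1, const char* p2) {
    std::ptrdiff_t d = p2 - p1;
    return p1 + d;
}

// Hypothetical check using two pointers into a single array.
inline bool same_array_round_trip() {
    char buffer[16] = "offsetof";
    const char* first = buffer;
    const char* last  = buffer + sizeof buffer;  // one past the end is allowed
    return round_trip(first, last) == last &&
           round_trip(last, first) == first;
}
```

Passing pointers into two separately allocated objects to round_trip would be the situation the answer describes: undefined by the Standard, though possibly defined by a particular platform.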

The same principle applies to the offsetof macro. There is no requirement that compilers offer any way to get behavior equivalent to offsetof (other than by actually using that macro). If a compiler's model for pointer arithmetic makes it possible to get the required offsetof behavior by using the -> operator on a null pointer, then its offsetof macro can do that. If a compiler wouldn't support any efforts to use -> on something other than a legitimate pointer to an instance of the type, then it may need to define an intrinsic which can compute a field offset and define the offsetof macro to use that. What's important is not that the Standard define the behaviors of actions performed using standard-library macros and functions, but rather that the implementation ensures that behaviors of such macros and functions match requirements.
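The intrinsic route mentioned above can be sketched with GCC and Clang, which provide a __builtin_offsetof built-in. The macro name FIELD_OFFSET and the Header struct are hypothetical and exist only for this illustration; on other compilers the sketch simply falls back to the standard macro. This is an assumption-laden sketch of the idea, not a description of how any particular library header is written.

```cpp
#include <cassert>
#include <cstddef>  // offsetof, std::size_t

// FIELD_OFFSET is a hypothetical name used only for this illustration.
#if defined(__GNUC__) || defined(__clang__)
#  define FIELD_OFFSET(type, member) __builtin_offsetof(type, member)
#else
#  define FIELD_OFFSET(type, member) offsetof(type, member)
#endif

// Hypothetical standard-layout type.
struct Header {
    char  kind;
    short flags;
    long  id;
};

// No null pointer appears anywhere in the source; the compiler computes
// the offset itself, and the result is usable in constant expressions.
static_assert(FIELD_OFFSET(Header, kind) == 0, "first member is at offset 0");
static_assert(FIELD_OFFSET(Header, id) == offsetof(Header, id),
              "agrees with the standard macro");
```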

酒儿 2024-11-24 01:31:34

This is basically equivalent to asking whether this is UB:

s* p = 0;
volatile auto& r = p->m;

Clearly no memory access is generated to the target of r, because it's volatile and the compiler is prohibited from generating spurious accesses to volatile variables. But *p is not volatile, so the compiler could possibly generate an access to it. Neither the address-of operator nor casting to reference type creates an unevaluated context according to the standard.

So, I don't see any reason for the volatile, and I agree with the others that this is undefined behavior according to the standard. Of course, any compiler is permitted to define behavior where the standard leaves it implementation-specified or undefined.

Finally, a note in section [dcl.ref] says

in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the "object" obtained by dereferencing a null pointer, which causes undefined behavior.

猫九 2024-11-24 01:31:34

It is NOT undefined behavior in C++ if m is at offset 0 within the structure s, as well as in certain other cases. According to Issue 232 (emphasis mine):

The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points, if any. If the pointer is a null pointer value (7.11 [conv.ptr]) or points one past the last element of an array object (8.7 [expr.add]), the result is an empty lvalue and does not refer to any object or function. An empty lvalue is not modifiable.

Therefore, &((s *)0)->m is undefined behavior only if m is neither at offset 0 nor at an offset corresponding to an address which is one past the last element of an array object. Note that adding a 0 offset to null is allowed in C++ but not in C.

As others have noted, the compiler is allowed (and extremely likely) to not ever create the undefined behavior, and may be packaged with libraries that make use of the specific compiler's enhanced specifications.

赤濁 2024-11-24 01:31:34

No, this is NOT undefined behaviour. The expression is resolved at runtime.

Note that it is taking the address of the member m from a null pointer. It is NOT dereferencing the null pointer.
