Can an empty struct (or struct padding) have an arbitrary value?

Posted 2025-02-08 04:48:27

Let's say I have this code that has UB:

union Flag {
    constexpr Flag() : empty{} {}

    struct {} empty;
    bool value;
};

static Flag flag;

int main() {
    return flag.value;
}

where the UB is accessing value when it’s not the active member of the union Flag.

Currently, UBSan will not catch this error because (as I understand it) UBSan does not have a way to track the last-written member of a union. For this particular case, I think UBSan could catch the UB indirectly via its existing check for bool values other than true/false. If the byte of an empty struct is considered “uninitialized” or “can have any arbitrary value”, then the compiler could legally set the nominal byte of this empty struct to any value other than 0 or 1, and UBSan would then be able to catch the load of an invalid value for bool.
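To make the idea concrete, here is a minimal sketch of the kind of validity check involved, written so that it never actually loads an invalid bool. It assumes a common ABI where bool occupies one byte and only the bit patterns 0 and 1 are valid; the function name checked_bool_from_byte is hypothetical, and C++17 is assumed for std::optional. This is not how UBSan is implemented, only an illustration of what a check for non-true/false values has to look at.

```cpp
#include <cassert>
#include <cstring>
#include <optional>

// Assumption: bool is a single byte whose only valid patterns are 0 and 1.
static_assert(sizeof(bool) == 1, "sketch assumes a one-byte bool");

// Hypothetical helper: inspect the raw byte first, and only form a bool
// when the pattern is one that the assumed ABI treats as valid.
std::optional<bool> checked_bool_from_byte(unsigned char raw) {
    if (raw > 1) {
        return std::nullopt;  // a sanitizer would report an invalid bool here
    }
    bool result = false;
    std::memcpy(&result, &raw, sizeof(result));  // copies a valid pattern only
    return result;
}
```

If the nominal byte of the empty struct were filled with, say, 0xAA, a check of this shape would reject the subsequent read of value instead of silently returning true.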

What I'd like to know is: Is it semantically permissible to have the nominal byte of empty structs, and more generally any padding bytes in all structs, be initialized to any non-zero pattern?

Comments (2)

无法言说的痛 2025-02-15 04:48:27

Is it semantically permissible to have the nominal byte of empty structs, and more generally any padding bytes in all structs, be initialized to any non-zero pattern?

Any bytes that are not part of the value representation of a type are fair game, as far as the compiler is concerned. Well, to some degree.

You can validly memcpy into such bytes, but this is only valid if the source data comes (directly or indirectly) from an existing object of that type. This is from [basic.types]/2 and /3. So within the object model, the user doesn't get to just put whatever they like in that storage.
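As a minimal sketch of the copy that [basic.types] describes, the following round-trips an object through a byte buffer and back into an object of the same type. The struct Padded and the function copy_via_bytes are hypothetical names introduced for illustration; on typical ABIs Padded has padding bytes after tag, but that layout is an assumption, not a guarantee.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical type; commonly laid out with padding between tag and count.
struct Padded {
    char tag;
    int count;
};

static_assert(std::is_trivially_copyable_v<Padded>,
              "byte-wise copying is only specified for trivially copyable types");

// Copy the object representation out to a byte buffer and back.
// The bytes written into target originate from an existing Padded object,
// which is the case the standard wording covers.
Padded copy_via_bytes(const Padded& source) {
    unsigned char buffer[sizeof(Padded)];
    std::memcpy(buffer, &source, sizeof(Padded));

    Padded target{};
    std::memcpy(&target, buffer, sizeof(Padded));
    return target;
}
```

Only the value (tag and count) is guaranteed to survive the round trip; the sketch makes no claim about what the padding bytes contain before or after the copy.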

As such, for code that's living within the C++ object model, an implementation is allowed to play around with the contents of padding bytes.

C++20's implicit object creation rules make this rather more difficult, as they allow uninitialized objects to be manifested in storage that already has bytes in it. These manifestations aren't typically associated with code like placement-new, so it would be very difficult for UBSan to initialize such a thing.

The union you've shown is an implicit-lifetime type (due to the trivial copy/move constructors), so users can play games with such things.
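The properties behind that claim can be checked at compile time. The sketch below restates the Flag union from the question and tests for trivial copy/move constructors and a trivial destructor, which is what makes a non-aggregate class type implicit-lifetime. The helper name has_trivial_copy_move_and_destructor is hypothetical, and C++17 is assumed for the _v trait variables.

```cpp
#include <cassert>
#include <type_traits>

// The union from the question, repeated so the example is self-contained.
union Flag {
    constexpr Flag() : empty{} {}

    struct {} empty;
    bool value;
};

// Hypothetical helper: true when T has trivial copy and move constructors
// and a trivial destructor.
template <class T>
constexpr bool has_trivial_copy_move_and_destructor() {
    return std::is_trivially_copy_constructible_v<T> &&
           std::is_trivially_move_constructible_v<T> &&
           std::is_trivially_destructible_v<T>;
}

// The user-provided default constructor does not affect these properties.
static_assert(has_trivial_copy_move_and_destructor<Flag>(),
              "Flag is expected to qualify as an implicit-lifetime type");
```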

北方的韩爷 2025-02-15 04:48:27

Certain parts of the Standard only really make sense if one accepts the possibility that certain union objects may sometimes be eligible to be read as any type until the next time they are written. Most notably, if a union object containing only trivial types is written using fread, memcpy, or other such means, from a byte source that has no identifiable association with anything in the union, there would often be no way a compiler could know which union member was being initialized, so reading any member for which the byte sequence would be a valid representation would be valid.

A compiler would not have to zero-fill an empty structure within a union, and a conforming but capricious compiler could decide to zero-fill structures only when their bitwise representation is observed using character types. However, if code uses a character type to observe the initial byte pattern of a union that is never written, and also reads that storage using another type such as bool, the compiler would be required to do one of two things: treat the bit pattern as valid in the latter read, or ensure that the character-type access reports a bit pattern that is not valid for that type. The easiest way for a compiler to uphold that requirement is to simply treat the access as valid.
