Why does this bad universal initializer syntax compile and result in unpredictable behavior?
I have a bunch of code for working with hardware (FPGA) registers, which is roughly of the form:
#include <cstdint>

struct SomeRegFields {
    unsigned int lower : 16;
    unsigned int upper : 16;
};
union SomeReg {
    uint32_t wholeReg;
    SomeRegFields fields;
};
(Most of these register types are more complex. This is illustrative.)
While cleaning up a bunch of code that set up registers in the following way:
SomeReg reg1;
reg1.wholeReg = 0;
// ... assign individual fields
card->writeReg(REG1_ADDRESS, reg1.wholeReg);
SomeReg reg2;
reg2.wholeReg = card->readReg(REG2_ADDRESS);
// ... do something with reg2 field values
I got a bit absent-minded and accidentally ended up with the following:
SomeReg reg1{ reg1.wholeReg = 0 };
SomeReg reg2{ reg2.wholeReg = card->readReg(REG2_ADDRESS) };
The reg1.wholeReg = part is wrong, of course, and should be removed.
What's bugging me is that this compiles on both MSVC and GCC. I would have expected a syntax error here. Moreover, sometimes it works fine and the value actually gets copied/assigned correctly, but other times it results in a 0 value even if the register value returned is non-0. It's unpredictable, but which cases work and which don't appears to be consistent between runs.
Any idea why the compilers don't flag this as bad syntax, and why it seems to work in some cases but breaks in others? I assume this is undefined behavior, of course, but why would it change behavior between what often seem like nearly identical calls, often back-to-back?
Some compilation info:
If I run this through Compiler Explorer:
int main()
{
    SomeReg myReg { myReg.wholeReg = 10 };
    return myReg.fields.upper;
}
This is the code GCC trunk spits out for main with optimization off (-O0):
main:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], 10
*       mov     eax, DWORD PTR [rbp-4]
*       mov     DWORD PTR [rbp-4], eax
        movzx   eax, WORD PTR [rbp-2]
        movzx   eax, ax
        pop     rbp
        ret
The lines marked with * are the only difference between this version and a version without the bad myReg.wholeReg = part. MSVC gives similar results, though even with optimization off, it seems to be doing some. In this case, it just causes an extra assignment into and back out of a register, so it still works as intended, but given my accidental experimental results, it must not always compile this way in more complex cases, i.e. when not assigning from a compile-time-deducible value.
Comments (1)
This is simply treated as an expression. You are assigning the return value of card->readReg(REG2_ADDRESS) to reg1.wholeReg and then you use the result of this expression (an lvalue referring to reg1.wholeReg) to aggregate-initialize the first member of reg2 (i.e. reg2.wholeReg). Afterwards reg1 and reg2 should hold the same value, the return value of the function.
Syntactically the same happens in
SomeReg reg1{ reg1.wholeReg = 0 };
However, here it is technically undefined behavior since you are not allowed to access variables or class members before they are initialized. Practically speaking, I would expect this to usually work nonetheless, initializing reg1.wholeReg to 0 and then once again.
Referring to a variable in its own initializer is syntactically correct and may sometimes be useful (e.g. to pass a pointer to the variable itself). This is why there is no compilation error.
This has additional undefined behavior, even if you fix the initialization, because you can't use a union in C++ for type punning at all. That is always undefined behavior, although some compilers might allow it to the degree that is allowed in C. Still, the standard does not allow reading fields.upper if wholeReg is the active member of the union (meaning the last member to which a value was assigned).
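One way to sidestep the union entirely is to keep the register as a plain 32-bit value and extract the fields with shifts and masks. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, and the layout (lower in bits 0 to 15, upper in bits 16 to 31) is assumed to mirror the example register, since the actual bit-field layout is implementation-defined.

```cpp
#include <cstdint>

// Hypothetical helpers; the field positions are an assumption chosen to
// mirror the example register (lower = bits 0-15, upper = bits 16-31).
constexpr std::uint32_t lowerField(std::uint32_t reg) {
    return reg & 0xFFFFu;
}

constexpr std::uint32_t upperField(std::uint32_t reg) {
    return (reg >> 16) & 0xFFFFu;
}

// Packs the two 16-bit fields back into a whole register value.
constexpr std::uint32_t makeReg(std::uint32_t lower, std::uint32_t upper) {
    return (lower & 0xFFFFu) | ((upper & 0xFFFFu) << 16);
}
```

With helpers like these, the read path could look like std::uint32_t raw = card->readReg(REG2_ADDRESS); followed by upperField(raw), and no union member is ever read while another one is active. If the union is kept, the initialization the question was aiming for is simply SomeReg reg2{ card->readReg(REG2_ADDRESS) };.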