Wrong value assigned after type casting
I am using a native unsigned long variable as a buffer to hold two unsigned short values. As far as I know C++, this should be a valid method. I have used the same approach many times to store two unsigned char values inside one unsigned short without any problem. Unfortunately, on a different architecture it behaves strangely: the value only seems to be updated after the following assignment. The (Overflow) case is there simply to demonstrate this. Can anyone shed some light on why it behaves that way?
unsigned long dwTest = 0xFFEEDDCC;
printf("sizeof(unsigned short) = %d\n", sizeof(unsigned short));
printf("dwTest = %08X\n", dwTest);
//Address + values
printf("Addresses + Values: %08X <- %08X, %08X <- %08X\n", (DWORD)(&((unsigned short*)&dwTest)[0]), (((unsigned short*)&dwTest)[0]), (DWORD)(&((unsigned short*)&dwTest)[1]), (((unsigned short*)&dwTest)[1]) );
((unsigned short*)&dwTest)[0] = (WORD)0xAAAA;
printf("dwTest = %08X\n", dwTest);
((unsigned short*)&dwTest)[1] = (WORD)0xBBBB;
printf("dwTest = %08X\n", dwTest);
//(Overflow)
((unsigned short*)&dwTest)[2] = (WORD)0x9999;
printf("dwTest = %08X\n", dwTest);
Visual C++ 2010 output (OK):
sizeof(unsigned short) = 2
dwTest = FFEEDDCC
Addresses + Values: 0031F728 <- 0000DDCC, 0031F72A <- 0000FFEE
dwTest = FFEEAAAA
dwTest = BBBBAAAA
dwTest = BBBBAAAA
ARM9 GCC Crosstool output (Doesn't work):
sizeof(unsigned short) = 2
dwTest = FFEEDDCC
Addresses + Values: 7FAFECD8 <- 0000DDCC, 7FAFECDA <- 0000FFEE
dwTest = FFEEDDCC
dwTest = FFEEAAAA
dwTest = BBBBAAAA
Comments (1)
What you are trying to do is called type-punning. There are two traditional ways to do it.
One way is via pointers, which is what you have done. Unfortunately, this conflicts with the optimizer. In the general case the optimizer cannot prove that two pointers do not alias each other (doing so is as hard as the halting problem). The compiler therefore has to reload any value that may have been modified through a pointer, resulting in many potentially unnecessary reloads.
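To make the reload problem concrete, here is a minimal sketch. The names add_twice, demo_distinct and demo_aliased are invented for illustration and do not come from the question. Because dst and src have the same type, they may point at the same int, so the compiler has to read *src again after the first store.

```cpp
// Hypothetical illustration: all names are invented for this sketch.

// Adds *src to *dst twice. dst and src may alias, so the compiler
// must re-read *src after the first store to *dst.
void add_twice(int* dst, const int* src) {
    *dst += *src;
    *dst += *src;
}

// Distinct objects: 1 + 1 + 1.
int demo_distinct() {
    int a = 1;
    int b = 1;
    add_twice(&a, &b);
    return a;
}

// Same object: 1 + 1 = 2, then 2 + 2 = 4.
int demo_aliased() {
    int x = 1;
    add_twice(&x, &x);
    return x;
}
```

If the compiler cached *src in a register across the first store, demo_aliased would give the wrong answer, which is why it cannot do so without extra information about aliasing.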
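So, the strict-aliasing rule was introduced. It basically says that two pointers can only alias each other when they are of the same type. As a special rule, a char * can alias any other pointer (but not the other way around). This breaks type-punning via pointers (the compiler may assume that a store through an unsigned short * does not change an unsigned long, which would explain the stale values in your ARM output) and lets the compiler generate more efficient code. When gcc detects type-punning and has warnings enabled (-Wstrict-aliasing), it warns that dereferencing a type-punned pointer will break strict-aliasing rules.
Another way to do type-punning is via a union.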
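A minimal sketch of the union approach applied to the value from the question, assuming a 32-bit std::uint32_t split into two std::uint16_t halves. The names Split32 and set_halves are invented here. Reading a member other than the one last written is the behavior C99 TC3 describes. ISO C++ does not promise it, although GCC documents it as supported when the access goes through the union type. Which half ends up in the low 16 bits depends on byte order.

```cpp
#include <cstdint>

// Hypothetical union for this sketch: both members share the same 4 bytes.
union Split32 {
    std::uint32_t whole;
    std::uint16_t half[2];
};

// Overwrites both 16-bit halves of `value` and returns the combined result.
// Reading `whole` after writing `half` is type-punning through the union:
// described by C99 TC3, a documented GCC extension in C++, not ISO C++.
std::uint32_t set_halves(std::uint32_t value, std::uint16_t first, std::uint16_t second) {
    Split32 s;
    s.whole = value;
    s.half[0] = first;   // low half on little-endian, high half on big-endian
    s.half[1] = second;
    return s.whole;
}
```

On a little-endian target, set_halves(0xFFEEDDCC, 0xAAAA, 0xBBBB) yields 0xBBBBAAAA, matching the Visual C++ output in the question.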
This opens up a whole new can of worms. In the best case, this is implementation-dependent. Now, I don't have access to the C89 standard, but C99 originally stated that the value of a union member other than the last one stored into is unspecified. This was changed in a TC to state that the values of bytes that don't correspond to the last stored-into member are unspecified, and that the bytes that do correspond to the last stored-into member are reinterpreted as per the new type (something which is obviously implementation-dependent).
For C++, I can't find any language about the union hack in the standard. Anyway, C++ has reinterpret_cast<>, which is what you should use to spell type-punning in C++ (use the reference variant of reinterpret_cast<>); note that the strict-aliasing rule still applies to the resulting access.
In any case, you probably shouldn't be using type-punning at all (it is implementation-dependent); build up your values manually via bit-shifting instead.
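The bit-shifting approach could look roughly like the following sketch. The helper names (set_low, set_high, get_low, get_high) are hypothetical, and the code assumes the 32-bit value is held in a std::uint32_t rather than an unsigned long, whose width varies between platforms. No pointer casts are involved, so neither strict aliasing nor byte order affects the result.

```cpp
#include <cstdint>

// Hypothetical helpers for this sketch; none of these names come from the question.

// Replaces the low 16 bits of `value` with `low`.
std::uint32_t set_low(std::uint32_t value, std::uint16_t low) {
    return (value & 0xFFFF0000u) | low;
}

// Replaces the high 16 bits of `value` with `high`.
std::uint32_t set_high(std::uint32_t value, std::uint16_t high) {
    return (value & 0x0000FFFFu) | (static_cast<std::uint32_t>(high) << 16);
}

// Extracts the low 16 bits.
std::uint16_t get_low(std::uint32_t value) {
    return static_cast<std::uint16_t>(value & 0xFFFFu);
}

// Extracts the high 16 bits.
std::uint16_t get_high(std::uint32_t value) {
    return static_cast<std::uint16_t>(value >> 16);
}
```

Starting from 0xFFEEDDCC, set_low with 0xAAAA gives 0xFFEEAAAA, and a following set_high with 0xBBBB gives 0xBBBBAAAA, on any architecture.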