Conceptual problem in Union

Posted on 2024-10-03 01:46:35

My code is this:

// using_a_union.cpp
#include <stdio.h>

union NumericType
{
    int         iValue;
    long        lValue;  
    double      dValue;  
};

int main()
{
    union NumericType Values = { 10 };   // iValue = 10
    printf("%d\n", Values.iValue);
    Values.dValue = 3.1416;
    printf("%d\n", Values.iValue); // garbage value
}

Why do I get a garbage value when I try to print Values.iValue after doing Values.dValue = 3.1416?
I thought the memory layout would be like this. What happens to Values.iValue and
Values.lValue when I assign something to Values.dValue?

Comments (4)

吝吻 2024-10-10 01:46:35

In a union, all of the data members overlap. You can only use one data member of a union at a time.

iValue, lValue, and dValue all occupy the same space.

As soon as you write to dValue, the iValue and lValue members are no longer usable: only dValue is usable.


Edit: To address the comments below: You cannot write to one data member of a union and then read from another data member. To do so results in undefined behavior. (There's one important exception: you can reinterpret any object in both C and C++ as an array of char. There are other minor exceptions, like being able to reinterpret a signed integer as an unsigned integer.) You can find more in both the C Standard (C99 6.5/6-7) and the C++ Standard (C++03 3.10, if I recall correctly).

Might this "work" in practice some of the time? Yes. But unless your compiler expressly states that such reinterpretation is guaranteed to work correctly and specifies the behavior that it guarantees, you cannot rely on it.
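
One common way to respect the "one member at a time" rule is to store a tag next to the union and check it before every read. The following is only a sketch of that idea: `Kind`, `TaggedNumeric`, `make_int`, `make_double`, `try_get_int`, and `print_numeric` are names invented for this illustration and are not part of the question's code.

```cpp
#include <cstdio>

// Hypothetical tag that records which union member was written last.
enum class Kind { Int, Long, Double };

struct TaggedNumeric {
    Kind kind;
    union {                // anonymous union: members are accessed directly
        int    iValue;
        long   lValue;
        double dValue;
    };
};

TaggedNumeric make_int(int v) {
    TaggedNumeric n;
    n.kind = Kind::Int;
    n.iValue = v;          // iValue is now the active member
    return n;
}

TaggedNumeric make_double(double v) {
    TaggedNumeric n;
    n.kind = Kind::Double;
    n.dValue = v;          // dValue is now the active member
    return n;
}

// Reads iValue only when the tag says it is the active member.
bool try_get_int(const TaggedNumeric& n, int& out) {
    if (n.kind != Kind::Int) return false;
    out = n.iValue;
    return true;
}

// Prints whichever member is active, never an inactive one.
void print_numeric(const TaggedNumeric& n) {
    switch (n.kind) {
        case Kind::Int:    std::printf("%d\n", n.iValue);  break;
        case Kind::Long:   std::printf("%ld\n", n.lValue); break;
        case Kind::Double: std::printf("%g\n", n.dValue);  break;
    }
}

// Example use:
//   TaggedNumeric v = make_int(10);
//   print_numeric(v);          // reads iValue
//   v = make_double(3.1416);
//   print_numeric(v);          // reads dValue, never iValue
```

With the tag in place, the mistake from the question (reading iValue after writing dValue) shows up as a failed check instead of a silent reinterpretation.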

疑心病 2024-10-10 01:46:35

Because floating-point numbers are represented differently from integers.

All of those variables occupy the same area of memory (with the double obviously occupying more of it). If you try to read the first four bytes of that double as an int, you are not going to get back what you expect. You are dealing with raw memory layout here, and you need to know how these types are represented.


EDIT: I should have also added (as James has already pointed out) that writing to one variable in a union and then reading from another does invoke undefined behavior and should be avoided (unless you are re-interpreting the data as an array of char).
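
To make "represented differently" concrete, here is a small sketch that splits a double into its sign, exponent, and fraction fields. It assumes the platform uses IEEE 754 binary64 doubles (checked with a static_assert); `DoubleFields` and `decode` are names made up for this example. It uses memcpy rather than a union, since copying bytes is well defined.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "this sketch assumes IEEE 754 binary64 doubles");

// Hypothetical holder for the three fields of a binary64 value.
struct DoubleFields {
    unsigned      sign;      // 1 bit
    unsigned      exponent;  // 11 bits, stored with a bias of 1023
    std::uint64_t fraction;  // 52 bits
};

// Copies the bytes of d into a 64-bit integer and splits them into fields.
DoubleFields decode(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    DoubleFields f;
    f.sign     = static_cast<unsigned>(bits >> 63);
    f.exponent = static_cast<unsigned>((bits >> 52) & 0x7FF);
    f.fraction = bits & ((std::uint64_t{1} << 52) - 1);
    return f;
}

// Example use:
//   DoubleFields f = decode(3.1416);
//   // f.sign == 0, f.exponent == 1024 (that is, 2^1 after removing the bias)
```

Under that assumption, 3.1416 is stored as a biased exponent of 1024 plus a 52-bit fraction, which has nothing in common with the plain binary 1010 that an int uses for 10.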

柠檬色的秋千 2024-10-10 01:46:35

Well, let's look at a simpler example first. Ed's answer covers the floating-point part, but how about we examine how ints and chars are stored first?

Here's an example I just coded up:

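#include <cstdio>    // getchar
#include <iostream>
using namespace std;

union Color {
    int value;
    struct {         // anonymous struct: a common compiler extension, not standard C++
        unsigned char R, G, B, A;
    };
};

int main()
{
    Color c;
    c.value = 0xFFCC0000;
    // Print each byte of value in memory order (lowest address first).
    cout << (int)c.R << ", " << (int)c.G << ", " << (int)c.B << ", " << (int)c.A << endl;
    getchar();       // wait for a key press so the console stays open
    return 0;
}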

What would you expect the output to be?

255, 204, 0, 0

Right?

If an int is 32 bits, and each of the chars is 8 bits, then R should correspond to the left-most byte, G to the second one, and so forth.

But that's wrong. At least on my machine/compiler, it appears ints are stored in reverse byte order. I get:

0, 0, 204, 255

So to make this give the output we'd expect (or the output I would have expected anyway), we have to change the struct to A,B,G,R. This has to do with endianness.

Anyway, I'm not an expert on this stuff, just something I stumbled upon when trying to decode some binaries. The point is, floats aren't necessarily encoded the way you'd expect either... you have to understand how they're stored internally to understand why you're getting that output.
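
If the goal is to pull channels out of a packed value regardless of byte order, shifting and masking avoids the issue, because arithmetic works on the value rather than on its bytes in memory. A rough sketch, with `is_little_endian` and `channel` as made-up helper names:

```cpp
#include <cstdint>
#include <cstring>

// True when the least significant byte is stored at the lowest address.
bool is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);   // look at the byte at the lowest address
    return first == 1;
}

// Extracts one 8-bit channel by arithmetic, independent of byte order.
// index 0 is the most significant byte of the 32-bit value, index 3 the least.
unsigned channel(std::uint32_t value, int index) {
    return (value >> (8 * (3 - index))) & 0xFFu;
}

// Example use:
//   channel(0xFFCC0000u, 0)  // 255
//   channel(0xFFCC0000u, 1)  // 204
```

On a little-endian machine `is_little_endian()` returns true, which is why the union version above shows the bytes in the opposite order from the way the hex literal is written.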

烟雨凡馨 2024-10-10 01:46:35

You've done this:

union NumericType Values = { 10 };   // iValue = 10 
printf("%d\n", Values.iValue); 
Values.dValue = 3.1416; 

The way a compiler uses memory for this union is similar to declaring a single variable of the member type with the largest size and alignment (any of them if there are several), and applying a reinterpret_cast whenever one of the other member types is written or read, as in:

// Conceptual illustration only: these casts break the strict aliasing rules.
double dValue; // creates a variable with alignment & space
               // as per "union NumericType Values"
*reinterpret_cast<int*>(&dValue) = 10; // separate step equiv. to = { 10 }
printf("%d\n", *reinterpret_cast<int*>(&dValue)); // print as int
dValue = 3.1416;                                  // assign as double
printf("%d\n", *reinterpret_cast<int*>(&dValue)); // now print as int

The problem is that in setting dValue to 3.1416 you've completely overwritten the bits that used to hold the number 10. The new value may appear to be garbage, but it's simply the result of interpreting the first sizeof(int) bytes of the double 3.1416 as if they held a meaningful int value.

If you want the two things to be independent - so setting the double doesn't affect the earlier-stored int - then you should use a struct/class.
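
As a quick sketch of that difference (the type names `NumericUnion` and `NumericStruct` and the function `int_after_double_write` are invented here for illustration), the struct gives each member its own storage, so writing the double leaves the int alone:

```cpp
#include <cstddef>

union  NumericUnion  { int iValue; double dValue; };  // members overlap
struct NumericStruct { int iValue; double dValue; };  // members are separate

// In a union every member starts at offset 0. In a struct each member has
// its own storage, so the struct is at least as large as both together.
static_assert(sizeof(NumericUnion) >= sizeof(double),
              "a union is at least as large as its largest member");
static_assert(sizeof(NumericStruct) >= sizeof(int) + sizeof(double),
              "a struct has room for all of its members");
static_assert(offsetof(NumericStruct, dValue) >= sizeof(int),
              "dValue does not overlap iValue in the struct");

// Writes the double member of a struct and returns the untouched int member.
int int_after_double_write() {
    NumericStruct s = { 10, 0.0 };
    s.dValue = 3.1416;   // touches only dValue's storage
    return s.iValue;     // still 10
}
```

The price is size: the struct needs room for both members (plus any padding), while the union needs room only for the largest one.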

It may help you to consider this program:

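#include <cstddef>   // std::size_t
#include <cstdint>   // std::uint8_t
#include <iostream>

// Prints the n bytes starting at pv as groups of 8 bits, lowest address first.
void print_bits(std::ostream& os, const void* pv, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        std::uint8_t byte = static_cast<const std::uint8_t*>(pv)[i];
        for (int j = 0; j < 8; ++j)
            os << ((byte & (128 >> j)) ? '1' : '0');  // most significant bit first
        os << ' ';
    }
}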

union X
{
    int i;
    double d;
};

int main()
{
    X x = { 10 };
    print_bits(std::cout, &x, sizeof x);
    std::cout << '\n';
    x.d = 3.1416;
    print_bits(std::cout, &x, sizeof x);
    std::cout << '\n';
}

Which, for me, produced this output:

00001010 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
10100111 11101000 01001000 00101110 11111111 00100001 00001001 01000000

Crucially, the first half of each line shows the 32 bits that are used for the int member (i here, iValue in the question): note that the binary 1010 in the least significant byte (on the left on an Intel CPU like mine) is 10 decimal. Writing 3.1416 changes the entire 64 bits to a pattern representing 3.1416 (see http://en.wikipedia.org/wiki/Double_precision_floating-point_format). The old 1010 pattern is overwritten, clobbered, an electromagnetic memory no more.
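
To see which number the "garbage" actually is, the first sizeof(int) bytes of the double can be copied into an int with memcpy, which is well defined. This is a sketch only, and `leading_bytes_as_int` is a made-up helper name. On a little-endian machine with a 32-bit int and IEEE 754 doubles, like the one that produced the output above, the first four bytes of 3.1416 are A7 E8 48 2E, which read back as the int 0x2E48E8A7 (776530087 in decimal). Other platforms may give a different number.

```cpp
#include <cstring>

// Returns the int formed by the first sizeof(int) bytes of d,
// i.e. the bytes that a union's int member would share with the double.
int leading_bytes_as_int(double d) {
    static_assert(sizeof(int) <= sizeof(double), "int must fit inside double");
    int result;
    std::memcpy(&result, &d, sizeof result);
    return result;
}

// Example use:
//   int garbage = leading_bytes_as_int(3.1416);
```

So the printed value is not random: it is a deterministic function of the double's bit pattern and the machine's byte order.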
