What does a union look like in memory?

Posted on 2024-08-12 05:39:08

enum Enums { k1, k2, k3, k4 };

union MYUnion { 
    struct U{ 
         char P;
    }u;

    struct U0 { 
        char state; 
    } u0; 

    struct U1 { 
        Enums e; 
        char c; 
        int v1; 
    } u1; 

    struct U2 { 
        Enums e; 
        char c; 
        int v1; 
        int v2; 
    } u2; 

    struct U3 { 
        Enums e; 
        unsigned int i; 
        char c; 
    } u3; 

    struct U4 { 
        Enums e;
        unsigned int i; 
        char c; 
        int v1; 
    } u4; 

    struct U5 { 
        Enums e; 
        unsigned int i; 
        char c; 
        int v1; 
        int v2; 
    } u5; 
} myUnion;

I'm confused by the whole idea of unions in C++. What does "myUnion" look like in memory? I know that the members share the same block of memory, but how? What is the size of "myUnion"? If it is the size of "u5", how is the data laid out in that block of memory?


Comments (5)

花伊自在美 2024-08-19 05:39:08
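  1. The size of the union is at least the size of its largest member (padding may be added for alignment).
  2. The contents of the union are whatever you last stored.

So if a union has an Enums e and an unsigned int i as direct members, storing into .i and then into .e overwrites the first sizeof(Enums) bytes of the int with the enum value (only the first byte if sizeof(Enums) is 1 in your environment). Note that in your union, e and i live inside the same struct, so they do not overlap each other; it is the structs u, u0, ... u5 that overlap.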

A union is like:

void *p = malloc(sizeof(biggest_item));   /* one block, big enough for the largest member */
Enums *ep = (Enums *)p;                   /* every pointer views the same block */
unsigned int *ip = (unsigned int *)p;
char *cp = (char *)p;

Assignments to *ep, *ip, and *cp work just like the union.
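
To make the size and overlap claims concrete, here is a minimal sketch that checks them with sizeof and offsetof. It uses a trimmed copy of the question's union (only u, u1 and u5 are kept), and the helper function names are made up for illustration. The exact byte counts depend on your compiler and platform, so the sketch only compares sizes and offsets against each other.

```cpp
#include <cassert>
#include <cstddef>

enum Enums { k1, k2, k3, k4 };

// Trimmed copy of the question's union: smallest, middle, and largest members.
union MYUnion {
    struct U  { char P; } u;
    struct U1 { Enums e; char c; int v1; } u1;
    struct U5 { Enums e; unsigned int i; char c; int v1; int v2; } u5;
};

// Hypothetical helpers: report the union's size and its largest member's size.
std::size_t union_size()   { return sizeof(MYUnion); }
std::size_t largest_size() { return sizeof(MYUnion::U5); }

// Every member of the union starts at offset 0, so u, u1 and u5 overlap.
bool members_share_start() {
    return offsetof(MYUnion, u) == 0 && offsetof(MYUnion, u1) == 0 &&
           offsetof(MYUnion, u5) == 0;
}

// Inside a single struct the fields follow one another; e and i do not overlap.
bool e_and_i_are_separate() {
    return offsetof(MYUnion::U5, i) >= sizeof(Enums);
}
```

Here the union is exactly as large as u5 because u5 is both the largest member and the most strictly aligned one. In general the union may be slightly larger than its largest member to satisfy alignment.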

2024-08-19 05:39:08

You are right to be confused! I'm confused... Let's look at something simple first before moving on to the more complex example you have.

First, union basics. A union means that when you create a variable of the union type, its members (i and f in the example below) really overlap in memory. It lets you sometimes treat that memory as an int and sometimes as a float. This can get nasty, and you really have to know what you're doing.

union AUnion
{
   int i;
   float f;
}; // assumes an int is 32 bits

AUnion aUnion;
aUnion.i = 0;
printf("%f", aUnion.f);

In the above code, what will be printed? To answer that, you have to understand how ints and floats are represented in memory. Both take up 32 bits here, but the bits are interpreted differently for each type. When I set aUnion.i = 0, I am saying "write a zero integer to aUnion", which sets all 32 bits to 0. When we then print aUnion.f, we are saying "treat the bits of aUnion as a 32-bit float and print it". The computer can treat any 32 bits as a float because it knows how a floating-point number is laid out in binary. In the common IEEE 754 format, all-zero bits mean 0.0, so this prints 0.000000. (Strictly speaking, C++ leaves reading a member other than the one last written undefined, although most compilers behave as described.)
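
The same reinterpretation can be sketched without a union by copying the bytes with std::memcpy, which is well defined in C++. The function name bits_to_float is hypothetical, and the sketch assumes a 32-bit float; the 1.0 example in the note below additionally assumes IEEE 754.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a 32-bit integer as a float by copying the bytes.
// Assumption: float is 32 bits wide, as in the answer above.
float bits_to_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    float f;
    std::memcpy(&f, &bits, sizeof f);  // same bytes, different type
    return f;
}
```

Calling bits_to_float(0) yields 0.0f, matching the union example. On an IEEE 754 platform, bits_to_float(0x3F800000u) yields 1.0f, which shows that the integer value and the float value of the same bits are generally unrelated.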

Now to take on some of your more complex union code:

enum Enums { k1, k2, k3, k4 };

union MYUnion { 
    struct U { 
        char P;
    } u;

    struct U0 { 
        char state; 
    } u0; 

    struct U1 { 
        Enums e; 
        char c; 
        int v1; 
    } u1; 
};  // remaining members omitted

All these structs overlap in the same way the int and float did above. If we assume that Enums is stored as an int, then the enumerators map to int values in the underlying memory according to the usual enum rules:

 enum Enums { k1/*0*/, k2/*1*/, k3/*2*/, k4/*3*/ };

So then what we have is

union MYUnion { 
    struct U { 
        char P;
    } u;

    struct U0 { 
        char state; 
    } u0; 

    struct U1 { 
        int e;      // the enum, viewed as the int it is stored as
        char c; 
        int v1; 
    } u1; 
};  // remaining members omitted

And you have a very strange union because if you do

MYUnion m;
m.u.P = 'h';

When you later read the enum (most likely an int under the hood), you will probably get an invalid value. P is just 1 byte while the int is 4 bytes, so only one of the four bytes was written and the other three hold whatever was there before. Read as the enum, the result is unpredictable.
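
A small sketch of that effect, again using std::memcpy instead of reading through the union. The function name is made up, and the 0xFF fill is only a stand-in for whatever stale bytes happen to be in memory.

```cpp
#include <cassert>
#include <cstring>

// Write a single char into int-sized storage and read the whole thing as an int.
int int_after_one_byte_write(char value) {
    unsigned char storage[sizeof(int)];
    std::memset(storage, 0xFF, sizeof storage);      // stand-in for stale bytes
    storage[0] = static_cast<unsigned char>(value);  // like m.u.P = value
    int result;
    std::memcpy(&result, storage, sizeof result);    // like reading the int-sized enum
    return result;
}
```

Because only one byte was set, the int read back is not the character's value. What it is depends on the stale bytes and on the byte order of the machine.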

I highly suggest you go sack whoever is responsible for this code.

分分钟 2024-08-19 05:39:08

As Murali said, the size of the union will be that of the largest struct that participates in the union.

The compiler reserves enough bytes for the largest member. The members map onto that memory like this:

Consider the following:

union Foo
{
  struct A
  {
    int x;
    unsigned char y;
    unsigned char z;
  } sa;  // member names sa and sb are illustrative
  struct B
  {
    unsigned char a;
    unsigned char b;
    unsigned char c;
    unsigned char d;
    unsigned char e;
  } sb;
};

In this case, assuming that int is 32 bits (which depends on your target platform), a, b, c and d provide access to the bytes that make up the integer x. Writing to a overwrites the first (lowest-addressed) byte of x, writing to b overwrites the second byte, and so forth.

Conversely, writing a value to x affects a, b, c and d.

The unsigned chars y and e occupy the same byte (again, assuming int is 32 bits), so .y and .e are effectively aliases for each other.

The unsigned char z in struct A does not overlap any element of struct B, so it is effectively immune to changes made through B.

The point here is that the elements of the unioned structs occupy the same memory. The different structs provide different ways to read and write the same memory by letting you use different datatypes.
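
The overlap described above can be checked with offsetof. This is a sketch that assumes a 4-byte int; the member names sa and sb and the helper function names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>

union Foo {
    struct A { int x; unsigned char y; unsigned char z; } sa;
    struct B { unsigned char a, b, c, d, e; } sb;
};

// y and e alias each other when they sit at the same offset.
bool y_aliases_e() {
    return offsetof(Foo::A, y) == offsetof(Foo::B, e);
}

// z lies past the end of B, so nothing in B overlaps it.
bool z_is_outside_b() {
    return offsetof(Foo::A, z) >= sizeof(Foo::B);
}

// a is the first byte of x: both start at offset 0 of their struct.
bool a_starts_where_x_starts() {
    return offsetof(Foo::A, x) == 0 && offsetof(Foo::B, a) == 0;
}
```

With a 4-byte int, y and e both land at offset 4 and z at offset 5, one byte past the 5-byte struct B. With a different int size the first two checks would no longer hold.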

节枝 2024-08-19 05:39:08

What the block of memory looks like also depends on your target CPU (big-endian or little-endian, and so on), but in general each struct starts at the same address and is laid out over the same area. That is useful if you know what it is going to do, and a shot in the foot otherwise.
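
As a rough illustration of the byte-order point, the sketch below looks at the lowest-addressed byte of a 32-bit integer, which is the byte a char member at the start of a union would overlap. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Value of the lowest-addressed byte of a 32-bit integer.
unsigned lowest_addressed_byte(std::uint32_t value) {
    unsigned char first;
    std::memcpy(&first, &value, 1);  // copy only the first byte in memory
    return first;
}

// True when the least significant byte is stored first (little-endian).
bool is_little_endian() {
    return lowest_addressed_byte(1u) == 1u;
}
```

For the value 0x11223344, a little-endian machine stores 0x44 first and a big-endian machine stores 0x11 first, so the same union read gives different answers on the two.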

我要还你自由 2024-08-19 05:39:08

Probably because unions aren't a great feature. They can be useful for sharing memory when you are extremely constrained, for non-portable memory tricks (writing into a variable of one size and reading from another), or for emulating tagged unions.

The tagged union pattern works like this.

enum mytags { FULL_TIME_WORKER, PART_TIMER, CONTRACTOR };

struct worker_t {
    enum mytags tag;   // says which union member is currently valid
    union {
        struct full_time_worker_t full_time_worker;
        struct part_time_worker_t part_time_worker;
        struct contractor_t contractor;
    };
};


bool person_can_do_x(worker_t w)
{
    switch(w.tag)   // only touch the member that matches the tag
    {
    case FULL_TIME_WORKER:
    ...
    case PART_TIMER:
    ...
    case CONTRACTOR:
       ...w.contractor.....
    default:
        return true;
    }
}
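
For a version that compiles on its own, here is a minimal sketch of the same tagged union pattern. The payload structs and their fields (salary, hours_per_week, day_rate) and both function names are invented for illustration; they are not part of the answer above.

```cpp
#include <cassert>

enum WorkerTag { FULL_TIME_WORKER, PART_TIMER, CONTRACTOR };

// Hypothetical payloads, one per kind of worker.
struct FullTimeWorker { int salary; };
struct PartTimer      { int hours_per_week; };
struct Contractor     { int day_rate; };

struct Worker {
    WorkerTag tag;  // records which union member is currently valid
    union {
        FullTimeWorker full_time;
        PartTimer      part_time;
        Contractor     contractor;
    };
};

// Build a contractor: set the tag and the matching member together.
Worker make_contractor(int day_rate) {
    Worker w;
    w.tag = CONTRACTOR;
    w.contractor.day_rate = day_rate;
    return w;
}

// Read only the member that the tag says is active.
int active_amount(const Worker& w) {
    switch (w.tag) {
    case FULL_TIME_WORKER: return w.full_time.salary;
    case PART_TIMER:       return w.part_time.hours_per_week;
    case CONTRACTOR:       return w.contractor.day_rate;
    }
    return 0;  // unreachable for valid tags
}
```

Keeping the tag next to the union is what makes the pattern safe: code checks the tag first and never reads a member that was not the last one written. In C++17 and later, std::variant offers the same idea with the bookkeeping done by the library.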