When would anyone use a union? Is it a remnant from the C-only days?

Posted on 2024-10-14 02:16:12

I have learned but don't really get unions. Every C or C++ text I go through introduces them (sometimes in passing), but they tend to give very few practical examples of why or where to use them. When would unions be useful in a modern (or even legacy) case? My only two guesses would be programming microprocessors when you have very limited space to work with, or when you're developing an API (or something similar) and you want to force the end user to have only one instance of several objects/types at one time. Are these two guesses even close to right?

Comments (18)

像极了他 2024-10-21 02:16:12

Unions are useful when dealing with byte-level (low level) data.

One recent use of mine was modeling IP addresses, as shown below:

// Composite structure for IP address storage
union
{
    // IPv4 @ 32-bit identifier
    // Padded 12-bytes for IPv6 compatibility
    union
    {
        struct
        {
            unsigned char _reserved[12];
            unsigned char _IpBytes[4];
        } _Raw;

        struct
        {
            unsigned char _reserved[12];
            unsigned char _o1;
            unsigned char _o2;
            unsigned char _o3;
            unsigned char _o4;    
        } _Octet;    
    } _IPv4;

    // IPv6 @ 128-bit identifier
    // Next generation internet addressing
    union
    {
        struct
        {
            unsigned char _IpBytes[16];
        } _Raw;

        struct
        {
            unsigned short _w1;
            unsigned short _w2;
            unsigned short _w3;
            unsigned short _w4;
            unsigned short _w5;
            unsigned short _w6;
            unsigned short _w7;
            unsigned short _w8;   
        } _Word;
    } _IPv6;
} _IP;
盛夏已如深秋| 2024-10-21 02:16:12

Unions provide polymorphism in C.

风吹雨成花 2024-10-21 02:16:12

An example when I've used a union:

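class Vector
{
    // Anonymous structs inside a union are a widely supported compiler
    // extension in C++ (they are standard in C11).
    union
    {
        double _coord[3];
        struct
        {
            double _x;
            double _y;
            double _z;
        };
    };
    // ...
};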

This allows me to access my data either as an array or as individual elements.

I've also used a union to make different terms refer to the same value. In image processing, whether I was working with columns, width, or the size in the X direction, things could become confusing. To alleviate this problem, I use a union so I know which descriptions go together.

    union {   // dimension from left to right
        uint32_t            m_width;
        uint32_t            m_sizeX;
        uint32_t            m_columns;
    };

    union {   // dimension from top to bottom
        uint32_t            m_height;
        uint32_t            m_sizeY;
        uint32_t            m_rows;
    };
时光暖心i 2024-10-21 02:16:12

The union keyword, while still used in C++03 [1], is mostly a remnant of the C days. The most glaring issue is that it only works with POD types [1].

The idea of the union, however, is still present, and indeed the Boost libraries feature a union-like class:

boost::variant<std::string, Foo, Bar>

It has most of the benefits of a union (if not all) and adds:

  • the ability to correctly use non-POD types
  • static type safety

In practice, it has been shown to be equivalent to a combination of union + enum, and benchmarks showed it to be just as fast (whereas boost::any is more in the realm of dynamic_cast, since it uses RTTI).

[1] Unions were upgraded in C++11 (unrestricted unions) and can now contain objects with destructors, although the user has to invoke the destructor manually (on the currently active union member). It's still much easier to use variants.

天涯沦落人 2024-10-21 02:16:12

One example is in the embedded realm, where each bit of a register may mean something different. For example, a union of an 8-bit integer and a structure with 8 separate 1-bit bitfields allows you to either change one bit or the entire byte.
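
As a rough illustration of the register idea above, here is a minimal sketch in C. The type name reg8_t, the member names, and the helper function are all hypothetical. Bit-fields of type uint8_t and the order of bits within the byte are implementation-defined, so real register layouts must be checked against the compiler and hardware documentation.

```c
#include <stdint.h>

/* Hypothetical 8-bit register overlay; names are illustrative only.
   Bit-fields declared with uint8_t are implementation-defined in ISO C,
   though GCC and Clang accept them. */
typedef union {
    uint8_t byte;               /* whole-register view */
    struct {
        uint8_t b0 : 1;
        uint8_t b1 : 1;
        uint8_t b2 : 1;
        uint8_t b3 : 1;
        uint8_t b4 : 1;
        uint8_t b5 : 1;
        uint8_t b6 : 1;
        uint8_t b7 : 1;
    } bits;                     /* single-bit view */
} reg8_t;

/* Sets all bits through the byte view, then clears one bit through the
   bit view, and returns the resulting byte. Which bit position b3 maps
   to depends on the implementation. */
static uint8_t set_all_then_clear_b3(void)
{
    reg8_t r = { 0 };
    r.byte = 0xFF;
    r.bits.b3 = 0;
    return r.byte;
}
```

Whatever the bit order, exactly one of the eight bits ends up cleared, which is the point of the technique: the same storage can be updated as a whole or one flag at a time.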

隔纱相望 2024-10-21 02:16:12

Herb Sutter wrote in GOTW about six years ago, with emphasis added:

"But don't think that unions are only a holdover from earlier times. Unions are perhaps most useful for saving space by allowing data to overlap, and this is still desirable in C++ and in today's modern world. For example, some of the most advanced C++ standard library implementations in the world now use just this technique for implementing the "small string optimization," a great optimization alternative that reuses the storage inside a string object itself: for large strings, space inside the string object stores the usual pointer to the dynamically allocated buffer and housekeeping information like the size of the buffer; for small strings, the same space is instead reused to store the string contents directly and completely avoid any dynamic memory allocation. For more about the small string optimization (and other string optimizations and pessimizations in considerable depth), see... ."

And for a less useful example, see the long but inconclusive question gcc, strict-aliasing, and casting through a union.
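
To make the small string optimization from the quote more concrete, here is a minimal sketch in C. It is not how any particular standard library implements it: the type sstring, the capacity SMALL_CAP, and all function names are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define SMALL_CAP 16   /* assumed inline capacity, including the NUL */

typedef struct {
    size_t len;                   /* length, not counting the NUL */
    union {
        char  small[SMALL_CAP];   /* used when len < SMALL_CAP */
        char *heap;               /* used otherwise; owned, from malloc */
    } u;
} sstring;

/* Copies text into s. Returns 0 on success, -1 if allocation fails. */
static int sstring_init(sstring *s, const char *text)
{
    s->len = strlen(text);
    if (s->len < SMALL_CAP) {
        memcpy(s->u.small, text, s->len + 1);   /* no allocation at all */
        return 0;
    }
    s->u.heap = malloc(s->len + 1);
    if (s->u.heap == NULL)
        return -1;
    memcpy(s->u.heap, text, s->len + 1);
    return 0;
}

static int sstring_is_small(const sstring *s)
{
    return s->len < SMALL_CAP;
}

static const char *sstring_cstr(const sstring *s)
{
    return sstring_is_small(s) ? s->u.small : s->u.heap;
}

/* Releases heap storage, if any, and resets s to the empty string. */
static void sstring_free(sstring *s)
{
    if (!sstring_is_small(s))
        free(s->u.heap);
    s->len = 0;
    s->u.small[0] = '\0';
}
```

Here the length doubles as the discriminator: it tells every function which union member is currently meaningful, so no separate tag field is needed.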

思念绕指尖 2024-10-21 02:16:12

Well, one example use case I can think of is this:

typedef union
{
    struct
    {
        uint8_t a;
        uint8_t b;
        uint8_t c;
        uint8_t d;
    };
    uint32_t x;
} some32bittype;

You can then access the 8-bit separate parts of that 32-bit block of data; however, prepare to potentially be bitten by endianness.

This is just one hypothetical example, but whenever you want to split data in a field into component parts like this, you could use a union.

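That said, there is also an endian-safe method, for example:

uint32_t x = 0x11223344;             /* example value */
uint8_t a = (x & 0xFF000000) >> 24;  /* most significant byte: 0x11 */

This works because shifts and masks operate on the value of x rather than on its layout in memory, so the result is the same regardless of the machine's endianness.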
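
The contrast between the two approaches can be sketched as follows. The function names are hypothetical; the first helper uses shifts and gives the same answer everywhere, while the second looks at memory through a union and depends on byte order.

```c
#include <stdint.h>

/* Returns byte `index` of x, where index 0 is the most significant byte.
   Valid for index 0..3. Works on the value, so it is endian-independent. */
static uint8_t byte_from_msb(uint32_t x, unsigned index)
{
    return (uint8_t)((x >> (24u - 8u * index)) & 0xFFu);
}

/* Returns whichever byte is stored first in memory. This is the
   endian-dependent view that the union-based approach exposes. */
static uint8_t first_byte_in_memory(uint32_t x)
{
    union { uint32_t word; uint8_t bytes[4]; } u;
    u.word = x;
    return u.bytes[0];
}
```

For 0x11223344, byte_from_msb(x, 0) is always 0x11, whereas first_byte_in_memory(x) is 0x44 on a little-endian machine and 0x11 on a big-endian one.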

无法回应 2024-10-21 02:16:12

Some uses for unions:

  • Provide a general endianness interface to an unknown external host.
  • Manipulate foreign CPU architecture floating point data, such as accepting VAX G_FLOATS from a network link and converting them to IEEE 754 long reals for processing.
  • Provide straightforward bit twiddling access to a higher-level type.
    union {
        unsigned char   byte_v[16];
        long double     ld_v;
    };

With this declaration, it is simple to display the hex byte values of a long double, change the exponent's sign, determine if it is a denormal value, or implement long double arithmetic for a CPU which does not support it, etc.

  • Saving storage space when fields are dependent on certain values:

    class person {  
        string name;  
    
        char gender;   // M = male, F = female, O = other  
        union {  
            date  vasectomized;  // for males  
            int   pregnancies;   // for females  
        } gender_specific_data;
    };
    
  • Grep the include files for use with your compiler. You'll find dozens to hundreds of uses of union:

    [wally@zenetfedora ~]$ cd /usr/include
    [wally@zenetfedora include]$ grep -w union *
    a.out.h:  union
    argp.h:   parsing options, getopt is called with the union of all the argp
    bfd.h:  union
    bfd.h:  union
    bfd.h:union internal_auxent;
    bfd.h:  (bfd *, struct bfd_symbol *, int, union internal_auxent *);
    bfd.h:  union {
    bfd.h:  /* The value of the symbol.  This really should be a union of a
    bfd.h:  union
    bfd.h:  union
    bfdlink.h:  /* A union of information depending upon the type.  */
    bfdlink.h:  union
    bfdlink.h:       this field.  This field is present in all of the union element
    bfdlink.h:       the union; this structure is a major space user in the
    bfdlink.h:  union
    bfdlink.h:  union
    curses.h:    union
    db_cxx.h:// 4201: nameless struct/union
    elf.h:  union
    elf.h:  union
    elf.h:  union
    elf.h:  union
    elf.h:typedef union
    _G_config.h:typedef union
    gcrypt.h:  union
    gcrypt.h:    union
    gcrypt.h:    union
    gmp-i386.h:  union {
    ieee754.h:union ieee754_float
    ieee754.h:union ieee754_double
    ieee754.h:union ieee854_long_double
    ifaddrs.h:  union
    jpeglib.h:  union {
    ldap.h: union mod_vals_u {
    ncurses.h:    union
    newt.h:    union {
    obstack.h:  union
    pi-file.h:  union {
    resolv.h:   union {
    signal.h:extern int sigqueue (__pid_t __pid, int __sig, __const union sigval __val)
    stdlib.h:/* Lots of hair to allow traditional BSD use of `union wait'
    stdlib.h:  (__extension__ (((union { __typeof(status) __in; int __i; }) \
    stdlib.h:/* This is the type of the argument to `wait'.  The funky union
    stdlib.h:   causes redeclarations with either `int *' or `union wait *' to be
    stdlib.h:typedef union
    stdlib.h:    union wait *__uptr;
    stdlib.h:  } __WAIT_STATUS __attribute__ ((__transparent_union__));
    thread_db.h:  union
    thread_db.h:  union
    tiffio.h:   union {
    wchar.h:  union
    xf86drm.h:typedef union _drmVBlank {
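
The byte_v / ld_v union from the list above can be put to work roughly like this. This is a sketch only: the type name ld_bytes and the function ld_to_hex are hypothetical, the array is sized with sizeof instead of a fixed 16, and the bytes shown (including any padding bytes inside a long double) depend on the platform.

```c
#include <stdio.h>
#include <stddef.h>

typedef union {
    long double   ld_v;
    unsigned char byte_v[sizeof(long double)];
} ld_bytes;

/* Writes the object representation of value as lowercase hex into out,
   which must hold 2 * sizeof(long double) + 1 chars.
   Returns the number of bytes that were formatted. */
static size_t ld_to_hex(long double value, char *out)
{
    ld_bytes u;
    u.ld_v = value;
    for (size_t i = 0; i < sizeof u.byte_v; ++i)
        sprintf(out + 2 * i, "%02x", (unsigned)u.byte_v[i]);
    return sizeof u.byte_v;
}
```

Reading byte_v after writing ld_v is permitted in C because unsigned char may alias any object; what the bytes mean is up to the floating-point format in use.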
    
三月梨花 2024-10-21 02:16:12

I find C++ unions pretty cool. It seems that people usually only think of the use case where one wants to change the value of a union instance "in place" (which, it seems, serves only to save memory or perform doubtful conversions).

In fact, unions can be of great power as a software engineering tool, even when you never change the value of any union instance.

Use case 1: the chameleon

With unions, you can group a number of arbitrary classes under one name, which is not unlike the case of a base class and its derived classes. What changes, however, is what you can and can't do with a given union instance:

struct Batman { /* ... */ };       // union members must be complete types
struct BaseballBat { /* ... */ };

union Bat
{
    Batman brucewayne;
    BaseballBat club;
};

ReturnType1 f(void)
{
    BaseballBat bb = {/* */};
    Bat b;
    b.club = bb;
    // do something with b.club
}

ReturnType2 g(Bat& b)
{
    // do something with b, but how do we know what's inside?
}

Bat returnsBat(void);
ReturnType3 h(void)
{
    Bat b = returnsBat();
    // do something with b, but how do we know what's inside?
}

It appears that the programmer has to be certain of the type of the content of a given union instance when he wants to use it. It is the case in function f above. However, if a function were to receive a union instance as a passed argument, as is the case with g above, then it wouldn't know what to do with it. The same applies to functions returning a union instance, see h: how does the caller know what's inside?

If a union instance never gets passed as an argument or as a return value, then it's bound to have a very monotonous life, with spikes of excitement when the programmer chooses to change its content:

Batman bm = {/* */};
BaseballBat bb = {/* */};
Bat b;
b.brucewayne = bm;
// stuff
b.club = bb;

And that's the most (un)popular use case of unions. Another use case is when a union instance comes along with something that tells you its type.

Use case 2: "Nice to meet you, I'm object, from Class"

Suppose a programmer elected to always pair a union instance with a type descriptor (I'll leave it to the reader to imagine an implementation of such an object). This defeats the purpose of the union itself if what the programmer wants is to save memory and the size of the type descriptor is not negligible relative to that of the union. But let's suppose it's crucial that the union instance can be passed as an argument or returned as a value without the callee or caller knowing what's inside.

Then the programmer has to write a switch control flow statement to tell Bruce Wayne apart from a wooden stick, or something equivalent. It's not too bad when there are only two types of contents in the union but obviously, the union doesn't scale anymore.

Use case 3:

As the authors of a recommendation for the ISO C++ Standard put it back in 2008,

Many important problem domains require either large numbers of objects or limited memory resources. In these situations conserving space is very important, and a union is often a perfect way to do that. In fact, a common use case is the situation where a union never changes its active member during its lifetime. It can be constructed, copied, and destructed as if it were a struct containing only one member. A typical application of this would be to create a heterogeneous collection of unrelated types which are not dynamically allocated (perhaps they are in-place constructed in a map, or members of an array).

And now, an example, with a UML class diagram:

[Figure: UML class diagram showing many compositions for class A]

The situation in plain English: an object of class A can have objects of any class among B1, ..., Bn, and at most one of each type, with n being a pretty big number, say at least 10.

We don't want to add fields (data members) to A like so:

private:
    B1 b1;
    .
    .
    .
    Bn bn;

because n might vary (we might want to add Bx classes to the mix), and because this would cause a mess with constructors and because A objects would take up a lot of space.

We could use a wacky container of void* pointers to Bx objects with casts to retrieve them, but that's fugly and so C-style... but more importantly that would leave us with the lifetimes of many dynamically allocated objects to manage.

Instead, what can be done is this:

union Bee
{
    B1 b1;
    .
    .
    .
    Bn bn;
};

enum BeesTypes { TYPE_B1, ..., TYPE_BN };

class A
{
private:
    std::unordered_map<int, Bee> data; // C++11, otherwise use std::map

public:
    Bee get(int); // the implementation is obvious: get from the unordered map
};

Then, to get the content of a union instance from data, you use a.get(TYPE_B2).b2 and the likes, where a is a class A instance.

This is all the more powerful since unions are unrestricted in C++11. See the document linked to above or this article for details.

胡大本事 2024-10-21 02:16:12

Unions are usually used in the company of a discriminator: a variable indicating which of the fields of the union is valid. For example, let's say you want to create your own Variant type:

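/* Discriminator values. VAR_FLOAT and VAR_INT are used below;
   the other names are illustrative. */
enum { VAR_CHAR, VAR_SHORT, VAR_INT, VAR_LONG, VAR_FLOAT, VAR_DOUBLE, VAR_PTR };

struct my_variant_t {
    int type;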
    union {
        char char_value;
        short short_value;
        int int_value;
        long long_value;
        float float_value;
        double double_value;
        void* ptr_value;
    };
};

You would then use it like this:

/* construct a new float variant instance */
void init_float(struct my_variant_t* v, float initial_value) {
    v->type = VAR_FLOAT;
    v->float_value = initial_value;
}

/* Increments the value of the variant by the given int */
void inc_variant_by_int(struct my_variant_t* v, int n) {
    switch (v->type) {
    case VAR_FLOAT:
        v->float_value += n;
        break;

    case VAR_INT:
        v->int_value += n;
        break;
    /* ... handle the remaining types ... */
    }
}

This is actually a pretty common idiom, especially in Visual Basic internals.

For a real example, see SDL's SDL_Event union. There is a type field at the top of the union, and the same field is repeated in every SDL_*Event struct. To handle an event correctly, you check the value of the type field.

The benefits are simple: there is one single data type to handle all event types without using unnecessary memory.
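
The layout described for SDL_Event, where the union itself and every member struct start with the same type field, can be sketched like this. The types and constants below are hypothetical and only loosely modeled on that pattern; they are not SDL's actual definitions.

```c
#include <stdint.h>

/* Hypothetical event kinds (not SDL's constants). */
enum { EV_KEY = 1, EV_MOUSE = 2 };

/* Every event struct starts with the same `type` member. */
typedef struct { uint32_t type; int keycode; } key_event;
typedef struct { uint32_t type; int x, y; }    mouse_event;

typedef union {
    uint32_t    type;    /* overlaps the `type` member of each struct */
    key_event   key;
    mouse_event mouse;
} event;

/* Dispatches on the shared type field and returns a value taken from
   the matching payload, or -1 for an unknown event type. */
static int event_value(const event *e)
{
    switch (e->type) {
    case EV_KEY:   return e->key.keycode;
    case EV_MOUSE: return e->mouse.x + e->mouse.y;
    default:       return -1;
    }
}
```

Because the type field sits at offset zero in every member, a handler can inspect e->type before it knows which struct the event actually holds.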

想念有你 2024-10-21 02:16:12

The importance of unions, already elevated, has been boosted further by the strict aliasing rule of the C standard.

You can use unions to do type punning without violating the C standard.
This program has unspecified behavior (because I have assumed that float and unsigned int have the same size) but not undefined behavior (see here).

#include <stdio.h> 

union float_uint
{
    float f;
    unsigned int ui;
};

int main()
{
    float v = 241;
    union float_uint fui = {.f = v};

    //May trigger UNSPECIFIED BEHAVIOR but not UNDEFINED BEHAVIOR 
    printf("Your IEEE 754 float sir: %08x\n", fui.ui);

    //This is UNDEFINED BEHAVIOR as it violates the Strict Aliasing Rule
    unsigned int* pp = (unsigned int*) &v;

    printf("Your IEEE 754 float, again, sir: %08x\n", *pp);

    return 0;
}
琴流音 2024-10-21 02:16:12

Let's say you have n different types of configurations (each just a set of variables defining parameters). By using an enumeration of the configuration types, you can define a structure that holds the ID of the configuration type along with a union of all the different configuration types.

This way, any code that receives the configuration can use the ID to determine how to interpret the configuration data, and even if the configurations are huge, you are not forced to carry a separate structure for each potential type, wasting space.
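
A minimal sketch of that arrangement in C might look like the following. The two configuration kinds, their fields, and the function name are invented for illustration.

```c
/* Hypothetical configuration kinds. */
typedef enum { CFG_SERIAL, CFG_NETWORK } config_kind;

typedef struct { int baud; char parity; }              serial_config;
typedef struct { char host[64]; unsigned short port; } network_config;

typedef struct {
    config_kind kind;            /* the ID: which union member is valid */
    union {
        serial_config  serial;
        network_config network;
    } as;
} config;

/* Interprets the payload according to the ID.
   Returns the baud rate or the port, or -1 for an unknown kind. */
static int config_primary_number(const config *c)
{
    switch (c->kind) {
    case CFG_SERIAL:  return c->as.serial.baud;
    case CFG_NETWORK: return c->as.network.port;
    }
    return -1;
}
```

The union is only as large as its largest member, so a config costs less than a struct that carries every kind side by side.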

挽手叙旧 2024-10-21 02:16:12

Unions provide a way to manipulate different kinds of data in a single area of storage without embedding any machine-dependent information in the program. They are analogous to variant records in Pascal.

As an example such as might be found in a compiler symbol table manager, suppose that a constant may be an int, a float, or a character pointer. The value of a particular constant must be stored in a variable of the proper type, yet it is most convenient for table management if the value occupies the same amount of storage and is stored in the same place regardless of its type. This is the purpose of a union: a single variable that can legitimately hold any one of several types. The syntax is based on structures:

union u_tag {
     int ival;
     float fval;
     char  *sval;
} u;

The variable u will be large enough to hold the largest of the three types; the specific size is implementation-dependent. Any of these types may be assigned to u and then used in expressions, so long as the usage is consistent: the type retrieved must be the type most recently stored.

时光匆匆的小流年 2024-10-21 02:16:12

I would like to add one good practical example of using a union: implementing a formula calculator/interpreter, or using something like one in a computation (for example, when parts of your formulas must be modifiable at run time, as when solving an equation numerically).
So you may want to define numbers/constants of different types (integer, floating-point, even complex numbers) like this:

struct Number {
    // float and double are keywords, so the enumerators are upper-case
    enum NumType { INT32, FLOAT, DOUBLE, COMPLEX };
    NumType num_t;
    union { int ival; float fval; double dval; ComplexNumber cmplx_val; };
};

So you save memory and, more importantly, you avoid dynamic allocation of a potentially huge number of small objects (if you use many run-time-defined numbers), compared to implementations based on class inheritance/polymorphism. More interestingly, you can still use the power of C++ polymorphism with this type of struct (if you're a fan of double dispatch, for example ;). Just add a "dummy" interface pointer to the parent class of all number types as a field of this struct, pointing to this instance instead of, or in addition to, the raw type tag, or use good old C function pointers.

struct NumberBase
{
    virtual NumberBase* Add(NumberBase* n);
    // ...
};

struct NumberInt;   // defined below

struct Number : NumberBase
{
    union { int ival; float fval; double dval; ComplexNumber cmplx_val; };
    NumberBase* num_t;
    void Set(int a);
};

struct NumberInt : Number
{
    // implement methods assuming Number's union contains int
    NumberBase* Add(NumberBase* n);
    // ...
};

struct NumberDouble : Number
{
    // implement methods assuming Number's union contains double
    NumberBase* Add(NumberBase* n);
    // ...
};
// etc. for all number types, or use templates

void Number::Set(int a)
{
    ival = a;
    // still kind of a hack; hope it works because derived classes of Number don't add any fields
    num_t = static_cast<NumberInt*>(this);
}

So you can use polymorphism instead of type checks with switch(type), with a memory-efficient implementation (no dynamic allocation of small objects), if you need it, of course.

二智少女猫性小仙女 2024-10-21 02:16:12

From http://cplus.about.com/od/learningc/ss/lowlevel_9.htm:

The uses of union are few and far between. On most computers, the size
of a pointer and an int are usually the same- this is because both
usually fit into a register in the CPU. So if you want to do a quick
and dirty cast of a pointer to an int or the other way, declare a
union.

union intptr {
    int i;
    int *p;
};
union intptr x;
x.i = 1000;
/* puts 90 at location 1000 */
*(x.p) = 90;

Another use of a union is in a command or message protocol where
different size messages are sent and received. Each message type will
hold different information but each will have a fixed part (probably a
struct) and a variable part bit. This is how you might implement it..

struct head { int id; int response; int size; };
struct msgstring50 { struct head fixed; char message[50]; };
struct msgstring80 { struct head fixed; char message[80]; };
struct msgint10 { struct head fixed; int message[10]; };
struct msgack { struct head fixed; int ok; };
union messagetype {
    struct msgstring50 m50;
    struct msgstring80 m80;
    struct msgint10 i10;
    struct msgack ack;
};

In practice, although the unions are the same size, it makes sense to
only send the meaningful data and not wasted space. A msgack is just
16 bytes in size while a msgstring80 is 92 bytes. So when a
messagetype variable is initialized, it has its size field set
according to which type it is. This can then be used by other
functions to transfer the correct number of bytes.

夜吻♂芭芘 2024-10-21 02:16:12

In the earliest days of C (e.g. as documented in 1974), all structures shared a common namespace for their members. Each member name was associated with a type and an offset; if "wd_woozle" was an "int" at offset 12, then given a pointer p of any structure type, p->wd_woozle would be equivalent to *(int*)(((char*)p)+12). The language required that all members of all structure types have unique names, except that it explicitly allowed reuse of member names in cases where every struct that used them treated them as part of a common initial sequence.

The fact that structure types could be used promiscuously made it possible to have structures behave as though they contained overlapping fields. For example, given definitions:

struct float1 { float f0;};
struct byte4  { char b0,b1,b2,b3; }; /* Unsigned didn't exist yet */

code could declare a structure of type "float1" and then use "members" b0...b3 to access the individual bytes therein. When the language was changed so that each structure would receive a separate namespace for its members, code which relied upon the ability to access things multiple ways would break. The value of separating the namespaces of different structure types was sufficient to require that such code be changed to accommodate it, but the value of such techniques was sufficient to justify extending the language to continue supporting them.

Code which had been written to exploit the ability to access the storage within a struct float1 as though it were a struct byte4 could be made to work in the new language by adding a declaration: union f1b4 { struct float1 ff; struct byte4 bb; };, declaring objects as type union f1b4 rather than struct float1, and replacing accesses to f0, b0, b1, etc. with ff.f0, bb.b0, bb.b1, etc. While there are better ways such code could have been supported, the union approach was at least somewhat workable, at least with C89-era interpretations of the aliasing rules.

浮世清欢 2024-10-21 02:16:12

From the Wikipedia article on unions:

The primary usefulness of a union is
to conserve space, since it provides a
way of letting many different types be
stored in the same space. Unions also
provide crude polymorphism. However,
there is no checking of types, so it
is up to the programmer to be sure
that the proper fields are accessed in
different contexts. The relevant field
of a union variable is typically
determined by the state of other
variables, possibly in an enclosing
struct.

One common C programming idiom uses
unions to perform what C++ calls a
reinterpret_cast, by assigning to one
field of a union and reading from
another, as is done in code which
depends on the raw representation of
the values.

拥抱我好吗 2024-10-21 02:16:12

A brilliant usage of union is memory alignment, which I found in the PCL (Point Cloud Library) source code. A single data structure in the API can target two architectures: CPUs with SSE support as well as CPUs without it. For example, the data structure for PointXYZ is

typedef union
{
  float data[4];
  struct
  {
    float x;
    float y;
    float z;
  };
} PointXYZ;

The 3 floats are padded with an additional float for SSE alignment.
So for

PointXYZ point;

the user can access the x coordinate as either point.data[0] or point.x (depending on SSE support).
Further details and similar uses are described in the PCL documentation on PointT types.
