Unions as a base class
The standard says that unions cannot be used as base classes, but is there a specific reason for this? As far as I understand, unions can have constructors, destructors, member variables, and methods that operate on those variables. In short, a union can encapsulate a data type and state that are accessed through member functions. In most common terms it therefore qualifies as a class, and if it can act as a class, why is it restricted from acting as a base class?
Edit: Although the answers try to explain the reasoning, I still do not understand how a union as a derived class is worse than a union as just a class. In the hope of getting a more concrete answer and reasoning, I will put a bounty on this one. No offence to the answers already posted, and thanks for those!
Bjarne Stroustrup said 'there seems little reason for it' in The Annotated C++ Reference Manual.
The title asks why unions can't be a base class, but the question appears to be about unions as a derived class. So, which is it?
There's no technical reason why unions can't be a base class; it's just not allowed. A reasonable interpretation would be to think of the union as a struct whose members happen to potentially overlap in memory, and consider the derived class as a class that inherits from this (rather odd) struct. If you need that functionality, you can usually persuade most compilers to accept an anonymous union as a member of a struct. Here's an example, that's suitable for use as a base class. (And there's an anonymous struct in the union for good measure.)
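The snippet that originally followed is not preserved here. As a rough sketch of the idea only (the names `Variant32` and `TaggedVariant32` are hypothetical, and the anonymous struct is left out because it is a compiler extension rather than standard C++), a struct holding an anonymous union can serve as a base class:

```cpp
#include <cstdint>

// Hypothetical wrapper: a struct whose only content is an anonymous union.
// The union's members overlap in memory and are accessed as if they were
// direct members of the struct.
struct Variant32 {
    union {
        std::uint32_t word;
        float         real;
    };
};

// Deriving from the wrapper struct is legal, unlike deriving from a union.
struct TaggedVariant32 : Variant32 {
    bool is_real = false;  // records which member was written last

    void set_word(std::uint32_t w) { word = w; is_real = false; }
    void set_real(float f)         { real = f; is_real = true; }
};
```

The derived class sees `word` and `real` as ordinary inherited members, while the overlap stays an implementation detail of the base.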
The rationale for unions as a derived class is probably simpler: the result wouldn't be a union. Unions would have to be the union of all their members, and all of their bases. That's fair enough, and might open up some interesting template possibilities, but you'd have a number of limitations (all bases and members would have to be POD -- and would you be able to inherit twice, because a derived type is inherently non-POD?), this type of inheritance would be different from the other type the language sports (OK, not that this has stopped C++ before) and it's sort of redundant anyway -- the existing union functionality would do just as well.
Stroustrup says this in the D&E book:
(The elision doesn't change the meaning.)
So I imagine the decision is arbitrary, and he just saw no reason to change the union functionality (it works fine as-is with the C subset of C++), and so didn't design any integration with the new C++ features. And when the wind changed, it got stuck that way.
I think you got the answer yourself in your comments on EJP's answer.
I think unions are only included in C++ at all in order to be backwards compatible with C. I guess unions seemed like a good idea in 1970, on systems with tiny memory spaces. By the time C++ came along I imagine unions were already looking less useful.
Given that unions are pretty dangerous anyway, and not terribly useful, the vast new opportunities for creating bugs that inheriting from unions would create probably just didn't seem like a good idea :-)
Here's my guess for C++ 03.
As per §9.5/1 of C++03, unions cannot have virtual functions. The whole point of a meaningful derivation is to be able to override behaviour in the derived class. If a union cannot have virtual functions, there is no point in deriving from a union.
Hence the rule.
You can inherit the data layout of a union using the anonymous union feature from C++11.
In general, it's almost always better not to work with unions directly but to enclose them within a struct or class. Then you can base your inheritance on the outer struct layer and use unions inside it if you need to.
Tony Park gave an answer which is pretty close to the truth. The C++ committee basically didn't think it was worth the effort to make unions a strong part of C++, similarly to the treatment of arrays as legacy stuff we had to inherit from C but didn't really want.
Unions have problems: if we allow non-POD types in unions, how do they get constructed? It can certainly be done, but not necessarily safely, and any consideration would require committee resources. And the final result would be less than satisfactory, because what is really required in a sane language is discriminated unions, and bare C unions could never be elevated to discriminated unions in a way compatible with C (that I can imagine, anyhow).
To elaborate on the technical issues: since you can wrap a POD-component only union in a struct without losing anything, there's no advantage allowing unions as bases. With POD-only union components, there's no problem with explicit constructors simply assigning one of the components, nor with using a bitblit (memcpy) for compiler generated copy constructor (or assignment).
Such unions, however, aren't useful enough to bother with except to retain them so existing C code can be considered valid C++. These POD-only unions are broken in C++ because they fail to retain a vital invariant they possess in C: any data type can be used as a component type.
To make unions useful, we must allow constructable types as members. This is significant because it is not acceptable to merely assign a component in a constructor body, either of the union itself, or any enclosing struct: you cannot, for example, assign a string to an uninitialised string component.
It follows that one must invent some rules for initialising union components with mem-initialisers, for example:
But now the question is: what is the rule? Normally the rule is that you must initialise every member and base of a class. If you do not do so explicitly, the default constructor is used for the remainder, and if a type that is not explicitly initialised has no default constructor, it's an error [exception: copy constructors, where the default is the member copy constructor].
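The snippet that originally followed is not preserved here. As a hedged illustration rather than the author's code: C++11 later permitted unions with non-trivial members, and a union constructor may initialise one member in its mem-initialiser list. The union `NameOrId` below is a hypothetical name chosen for this sketch.

```cpp
#include <string>

// Hypothetical union with one constructible (non-POD) member.
union NameOrId {
    std::string name;
    int         id;

    // Each constructor initialises exactly one member via a mem-initialiser.
    explicit NameOrId(const std::string& s) : name(s) {}
    explicit NameOrId(int i) : id(i) {}

    // The union cannot know which member is active, so the destructor does
    // nothing; the owner must destroy `name` explicitly when it is active.
    ~NameOrId() {}
};
```

A caller that constructed the union from a string is responsible for ending that member's lifetime, for instance with `u.name.~basic_string();`, before the union goes away.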
Clearly this rule can't work for unions: the rule has to be instead: if the union has at least one non-POD member, you must explicitly initialise exactly one member in a constructor. In this case, no default constructor, copy constructor, assignment operator, or destructor will be generated and if any of these members are actually used, they must be explicitly supplied.
So now the question becomes: how would you write, say, a copy constructor? It is, of course, quite possible to do and get right if you design your union the way, say, X-Windows event unions are designed, with the discriminant tag in each component. But you will have to use placement operator new to do it, and you will have to break the rule I wrote above, which appeared at first glance to be correct!
What about a default constructor? If you don't have one of those, you can't declare an uninitialised variable.
There are other cases where you can determine the component externally and use placement new to manage a union externally, but that isn't a copy constructor. The fact is, if you have N components you'd need N constructors, and C++ has a broken idea that constructors use the class name, which leaves you rather short of names and forces you to use phantom types to allow overloading to choose the right constructor... and you can't do that for the copy constructor since its signature is fixed.
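A minimal sketch of such a copy constructor, assuming C++11 unrestricted unions. As a simplification, the tag is kept beside the union in a wrapper class instead of inside each component as in the X event unions. The class `Value` and its members are hypothetical names.

```cpp
#include <new>
#include <string>

// Hypothetical discriminated union: a tag plus an anonymous union.
class Value {
public:
    enum class Kind { Int, Text };

    explicit Value(int i) : kind_(Kind::Int), int_(i) {}
    explicit Value(const std::string& s) : kind_(Kind::Text), text_(s) {}

    // The copy constructor initialises no union member in its
    // mem-initialiser list; it inspects the tag and then builds the
    // right member, using placement new for the non-trivial one.
    Value(const Value& other) : kind_(other.kind_) {
        if (kind_ == Kind::Text)
            new (&text_) std::string(other.text_);
        else
            int_ = other.int_;
    }

    Value& operator=(const Value&) = delete;  // omitted to keep the sketch short

    ~Value() {
        // Only the active member may be destroyed.
        if (kind_ == Kind::Text) text_.~basic_string();
    }

    Kind kind() const { return kind_; }
    // Callers must check kind() first; reading the inactive member is undefined.
    int as_int() const { return int_; }
    const std::string& as_text() const { return text_; }

private:
    Kind kind_;
    union {
        int         int_;
        std::string text_;
    };
};
```

Note that the copy constructor deliberately violates the "initialise exactly one member in the mem-initialiser list" rule proposed above, which is the conflict the answer points out.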
Ok, so are there alternatives? Probably, yes, but they're not so easy to dream up, and harder to convince over 100 people that it's worthwhile to think about in a three day meeting crammed with other issues.
It is a pity the committee did not implement the rule above. Unions are mandatory for aligning arbitrary data, and external management of the components is not really that hard to do manually, and is trivial and completely safe when the code is generated by a suitable algorithm. In other words, the rule is mandatory if you want to use C++ as a compiler target language and still generate readable, portable code. Such unions with constructable members have many uses, but the most important one is to represent the stack frame of a function containing nested blocks: each block has its local data in a struct, and each struct is a union component. There is no need for any constructors or such; the compiler will just use placement new. The union provides alignment and size, and cast-free component access.
[And there is no other conforming way to get the right alignment!]
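The stack-frame use can be sketched as follows, assuming C++11 unrestricted unions (which arrived after this answer's complaint). The names `OuterLocals`, `InnerLocals`, `Frame`, and `run_blocks` are hypothetical, and the two blocks here simply have non-overlapping lifetimes.

```cpp
#include <new>
#include <string>

// Hypothetical per-block local data, one struct per block.
struct OuterLocals { std::string label; int count; };
struct InnerLocals { double total; };

// The union only supplies storage, size, and alignment for whichever
// block is live; it constructs and destroys nothing itself.
union Frame {
    OuterLocals outer;
    InnerLocals inner;
    Frame() {}
    ~Frame() {}
};

static_assert(sizeof(Frame) >= sizeof(OuterLocals), "frame must hold the largest block");
static_assert(alignof(Frame) >= alignof(OuterLocals), "frame must satisfy block alignment");

// Generated-code style: each block's locals are managed with placement new
// and an explicit destructor call, with no casts needed to reach them.
inline double run_blocks() {
    Frame f;

    OuterLocals* o = new (&f.outer) OuterLocals{"outer", 2};
    int n = o->count;
    o->~OuterLocals();               // first block ends

    InnerLocals* i = new (&f.inner) InnerLocals{1.5};
    double result = i->total * n;    // second block reuses the same storage
    i->~InnerLocals();

    return result;
}
```

The member names give typed access to the storage, which is what the answer means by cast-free component access.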
Therefore the answer to your question is: you're asking the wrong question. There's no advantage to POD-only unions being bases, and they certainly can't be derived classes because then they wouldn't be PODs. To make them useful, some time is required to understand why one should follow the principle used everywhere else in C++: missing bits aren't an error unless you try to use them.
A union is a type that can be used as any one of its members, depending on which member has been set; only that member can later be read.
When you derive from a type, the derived type inherits the base type, so the derived type can be used wherever the base type could be. If you could derive from a union, the derived class could be used (not implicitly, but explicitly through naming the member) wherever any of the union members could be used, but among those members only one could be legally accessed. The problem is that the information about which member has been set is not stored in the union.
To avoid this subtle yet dangerous contradiction, which in fact subverts the type system, deriving from a union is not allowed.