Can marshalling or packing be implemented with unions?
Beej's Guide to Network Programming has a section on marshalling (packing) data for serialization, in which he describes various functions for packing and unpacking data (int, float, double, etc.).
It seems easier to use a union as defined below (similar ones can be defined for float and double) and transmit integer.pack as the packed version of integer.i, rather than use the pack and unpack functions.
union _integer {
    char pack[4];
    int i;
} integer;
Can someone shed some light on why a union is a bad choice?
Is there a better method of packing data?
Comments (3)
Different computers may lay the data out differently. The classic issue is endianness (in your example, whether pack[0] holds the most significant or the least significant byte). Using a union like this ties the data to the specific representation on the computer that generated it.
If you want to see other ways to marshal data, check out Boost.Serialization and Google's Protocol Buffers (protobuf).
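One common way to avoid the byte-order dependence described above is to build the byte sequence with shifts, so the wire format is fixed regardless of the host. The following is only a minimal sketch: the helper names `pack_u32_be` and `unpack_u32_be` are hypothetical (they are not taken from Beej's guide), and big-endian is assumed as the wire order.

```cpp
#include <cstdint>

// Hypothetical helpers; the names are illustrative only.
// Write a 32-bit value into buf in big-endian (network) byte order.
// Shifts operate on the value, not on memory, so the result is the
// same on little-endian and big-endian hosts.
inline void pack_u32_be(unsigned char* buf, std::uint32_t v) {
    buf[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
    buf[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
    buf[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
    buf[3] = static_cast<unsigned char>(v & 0xFF);
}

// Rebuild the 32-bit value from four big-endian bytes.
inline std::uint32_t unpack_u32_be(const unsigned char* buf) {
    return (static_cast<std::uint32_t>(buf[0]) << 24) |
           (static_cast<std::uint32_t>(buf[1]) << 16) |
           (static_cast<std::uint32_t>(buf[2]) << 8) |
            static_cast<std::uint32_t>(buf[3]);
}
```

With these, the sender would call `pack_u32_be(buf, value)` and transmit the four bytes, and the receiver would call `unpack_u32_be(buf)`; neither side needs to know the other's native layout.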
The union trick is not guaranteed to work, although it usually does. According to the standard, it is perfectly valid for you to set the char data and then read 0s when you attempt to read the int, or vice versa (in C++, reading a union member other than the one last written is undefined behavior). A union was designed as a memory micro-optimization, not as a replacement for casting.
At this point, you usually either wrap the conversion in a handy object or use reinterpret_cast. Slightly bulky, or ugly... but neither of those is necessarily a bad thing when you're packing data.
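As one possible reading of "wrap up the conversion in a handy object", here is a small sketch. The class name `PackedInt32` and its interface are hypothetical, and it deliberately keeps the bytes in host byte order, so it replaces the union but does not by itself solve portability between machines.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical wrapper (illustrative name and interface).
// Holds the raw bytes of an int32_t in HOST byte order and converts
// with std::memcpy instead of reading an inactive union member.
class PackedInt32 {
public:
    // Capture the bytes of an integer value.
    explicit PackedInt32(std::int32_t value) {
        std::memcpy(bytes_, &value, sizeof bytes_);
    }
    // Capture four bytes received from elsewhere.
    explicit PackedInt32(const unsigned char* raw) {
        std::memcpy(bytes_, raw, sizeof bytes_);
    }
    // Reassemble the integer from the stored bytes.
    std::int32_t value() const {
        std::int32_t v;
        std::memcpy(&v, bytes_, sizeof v);
        return v;
    }
    const unsigned char* data() const { return bytes_; }
    static constexpr std::size_t size() { return sizeof(std::int32_t); }

private:
    unsigned char bytes_[sizeof(std::int32_t)];
};
```

The bytes to send would be `data()` with length `size()`, and the receiving side would construct a `PackedInt32` from the received buffer and call `value()`.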
Why not just do a reinterpret_cast to a char* or a memcpy into a char buffer? They're basically the same thing and less confusing. Your idea would work, so go for it if you want, but I find that clean code is happy code. The easier it is to understand my work, the less likely it is that someone (like my future self) will break it.
Also note that only POD (plain old data) types can be placed in a union (at least before C++11), which puts some limitations on the union approach that aren't there in a more intuitive one.
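The two alternatives mentioned here might look roughly like the sketch below. The function names are hypothetical, and both approaches copy or view the value in host byte order, exactly as the union would, so they address readability rather than endianness.

```cpp
#include <cstring>

// Hypothetical helpers; names are illustrative only.

// memcpy approach: copy the int's bytes into a caller-supplied buffer
// that must hold at least sizeof(int) bytes.
inline void int_to_bytes(int value, unsigned char* out) {
    std::memcpy(out, &value, sizeof value);
}

// memcpy approach in the other direction: rebuild the int from bytes.
inline int int_from_bytes(const unsigned char* in) {
    int value;
    std::memcpy(&value, in, sizeof value);
    return value;
}

// reinterpret_cast approach: view the int's bytes in place.
// Inspecting an object through unsigned char* is permitted by the
// aliasing rules, so no copy is needed just to read the bytes.
inline bool same_bytes(const int& value, const unsigned char* buf) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    return std::memcmp(p, buf, sizeof value) == 0;
}
```

Using `sizeof value` instead of a hard-coded 4 also avoids the assumption, built into the original union, that an int is exactly four bytes.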