Union of long and char[], byte order

Posted on 2024-12-27 16:17:25

If I do the following in code for an 8-bit processor:

typedef union
{
    unsigned long longn;
    unsigned char chars[4];
} longbytes;

Is longbytes.chars[0] always going to be the lowest byte of longbytes.longn, or does it depend on endianness/compiler/platform/target/luck etc.? I've viewed the disassembly of my compiled code and that's how it is in my specific case, but I'm curious whether this code is portable.

Comments (4)

心作怪 2025-01-03 16:17:25

There are several reasons why this is not portable:

  • It depends on the endianness of your platform (or compiler), which determines which byte is stored first, so you can't count on chars[0] addressing the lowest byte.
  • unsigned long is not guaranteed to be exactly as long as 4 chars, so depending on the platform you might not even get the complete long (or sizeof(long) might be smaller than 4 and you would read further, but that's unlikely for 8-bit processors at least).
  • Reading a different union member than the one you wrote to is generally not portable; it is implementation-defined behaviour. The reason for this is basically the combination of the two other issues.

So all in all that code is not portable at all.
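The assumptions listed above can be checked at run time before the union is relied on. The following is only a sketch: the function name longbytes_layout_ok is hypothetical (it is not part of the question), and the probe relies on the C99-and-later rule that reading a union member other than the one last written reinterprets the stored bytes.

```c
#include <limits.h>   /* CHAR_BIT */

typedef union
{
    unsigned long longn;
    unsigned char chars[4];
} longbytes;

/* Hypothetical guard: returns 1 only if every layout assumption holds
   on the current target, 0 otherwise. */
int longbytes_layout_ok(void)
{
    longbytes probe;

    /* chars[4] covers the long exactly only with 8-bit bytes and a 4-byte long. */
    if (CHAR_BIT != 8 || sizeof(unsigned long) != 4) {
        return 0;
    }

    probe.longn = 0x04030201UL;

    /* The values read back depend on the target's byte order;
       this pattern matches only when chars[0] is the lowest byte. */
    return probe.chars[0] == 0x01 && probe.chars[1] == 0x02
        && probe.chars[2] == 0x03 && probe.chars[3] == 0x04;
}
```

A return value of 0 means that at least one of the three points above applies on the current target, so the union should not be used to pick out individual bytes there.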

留一抹残留的笑 2025-01-03 16:17:25

In general, if you ever need to care about endianness you're doing something wrong, and need to work around your problem (e.g. with shifts and masks, or serialisation/de-serialisation).

For example, rather than having a union maybe you should do something like:

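#include <stdint.h>

/* Combine four bytes into one 32-bit value; byte0 is the least significant. */
uint32_t pack(uint8_t byte0, uint8_t byte1, uint8_t byte2, uint8_t byte3) {
    uint32_t result;

    /* Cast before shifting: with a 16-bit int, byte2 << 16 would overflow. */
    result = byte0;
    result |= (uint32_t)byte1 << 8;
    result |= (uint32_t)byte2 << 16;
    result |= (uint32_t)byte3 << 24;
    return result;
}

/* Return byte number byteNumber (0 = least significant) of value. */
uint8_t unpack(int byteNumber, uint32_t value) {
    return (uint8_t)((value >> (byteNumber * 8)) & 0xFFu);
}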
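The serialisation/de-serialisation route mentioned above can be sketched the same way, with a byte buffer in place of separate arguments. The names store_le32 and load_le32 are hypothetical, and choosing least-significant-byte-first as the buffer order is an assumption made for this example; any fixed order works as long as both sides agree on it.

```c
#include <stdint.h>

/* Hypothetical helper: write value into buf, least significant byte first.
   The result is the same on every target, whatever its native byte order. */
void store_le32(uint32_t value, uint8_t buf[4])
{
    int i;

    for (i = 0; i < 4; i++) {
        buf[i] = (uint8_t)((value >> (8 * i)) & 0xFFu);
    }
}

/* Hypothetical helper: rebuild the value from a buffer written by store_le32.
   Each byte is widened to uint32_t before shifting so that the shift is
   performed in 32 bits even where int is only 16 bits wide. */
uint32_t load_le32(const uint8_t buf[4])
{
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}
```

Because the order of the bytes is fixed by the shifts and not by the memory layout, buf[0] is the lowest byte on every platform.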

荒路情人 2025-01-03 16:17:25

It depends on how the platform stores longs internally. Writing to one element of a union and then reading from another is not portable.

一杯敬自由 2025-01-03 16:17:25

The union data structure is portable as long as you only read the member you last wrote. Writing to the unsigned long and then reading the unsigned char[4] (or vice versa) is undefined behavior in C++. In C99 and later the read itself is permitted, but which byte ends up in chars[0] still depends on how the platform represents unsigned long.
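If the goal is only to inspect how a value is laid out in memory, copying its bytes with memcpy avoids reading an inactive union member altogether and is well defined in both C and C++. The sketch below uses hypothetical helper names (bytes_of_ulong, lowest_byte_index); the index it reports still differs between platforms, which is the portability concern raised in this thread.

```c
#include <string.h>   /* memcpy, size_t */

/* Hypothetical helper: copy the in-memory bytes of value into out,
   which must have room for sizeof(unsigned long) bytes. */
void bytes_of_ulong(unsigned long value, unsigned char *out)
{
    memcpy(out, &value, sizeof value);
}

/* Hypothetical helper: position in memory of the least significant byte
   of an unsigned long on this target, or -1 if it cannot be found. */
int lowest_byte_index(void)
{
    unsigned char raw[sizeof(unsigned long)];
    size_t i;

    /* The value 1 has exactly one non-zero byte: the least significant one. */
    bytes_of_ulong(1UL, raw);

    for (i = 0; i < sizeof raw; i++) {
        if (raw[i] != 0) {
            return (int)i;
        }
    }
    return -1;
}
```

An index of 0 corresponds to the layout the question observed in its disassembly; a big-endian target would report the last index.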
