Is using a union in place of a cast well-defined?

Posted 2024-11-10 06:31:49

I had a discussion this morning with a colleague regarding the correctness of a "coding trick" to detect endianness.

The trick was:

bool is_big_endian()
{
  union
  {
    int i;
    char c[sizeof(int)];
  } foo;


  foo.i = 1;
  return (foo.c[0] == 1);
}

To me, this use of a union seems incorrect, because setting one member of the union and then reading another is not well-defined. But I have to admit that this is just a feeling, and I lack actual proof to back up my point.

Is this trick correct? Who is right here?

Comments (5)

那请放手 2024-11-17 06:31:49

Your code is not portable. It might work on some compilers or it might not.

You are right that the behaviour is undefined when you try to access an inactive member of the union, as the given code does.

§9.5/1

In a union, at most one of the data members can be active at any time, that is, the value of at most one of the data members can be stored in a union at any time.

So foo.c[0] == 1 is incorrect because c is not active at that moment. Feel free to correct me if you think I am wrong.

心安伴我暖 2024-11-17 06:31:49

Don't do this; it is better to use something like the following:

#include <arpa/inet.h>
//#include <winsock2.h> // <-- for Windows use this instead

#include <stdint.h>

bool is_big_endian() {
  uint32_t i = 1;
  return i == htonl(i);
}

Explanation:

The htonl function converts a u_long from host to TCP/IP network byte order (which is big-endian).
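To make that explanation concrete, here is a minimal sketch, assuming a POSIX system where `htonl` comes from `<arpa/inet.h>`. The function name `network_order_is_big_endian` and the test value `0x01020304` are illustrative choices, not part of the answer above. It checks that the result of `htonl` is laid out in memory with the most significant byte first, whatever the host byte order is.

```cpp
#include <arpa/inet.h>  // POSIX header; on Windows the answer suggests <winsock2.h>
#include <cstdint>
#include <cstring>

// Hypothetical helper: converts a known value to network byte order and
// inspects its bytes in memory. They should read 01 02 03 04 on any host.
bool network_order_is_big_endian()
{
    std::uint32_t const net = htonl(0x01020304u);
    unsigned char bytes[sizeof(net)];
    std::memcpy(bytes, &net, sizeof(net));  // copy the object representation
    return bytes[0] == 0x01 && bytes[1] == 0x02 &&
           bytes[2] == 0x03 && bytes[3] == 0x04;
}
```

This is why `i == htonl(i)` holds only on a big-endian host: there the conversion leaves the value unchanged.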



高冷爸爸 2024-11-17 06:31:49

You're correct that that code doesn't have well-defined behavior. Here's how to do it portably:

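#include <cstring>

// Copy the object representation into a char array and inspect the first byte.
bool is_big_endian()
{
    static unsigned const i = 1u;
    char c[sizeof(unsigned)] = { };
    std::memcpy(c, &i, sizeof(c));
    return !c[0];  // the first byte of 1u is zero on a big-endian machine
}

// or, alternatively: read the first byte through a char pointer

bool is_big_endian()
{
    static unsigned const i = 1u;
    return !*static_cast<char const*>(static_cast<void const*>(&i));
}

原谅过去的我 2024-11-17 06:31:49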

The function should be named is_little_endian. I think you can use this union trick. Or also a cast to char.
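The naming point can be sketched as follows. This is only an illustration: it copies the bytes with `std::memcpy` instead of using the union trick, and `byte_at` is a hypothetical helper name that does not appear anywhere in the thread. It also assumes `sizeof(unsigned) > 1`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the byte stored at offset `index` within `value`,
// obtained by copying the object representation into an unsigned char array.
unsigned char byte_at(unsigned value, std::size_t index)
{
    assert(index < sizeof(value));
    unsigned char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value));
    return bytes[index];
}

// The check from the question (first byte == 1) detects a little-endian host.
bool is_little_endian()
{
    return byte_at(1u, 0) == 1;
}

// On a big-endian host the least significant byte is stored last.
bool is_big_endian()
{
    return byte_at(1u, sizeof(unsigned) - 1) == 1;
}
```

With both functions written out, it is easy to see that the test `foo.c[0] == 1` in the question corresponds to `is_little_endian`.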

感情旳空白 2024-11-17 06:31:49

The code has undefined behavior, although some (most?) compilers will
define it, at least in limited cases.

The intent of the standard is that reinterpret_cast be used for
this. This intent isn't well expressed, however, since the standard
can't really define the behavior; there is no desire to define it when
the hardware won't support it (e.g. because of alignment issues). And
it's also clear that you can't just reinterpret_cast between two
arbitrary types and expect it to work.

From a quality of implementation point of view, I would expect both the
union trick and reinterpret_cast to work, if the union or the
reinterpret_cast is in the same functional block; the union should
work as long as the compiler can see that the ultimate type is a union
(although I've used compilers where this wasn't the case).
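As a rough illustration of the `reinterpret_cast` route, the sketch below limits itself to the one case that is generally safe: viewing an object's bytes through an `unsigned char` pointer, which the aliasing rules permit for any object type. The helper name `object_byte` is an assumption made for this example, and the check assumes `sizeof(unsigned) > 1`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: reads one byte of the object representation of `value`
// through an unsigned char pointer.
template <typename T>
unsigned char object_byte(T const& value, std::size_t index)
{
    assert(index < sizeof(T));
    return reinterpret_cast<unsigned char const*>(&value)[index];
}

// The first byte of 1u is zero on a big-endian host.
bool is_big_endian()
{
    unsigned const one = 1u;
    return object_byte(one, 0) == 0;
}
```

Casting between two unrelated non-character types, for example from `int*` to `float*`, is not covered by this sketch and is exactly the kind of arbitrary conversion the answer warns about.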
