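Is there any way to enforce a specific endianness for a C or C++ struct?

Posted on 2024-11-24 05:28:55

I have seen a few questions and answers about the endianness of structs, but they were about detecting the endianness of a system, or about converting data between the two byte orders.

What I would like to know, however, is whether there is a way to enforce a specific endianness for a given struct. Are there any good compiler directives or other simple solutions, short of rewriting the whole thing with a pile of macros that manipulate bitfields?

A general solution would be nice, but I would be happy with a gcc-specific solution as well.

Edit:

Thank you for all the comments pointing out why enforcing endianness is not a good idea, but in my case it is exactly what I need.

A large amount of data is generated by one specific processor (which will never change; it is an embedded system with custom hardware), and it has to be read by a program (the one I am working on) running on an unknown processor. Evaluating the data byte by byte would be horribly troublesome, because it consists of hundreds of different struct types that are both huge and deep: most of them contain many layers of other huge structs.

Changing the software on the embedded processor is out of the question. Its source code is available, which is why I intend to reuse the structs from that system instead of starting from scratch and evaluating all the data byte by byte.

This is why I need to tell the compiler which endianness to use, no matter how efficient or inefficient the result is.

It does not have to be a real change of endianness. Even if it is just an interface, and physically everything is handled in the processor's own endianness, that is perfectly acceptable to me.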

小矜持 2024-12-01 05:28:55

The way I usually handle this is like so:

#include <arpa/inet.h> // for ntohs() etc.
#include <stdint.h>

class be_uint16_t {
public:
        be_uint16_t() : be_val_(0) {
        }
        // Transparently cast from uint16_t
        be_uint16_t(const uint16_t &val) : be_val_(htons(val)) {
        }
        // Transparently cast to uint16_t
        operator uint16_t() const {
                return ntohs(be_val_);
        }
private:
        uint16_t be_val_;
} __attribute__((packed));

Similarly for be_uint32_t.

Then you can define your struct like this:

struct be_fixed64_t {
    be_uint32_t int_part;
    be_uint32_t frac_part;
} __attribute__((packed));

The point is that the compiler will almost certainly lay out the fields in the order you write them, so all you are really worried about is big-endian integers. The be_uint16_t object is a class that knows how to convert itself transparently between big-endian and machine-endian as required. Like this:

be_uint16_t x = 12;
x = x + 1; // Yes, this actually works
write(fd, &x, sizeof(x)); // writes 13 to file in big-endian form

In fact, if you compile that snippet with any reasonably good C++ compiler, you should find it emits a big-endian "13" as a constant.

With these objects, the in-memory representation is big-endian. So you can create arrays of them, put them in structures, etc. But when you go to operate on them, they magically cast to machine-endian. This is typically a single instruction on x86, so it is very efficient. There are a few contexts where you have to cast by hand:

be_uint16_t x = 37;
printf("x == %u\n", (unsigned)x); // cast by hand: varargs do not apply the conversion operator

...but for most code, you can just use them as if they were built-in types.
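The answer leaves be_uint32_t as an exercise, so here is a minimal sketch of what it might look like. It is not the answerer's code: it stores the four bytes explicitly and uses shifts instead of htonl()/ntohl(), which is an assumption made so that the example needs neither <arpa/inet.h> nor __attribute__((packed)). The check_be_fixed64() helper is hypothetical and only there to show the in-memory byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 32-bit counterpart of be_uint16_t. The value is kept as four
// raw bytes, most significant first, whatever the host byte order is.
class be_uint32_t {
public:
    be_uint32_t() : bytes_{0, 0, 0, 0} {}
    // Implicit conversion from a host-order value
    be_uint32_t(std::uint32_t val)
        : bytes_{static_cast<std::uint8_t>(val >> 24),
                 static_cast<std::uint8_t>(val >> 16),
                 static_cast<std::uint8_t>(val >> 8),
                 static_cast<std::uint8_t>(val)} {}
    // Implicit conversion back to a host-order value
    operator std::uint32_t() const {
        return (static_cast<std::uint32_t>(bytes_[0]) << 24) |
               (static_cast<std::uint32_t>(bytes_[1]) << 16) |
               (static_cast<std::uint32_t>(bytes_[2]) << 8) |
               static_cast<std::uint32_t>(bytes_[3]);
    }
private:
    std::uint8_t bytes_[4];  // byte array, so alignment is 1
};

struct be_fixed64_t {
    be_uint32_t int_part;
    be_uint32_t frac_part;
};

// Fail the build instead of silently reading a wrong layout.
static_assert(sizeof(be_uint32_t) == 4, "unexpected padding");
static_assert(sizeof(be_fixed64_t) == 8, "unexpected padding");

// Hypothetical self-check: arithmetic works on host-order values while the
// bytes in memory stay big-endian.
inline void check_be_fixed64() {
    be_fixed64_t v;
    v.int_part = 0x11223344u;
    v.frac_part = v.frac_part + 1;
    unsigned char raw[sizeof v];
    std::memcpy(raw, &v, sizeof v);
    assert(raw[0] == 0x11 && raw[3] == 0x44);
    assert(raw[7] == 0x01);
}
```

Storing bytes rather than a uint32_t trades the single byte-swap instruction for a few shifts, but it also sidesteps unaligned access when such fields end up at odd offsets inside a larger struct.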

相权↑美人 2024-12-01 05:28:55

A bit late to the party but with current GCC (tested on 6.2.1 where it works and 4.9.2 where it's not implemented) there is finally a way to declare that a struct should be kept in X-endian byte order.

The following test program:

#include <stdio.h>
#include <stdint.h>

struct __attribute__((packed, scalar_storage_order("big-endian"))) mystruct {
    uint16_t a;
    uint32_t b;
    uint64_t c;
};


int main(void) {
    struct mystruct bar = {.a = 0xaabb, .b = 0xff0000aa, .c = 0xabcdefaabbccddee};

    FILE *f = fopen("out.bin", "wb");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    /* Dump the raw in-memory bytes of the struct. */
    size_t written = fwrite(&bar, sizeof(struct mystruct), 1, f);
    fclose(f);
    return written == 1 ? 0 : 1;
}

creates a file "out.bin" which you can inspect with a hex editor (e.g. hexdump -C out.bin). If the scalar_storage_order attribute is supported, the file will contain the expected 0xaabbff0000aaabcdefaabbccddee, in this order and without holes. Sadly, this is of course very compiler-specific.

浮生面具三千个 2024-12-01 05:28:55

Try using
#pragma scalar_storage_order big-endian to store in big-endian format
#pragma scalar_storage_order little-endian to store in little-endian format
#pragma scalar_storage_order default to store in your machine's default endianness

Read more here

锦欢 2024-12-01 05:28:55

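No, I don't think so.

Endianness is a property of the processor that indicates whether integers are represented from left to right or from right to left; it is not a property of the compiler.

The best you can do is write code that is independent of any byte order.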

相思碎 2024-12-01 05:28:55

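No, there is no such capability. If it existed, compilers might have to generate excessive or inefficient code, so C++ simply does not support it.

The usual C++ way to deal with serialization (which I assume is what you are trying to solve) is to let the struct remain in memory in whatever layout is desired, and to do the serialization in such a way that endianness is preserved upon deserialization.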
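As a rough illustration of that approach, the sketch below writes and reads each field one byte at a time, most significant byte first, so the host byte order never enters the picture. The Sample struct, its fields, and the function names are made up for this example and do not come from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record; the name and fields are for illustration only.
struct Sample {
    std::uint16_t id;
    std::uint32_t length;
};

// 2 + 4 bytes on the wire; sizeof(Sample) may be larger because of padding.
constexpr std::size_t kSampleWireSize = 6;

// Write the fields most significant byte first, whatever the host order is.
inline void serialize_sample(const Sample& s, unsigned char* out) {
    out[0] = static_cast<unsigned char>((s.id >> 8) & 0xFF);
    out[1] = static_cast<unsigned char>(s.id & 0xFF);
    out[2] = static_cast<unsigned char>((s.length >> 24) & 0xFF);
    out[3] = static_cast<unsigned char>((s.length >> 16) & 0xFF);
    out[4] = static_cast<unsigned char>((s.length >> 8) & 0xFF);
    out[5] = static_cast<unsigned char>(s.length & 0xFF);
}

// Rebuild the host-order fields from the big-endian bytes.
inline Sample deserialize_sample(const unsigned char* in) {
    Sample s;
    s.id = static_cast<std::uint16_t>((in[0] << 8) | in[1]);
    s.length = (static_cast<std::uint32_t>(in[2]) << 24) |
               (static_cast<std::uint32_t>(in[3]) << 16) |
               (static_cast<std::uint32_t>(in[4]) << 8) |
               static_cast<std::uint32_t>(in[5]);
    return s;
}

// Hypothetical round-trip check.
inline void check_sample_round_trip() {
    const Sample in{0x1234, 0xAABBCCDDu};
    unsigned char buf[kSampleWireSize];
    serialize_sample(in, buf);
    const Sample out = deserialize_sample(buf);
    assert(out.id == in.id && out.length == in.length);
}
```

With hundreds of nested structs, as in the question, functions like these would probably have to be generated from the embedded system's headers rather than written by hand.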

不可一世的女人 2024-12-01 05:28:55

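I am not sure whether the following can be adapted to your purposes, but where I work we have found it quite useful in many cases.

When endianness matters, we use two different data structures. One represents the data as it is expected to arrive. The other is how we want it represented in memory. Conversion routines are then written to switch between the two.

The workflow operates like this...

Read the data into the raw structure.
Convert the "raw structure" to the "in-memory version".
Operate only on the "in-memory version".
When done operating on it, convert the "in-memory version" back to the "raw structure" and write it out.

We find this decoupling useful because (but not limited to)...

All conversions are located in one place only.
There are fewer headaches about memory alignment when working with the "in-memory version".
It makes porting from one architecture to another much easier (fewer endianness issues).

Hopefully this decoupling can be useful to your application too.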
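The workflow above might be sketched as follows. This is only an illustration under stated assumptions: the RawHeader and Header types, their fields, and the to_memory()/to_raw() names are hypothetical, and the incoming data is assumed to be big-endian.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// "Raw structure": mirrors the bytes exactly as they arrive (big-endian here).
// It contains only byte arrays, so no padding is expected between members.
struct RawHeader {
    unsigned char id[2];
    unsigned char length[4];
};
static_assert(sizeof(RawHeader) == 6, "RawHeader must match the wire layout");

// "In-memory version": ordinary host-order fields with natural alignment.
struct Header {
    std::uint16_t id;
    std::uint32_t length;
};

// All conversions live in these two routines.
inline Header to_memory(const RawHeader& raw) {
    Header h;
    h.id = static_cast<std::uint16_t>((raw.id[0] << 8) | raw.id[1]);
    h.length = (static_cast<std::uint32_t>(raw.length[0]) << 24) |
               (static_cast<std::uint32_t>(raw.length[1]) << 16) |
               (static_cast<std::uint32_t>(raw.length[2]) << 8) |
               static_cast<std::uint32_t>(raw.length[3]);
    return h;
}

inline RawHeader to_raw(const Header& h) {
    RawHeader raw;
    raw.id[0] = static_cast<unsigned char>((h.id >> 8) & 0xFF);
    raw.id[1] = static_cast<unsigned char>(h.id & 0xFF);
    raw.length[0] = static_cast<unsigned char>((h.length >> 24) & 0xFF);
    raw.length[1] = static_cast<unsigned char>((h.length >> 16) & 0xFF);
    raw.length[2] = static_cast<unsigned char>((h.length >> 8) & 0xFF);
    raw.length[3] = static_cast<unsigned char>(h.length & 0xFF);
    return raw;
}

// Hypothetical walk through the four steps: read, convert, operate, convert back.
inline void check_header_workflow() {
    const unsigned char wire[6] = {0x12, 0x34, 0x00, 0x00, 0x01, 0x00};
    RawHeader raw;
    std::memcpy(&raw, wire, sizeof raw);   // 1. read into the raw structure
    Header h = to_memory(raw);             // 2. convert to the in-memory version
    h.length += 1;                         // 3. operate on host-order fields
    const RawHeader out = to_raw(h);       // 4. convert back before writing out
    assert(h.id == 0x1234 && h.length == 257);
    assert(out.length[2] == 0x01 && out.length[3] == 0x01);
}
```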

甜嗑 2024-12-01 05:28:55

A possible innovative solution would be to use a C interpreter such as Ch and force its endian coding to big-endian.

芯好空 2024-12-01 05:28:55

Boost provides endian buffers for this.

For example:

#include <boost/endian/buffers.hpp>
#include <boost/static_assert.hpp>

using namespace boost::endian;

struct header {
    big_int32_buf_t     file_code;
    big_int32_buf_t     file_length;
    little_int32_buf_t  version;
    little_int32_buf_t  shape_type;
};
BOOST_STATIC_ASSERT(sizeof(header) == 16U);

夜空下最亮的亮点 2024-12-01 05:28:55

Maybe not a direct answer, but reading through this question may address some of your concerns.

酷到爆炸 2024-12-01 05:28:55

You could make the structure a class with getters and setters for the data members. The getters and setters are implemented with something like:

int getSomeValue( void ) const {
#if defined( BIG_ENDIAN )
    return _value;
#else
    return convert_to_little_endian( _value );
#endif
}

void setSomeValue( int newValue) {
#if defined( BIG_ENDIAN )
    _value = newValue;
#else
    _value = convert_to_big_endian( newValue );
#endif
}

We do this sometimes when we read a structure in from a file - we read it into a struct and use this on both big-endian and little-endian machines to access the data properly.
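For a fuller picture, here is one way the getter/setter idea could be written out. Everything in it is an assumption for illustration: the Record class, swap32(), rawBytes() and kHostIsBigEndian are hypothetical names, and host detection relies on the __BYTE_ORDER__ macro that GCC and Clang predefine. A plain `#if defined( BIG_ENDIAN )` test can be misleading if a system header such as glibc's <endian.h> is pulled in, because that header defines BIG_ENDIAN on little-endian hosts too.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: GCC and Clang predefine __BYTE_ORDER__; any other compiler is
// treated as little-endian here, which may not hold everywhere.
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
constexpr bool kHostIsBigEndian = true;
#else
constexpr bool kHostIsBigEndian = false;
#endif

// Reverse the four bytes of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Hypothetical record whose single field is always stored big-endian.
class Record {
public:
    std::uint32_t getSomeValue() const {
        return kHostIsBigEndian ? value_ : swap32(value_);
    }
    void setSomeValue(std::uint32_t newValue) {
        value_ = kHostIsBigEndian ? newValue : swap32(newValue);
    }
    // Exposes the stored bytes, e.g. for writing the record to a file.
    const unsigned char* rawBytes() const {
        return reinterpret_cast<const unsigned char*>(&value_);
    }
private:
    std::uint32_t value_ = 0;  // holds the big-endian byte pattern
};

// Hypothetical check: callers see host-order values, memory stays big-endian.
inline void check_record() {
    Record r;
    r.setSomeValue(0x11223344u);
    assert(r.getSomeValue() == 0x11223344u);
    assert(r.rawBytes()[0] == 0x11 && r.rawBytes()[3] == 0x44);
}
```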
