Converting/extracting ints from a char array

Posted 2024-12-08 22:11:54 · 706 characters · 4 views · 0 comments

I have a C string (a char buffer) filled by a call to gzread. I know the data consists of blocks, and each block consists of an unsigned int, a char, an int, and an unsigned short int.

So I was wondering what the standard way of splitting this cstring into the appropriate variables is.

Say the first 4 bytes are an unsigned int, the next byte is a char, the next 4 bytes are a signed int, and the last 2 bytes are an unsigned short int.

//Some pseudocode below which would work
char buf[11];
unsigned int a;
char b;
int c;
unsigned short int d;

I guess I could memcpy, with appropriate offsets.

memcpy(&a, buf, sizeof(unsigned int));
memcpy(&b, buf+4, sizeof(char));
memcpy(&c, buf+5, sizeof(int));
memcpy(&d, buf+9, sizeof(unsigned short int));

Or is it better to use bitwise operators, like shifting and masking?

Or would it be better to gzread all 11 bytes directly into a struct, if that is even possible? Is the memory layout of a struct fixed, and would this work with gzread?

Comments (4)

不可一世的女人 2024-12-15 22:11:54

If you pack the struct (read up on the __packed__ attribute), you can rely on the member order and on no padding being inserted between members. Hence, you could read into a struct directly. However, I'm not sure about the portability of this solution.

Otherwise, use pointer magic and casting like so:

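// Caution: these casts assume buffer + offset is suitably aligned for the
// target type; a misaligned or aliasing-violating read is undefined behavior.
char *buffer; // must point at the bytes filled in by gzread
int a = *(reinterpret_cast<int*>(buffer));
unsigned short b = *(reinterpret_cast<unsigned short*>(buffer + sizeof(int)));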
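
As a rough illustration of the packed-struct idea, here is a minimal sketch. It assumes GCC or Clang (the `__attribute__((packed))` syntax is a compiler extension, not standard C++), the 4/1/4/2-byte layout from the question, and data stored in host byte order. The names `Record`, `PackedRecord` and `load_record` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Natural layout: the compiler may insert padding after `b`,
// so sizeof(Record) is typically larger than 11.
struct Record {
    std::uint32_t a;
    char          b;
    std::int32_t  c;
    std::uint16_t d;
};

// Packed layout (GCC/Clang extension): members are laid out
// back to back with no padding.
struct __attribute__((packed)) PackedRecord {
    std::uint32_t a;
    char          b;
    std::int32_t  c;
    std::uint16_t d;
};

static_assert(sizeof(PackedRecord) == 11,
              "packed record must match the 11-byte block");

// Copy one 11-byte block into the packed struct.
// The values are interpreted in host byte order.
inline PackedRecord load_record(const char* buf) {
    PackedRecord r;
    std::memcpy(&r, buf, sizeof r);
    return r;
}
```

Reading members of the packed struct by value is fine, but taking a pointer or reference to a misaligned member can be a problem on some architectures.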

停顿的约定 2024-12-15 22:11:54

You need to make sure that the byte order of the file matches the processor architecture you're running your code on. If, for instance, the integers are written to the file most significant byte first and your processor uses least-significant-byte-first order, you will get garbage results.

If you want to make your code portable from one architecture to another, you should wrap all read and write operations for integers behind macros or inline functions that manage the byte order for you depending on the target processor architecture.
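
One possible shape for such wrappers is sketched below. It assumes the host is either purely little-endian or purely big-endian, and all helper names (`host_is_little_endian`, `swap32`, `read_be32`, `read_le32`) are made up for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Detect the host byte order at run time by inspecting the
// first byte of a known 16-bit value.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Reverse the byte order of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

// Read a 32-bit value that the file stores most significant byte first.
inline std::uint32_t read_be32(const char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return host_is_little_endian() ? swap32(v) : v;
}

// Read a 32-bit value that the file stores least significant byte first.
inline std::uint32_t read_le32(const char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return host_is_little_endian() ? v : swap32(v);
}
```

Callers then pick the reader that matches the file format and never have to mention the host architecture themselves.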

鹿童谣 2024-12-15 22:11:54

It depends on how the input data is defined. If it's defined to be in host-endian order (that is, the endianness always matches the system on which your code is running), then the memcpy() you have shown is a good, portable method to use.

Alternatively, if the input data is defined to have a particular endianness, then the best portable solution is to load it one unsigned char at a time, using shifts and bitwise-or.
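
A minimal sketch of the shift-and-or approach for the 11-byte block from the question follows. It assumes the file is defined as little-endian, which the question does not state, and the names `Block`, `load_le32`, `load_le16` and `decode_block` are hypothetical.

```cpp
#include <cstdint>

// One decoded block (hypothetical name), matching the 4/1/4/2-byte layout.
struct Block {
    std::uint32_t a;
    char          b;
    std::int32_t  c;
    std::uint16_t d;
};

// Assemble a 32-bit value from 4 bytes, least significant byte first.
inline std::uint32_t load_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])        |
           (static_cast<std::uint32_t>(p[1]) << 8)  |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}

// Assemble a 16-bit value from 2 bytes, least significant byte first.
inline std::uint16_t load_le16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Decode one 11-byte block. The result does not depend on the host
// byte order or on struct padding.
inline Block decode_block(const unsigned char* buf) {
    Block r;
    r.a = load_le32(buf);
    r.b = static_cast<char>(buf[4]);
    // Unsigned-to-signed conversion: wraps as two's complement on
    // mainstream compilers (guaranteed by the standard since C++20).
    r.c = static_cast<std::int32_t>(load_le32(buf + 5));
    r.d = load_le16(buf + 9);
    return r;
}
```

Using `unsigned char` for the buffer matters here: with plain `char`, bytes of 0x80 and above may be sign-extended before the shift.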

却一份温柔 2024-12-15 22:11:54

You need a specification of the format before you can do anything. Is
it text or binary (presumably binary from your description, but one
never knows)? What is the representation used for signed values? What
is the byte order? memcpy will only work if your machine architecture
corresponds exactly to that of the input format—a rare case today,
since almost all network formats are big-endian, and the most widespread
architectures are little-endian. (Most formats and architectures today
use 2's complement for negative values, so you can often "assume"
compatibility there. But there are exceptions.)

Given this, mathematical reconstruction of the value (using masking and
shifting, or multiplications) is the only portable solution. Depending
on the machine and the quality of the compiler, it could easily result
in better performance as well.
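
To make the reconstruction concrete, here is a small sketch using multiplication. It assumes a big-endian format with 2's complement signed values, and also shows one way to recover the signed value without depending on how the host represents negative numbers. Both function names are hypothetical.

```cpp
#include <cstdint>

// Rebuild an unsigned 32-bit value from 4 bytes stored most
// significant byte first, using multiplication instead of shifts.
inline std::uint32_t be32_from_bytes(const unsigned char* p) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        v = v * 256u + p[i];
    }
    return v;
}

// Interpret a 32-bit pattern as a 2's complement signed value using
// only arithmetic, without an out-of-range unsigned-to-signed cast.
inline std::int32_t signed_from_twos_complement(std::uint32_t u) {
    if (u <= 0x7FFFFFFFu) {
        return static_cast<std::int32_t>(u);  // non-negative: value fits
    }
    // For negative patterns, value == -(~u) - 1, and ~u fits in int32_t.
    const std::uint32_t inverted = static_cast<std::uint32_t>(~u);
    return -static_cast<std::int32_t>(inverted) - 1;
}
```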
