Using bitwise operators in C++ to change 4 chars to an int

Posted 2024-10-22 05:56:40

What I must do is open a file in binary mode that contains stored data intended to be interpreted as integers. I have seen other examples, such as the Stack Overflow question "Reading “integer” size bytes from a char* array", but I want to try a different approach (I may just be stubborn, or stupid :/). I first created a simple binary file in a hex editor that reads as follows.

00 00 00 47 00 00 00 17 00 00 00 41
This (should) equal 71, 23, and 65 if the 12 bytes were divided into 3 integers.

After opening this file in binary mode and reading 4 bytes into an array of chars, how can I use bitwise operations to make the bits of char[0] the first 8 bits of an int, and so on, until the bits of each char are part of the int?

 
My integer = 00        00        00        00  
 +           ^         ^         ^         ^
Chars      Char[0]  Char[1]   Char[2]   Char[3]
             00        00        00        47


So my integer(hex) = 00 00 00 47 = numerical value of 71

Also, I don't know how the endianness of my system comes into play here, so is there anything that I need to keep in mind?

Here is a code snippet of what I have so far; I just don't know the next steps to take.


std::fstream myfile;
myfile.open("C:\\Users\\Jacob\\Desktop\\hextest.txt", std::ios::in | std::ios::out | std::ios::binary);
if(myfile.is_open() == false)
{
    std::cout << "Error" << std::endl;
}
char* mychar;
std::cout << myfile.is_open() << std::endl;
mychar = new char[4];
myfile.read(mychar, 4);

I eventually plan on reading floats from a file, and maybe a custom data type, but first I just need to get more familiar with using bitwise operations.
Thanks.

Comments (2)

笑忘罢 2024-10-29 05:56:40

You want the bitwise left shift operator:

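typedef unsigned char u8;  // in case char is signed by default on your platform
// Widen each byte to unsigned before shifting, so a byte >= 0x80 shifted by 24 cannot overflow a signed int.
unsigned num = ((unsigned)(u8)chars[0] << 24) | ((unsigned)(u8)chars[1] << 16) | ((unsigned)(u8)chars[2] << 8) | (unsigned)(u8)chars[3];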

What it does is shift the left argument a specified number of bits to the left, adding zeros from the right as padding. For example, 2 << 1 is 4, since 2 is 10 in binary and shifting it one place to the left gives 100, which is 4.
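To make the shift concrete, here is a minimal sketch of placing a single byte at a chosen bit offset. The helper name `place_byte` is hypothetical (it is not part of the answer above), and the sketch assumes a 32-bit result is wanted. It also shows why the conversion to unsigned char comes first: a negative char would otherwise sign-extend when widened.

```cpp
#include <cstdint>

// Hypothetical helper (an illustration, not from the original answer):
// place one byte at a given bit offset inside a 32-bit unsigned value.
inline std::uint32_t place_byte(char c, unsigned shift)
{
    // Convert to unsigned char first so that a negative char (for example
    // 0xFF on a platform where char is signed) does not sign-extend when
    // it is widened to 32 bits.
    return static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << shift;
}
```

For instance, `place_byte(0x47, 8)` moves the byte 0x47 one byte position to the left; OR-ing four such values together with shifts of 24, 16, 8 and 0 gives the combined integer.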

This can be written in a more general loop form:

unsigned num = 0;
for (int i = 0; i != 4; ++i) {
    // widen to unsigned before shifting so the top byte cannot overflow a signed int
    num |= (unsigned)(u8)chars[i] << (24 - i * 8);    // += could have also been used
}

The endianness of your system doesn't matter here; you know the endianness of the representation in the file, which is constant (and therefore portable), so when you read in the bytes you know what to do with them. The internal representation of the integer in your CPU/memory may differ from that of the file, but the logical bitwise manipulation of it in code is independent of your system's endianness; the least significant bits are always at the right, and the most significant at the left (in code). That's why shifting is cross-platform: it operates at the logical bit level :-)

七禾 2024-10-29 05:56:40

Have you thought of using Boost.Spirit to make a binary parser? You might hit a bit of a learning curve when you start, but if you want to expand your program later to read floats and structured types, you'll have an excellent base to start from.

Spirit is very well-documented and is part of Boost. Once you get around to understanding its ins and outs, it's really mind-boggling what you can do with it, so if you have a bit of time to play around with it, I'd really recommend taking a look.

Otherwise, if you want your binary to be "portable" - i.e. you want to be able to read it on a big-endian and a little-endian machine, you'll need some sort of byte-order mark (BOM). That would be the first thing you'd read, after which you can simply read your integers byte by byte. Simplest thing would probably be to read them into a union (if you know the size of the integer you're going to read), like this:

union U
{
    unsigned char uc_[4];
    std::uint32_t ui_;  // exactly 4 bytes (needs <cstdint>); unsigned long is 8 bytes on many 64-bit platforms
};

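Read the data into the uc_ member, swap the bytes around if you need to change endianness, and read the value from the ui_ member. There's no shifting etc. to be done, except for the swapping if you want to change endianness. Note that reading a union member other than the one last written is formally undefined behavior in C++ (many compilers support it as an extension); copying the bytes into the integer with std::memcpy is the strictly conforming way to do the same thing.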
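As a sketch of the same "read raw bytes, swap if needed" idea using std::memcpy rather than a union: the names `host_is_little_endian` and `from_big_endian` are hypothetical, and the sketch assumes the file is big-endian and that the host is either plain little-endian or plain big-endian (no mixed byte orders).

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the host stores the least significant
// byte of an integer at the lowest address.
inline bool host_is_little_endian()
{
    const std::uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Hypothetical helper: interpret 4 bytes that are big-endian in the file
// as a host-order 32-bit unsigned value.
inline std::uint32_t from_big_endian(const unsigned char (&raw)[4])
{
    unsigned char bytes[4];
    std::copy(raw, raw + 4, bytes);
    if (host_is_little_endian()) {
        // File order and host order differ, so reverse the bytes.
        std::reverse(bytes, bytes + 4);
    }
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

Compared with the shift-based version, this approach depends on detecting the host byte order at run time, which is the extra bookkeeping the swapping step implies.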

HTH

rlc
