Reading a binary istream byte by byte

Posted 2024-10-29 13:49:05

I was attempting to read a binary file byte by byte using an ifstream. I've used istream methods like get() before to read entire chunks of a binary file at once without a problem. But my current task lends itself to going byte by byte and relying on the buffering in the I/O system to make it efficient. The problem is that I seem to reach the end of the file several bytes sooner than I should. So I wrote the following test program:

#include <iostream>
#include <fstream>

int main() {
    typedef unsigned char uint8;
    std::ifstream source("test.dat", std::ios_base::binary);
    while (source) {
        std::ios::pos_type before = source.tellg();
        uint8 x;
        source >> x;
        std::ios::pos_type after = source.tellg();
        std::cout << before << ' ' << static_cast<int>(x) << ' '
                  << after << std::endl;
    }
    return 0;
}

This dumps the contents of test.dat, one byte per line, showing the file position before and after.

Sure enough, if my file happens to have the two-byte sequence 0x0D-0x0A (which corresponds to carriage return and line feed), those bytes are skipped.

I've opened the stream in binary mode. Shouldn't that prevent it from interpreting line separators?
Do extraction operators always use text mode?
What's the right way to read byte by byte from a binary istream?

MSVC++ 2008 on Windows.

孤千羽 2024-11-05 13:49:06

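The >> extractors are for formatted input; they skip white space (by
default). For unformatted input of a single character, you can use
istream::get() (which returns an int: either EOF if the read fails, or
a value in the range [0, UCHAR_MAX]) or istream::get(char&) (which puts
the character read into the argument and returns something that
converts to bool: true if the read succeeds, false if it fails).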
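
As a rough illustration of the difference, here is a minimal sketch that reads the same bytes both ways. The helper names read_unformatted, read_formatted and demo are hypothetical (they are not from the question), and an in-memory std::istringstream stands in for test.dat. Since 0x0D and 0x0A both count as white space, operator>> silently drops them regardless of binary mode, while get(char&) returns every byte.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads every remaining byte with unformatted input.
// get(char&) returns the stream, which tests false once a read fails.
std::vector<unsigned char> read_unformatted(std::istream& in) {
    std::vector<unsigned char> bytes;
    char c;
    while (in.get(c)) {
        bytes.push_back(static_cast<unsigned char>(c));
    }
    return bytes;
}

// Hypothetical helper: the approach from the question, using operator>>.
// Formatted extraction skips white space (including CR and LF) by default.
std::vector<unsigned char> read_formatted(std::istream& in) {
    std::vector<unsigned char> bytes;
    unsigned char x;
    while (in >> x) {
        bytes.push_back(x);
    }
    return bytes;
}

// Small self-check on the four bytes 'A', 0x0D, 0x0A, 'B'.
void demo() {
    const std::string data("A\r\nB", 4);
    std::istringstream raw(data), formatted(data);
    assert(read_unformatted(raw).size() == 4);      // CR and LF are kept
    assert(read_formatted(formatted).size() == 2);  // CR and LF are skipped
}
```

With a file, the same loop would apply to a std::ifstream opened with std::ios_base::binary; testing the result of get(c) in the loop condition also avoids processing a stale value after the final failed read, which the while (source) form in the question does not.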