std::ofstream duplicates and loses written data for no apparent reason

Posted on 2024-12-16 21:05:49

I've just witnessed some insanely bizarre behaviour from the std::ofstream::write method. By way of introduction: I am writing my own handling of the Windows BMP file format, which includes saving a bitmap to a file. Here is the subroutine responsible for writing the bitmap file header, given a reference to a std::ofstream object.

void
BitmapFileManager::FileHeader::save(std::ofstream& fout) const
{
    word        w;  
    dword       dw; 
    char const* cw  = reinterpret_cast<char*>(w );
    char const* cdw = reinterpret_cast<char*>(dw);
    uint const  sw  = sizeof(w );
    uint const  sdw = sizeof(dw);

    fout.write(&sig1, sizeof(sig1));
    fout.write(&sig2, sizeof(sig2));
    dw = toLittleEndian(size);         fout.write(cdw, sdw);
    w  = toLittleEndian(reserved1);    fout.write(cw , sw );
    w  = toLittleEndian(reserved2);    fout.write(cw , sw );
    dw = toLittleEndian(pixelsOffset); fout.write(cdw, sdw);
}

The only thing to note here is that sig1 and sig2 are both of type char, sizeof(word) = 2 and sizeof(dword) = 4. This code should write a single byte to the file twice, then a four-byte chunk, two two-byte chunks, and finally a four-byte chunk.

Take a look at the hex dump of the result (some more data follows, but ignore it):

00000000  42 4d 42 4d 00 00 00 05  00 05 42 4d 00 00 28 00  |BMBM......BM..(.|

sig1 and sig2 are written twice: with their proper values, B and M, at the beginning, and for some strange reason again at the 11th and 12th bytes. I don't recognize the other values in this line. But look what happens if I add a debug byte after every write:

void
BitmapFileManager::FileHeader::save(std::ofstream& fout) const
{
    word        w;  
    dword       dw; 
    char const* cw  = reinterpret_cast<char*>(w );
    char const* cdw = reinterpret_cast<char*>(dw);
    uint const  sw  = sizeof(w );
    uint const  sdw = sizeof(dw);

    char nil = '*';
    fout.write(&sig1, sizeof(sig1));
    fout.write(&nil, sizeof(nil));
    fout.write(&sig2, sizeof(sig2));
    fout.write(&nil, sizeof(nil));
    dw = toLittleEndian(size);         fout.write(cdw, sdw);
    fout.write(&nil, sizeof(nil));
    w  = toLittleEndian(reserved1);    fout.write(cw , sw );
    fout.write(&nil, sizeof(nil));
    w  = toLittleEndian(reserved2);    fout.write(cw , sw );
    fout.write(&nil, sizeof(nil));
    dw = toLittleEndian(pixelsOffset); fout.write(cdw, sdw);
    fout.write(&nil, sizeof(nil));
}

The hex dump becomes:

00000000  42 2a 4d 2a 6c 3b 78 a0  2a 00 05 2a 00 05 2a 6c  |B*M*l;x?*..*..*l|
00000010  3b 78 a0 2a 28 00 00 00  28 00 00 00 28 00 00 00  |;x?*(...(...(...|

It seems to be perfectly all right. There are no duplicates, and the * divides the output into a sequence of 1-1-4-2-2-4 bytes, as it should. Could someone help me find the reason for this? Is it a compiler bug? I use gcc version 4.0.1 (Apple Inc. build 5490) on Mac OS X Leopard with -O2, but other optimization levels didn't change anything.

Comments (1)

机场等船 2024-12-23 21:05:49

Adam Rosenfield nailed it. You were taking the contents of w (a word) and treating them as a pointer... and then all hell broke loose. Same thing for dw.

Random garbage after that...
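
For illustration, here is a minimal sketch of what the fix might look like: the cast is applied to the address of the variable (`&w`, `&dw`) rather than to its value. It assumes `word` and `dword` are 2- and 4-byte unsigned integers. The helpers `writeRaw` and `serializeFileHeader`, and this particular `toLittleEndian`, are hypothetical stand-ins and not the code from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>

// Assumed stand-ins for the question's typedefs.
typedef std::uint16_t word;   // 2 bytes
typedef std::uint32_t dword;  // 4 bytes

// Hypothetical toLittleEndian: returns a value whose in-memory bytes are
// ordered least significant first, regardless of the host's byte order.
template <typename T>
T toLittleEndian(T value)
{
    unsigned char bytes[sizeof(T)];
    for (std::size_t i = 0; i < sizeof(T); ++i)
        bytes[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
    T result;
    std::memcpy(&result, bytes, sizeof(T));
    return result;
}

// Writes the raw bytes of `value`. Note the `&value`: the pointer passed to
// write() is the ADDRESS of the object, not the object's contents.
template <typename T>
void writeRaw(std::ostream& out, T const& value)
{
    out.write(reinterpret_cast<char const*>(&value), sizeof(value));
}

// Serializes the six header fields in the 1-1-4-2-2-4 layout from the question.
std::string serializeFileHeader(char sig1, char sig2, dword size,
                                word reserved1, word reserved2,
                                dword pixelsOffset)
{
    std::ostringstream out;
    out.write(&sig1, sizeof(sig1));
    out.write(&sig2, sizeof(sig2));
    writeRaw(out, toLittleEndian(size));
    writeRaw(out, toLittleEndian(reserved1));
    writeRaw(out, toLittleEndian(reserved2));
    writeRaw(out, toLittleEndian(pixelsOffset));
    assert(out.good());
    return out.str();
}
```

Taking the argument by reference in `writeRaw` means each call reads the current value, which avoids caching a pointer such as `cw` or `cdw` up front. Writing to a `std::ostringstream` here is only to make the result easy to inspect; the same calls work on a `std::ofstream` opened in binary mode.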
