C++ binary file I/O to/from containers (other than char *) using STL algorithms
I'm attempting a simple test of binary file I/O using the STL copy algorithm to copy data to/from containers and a binary file. See below:
#include <iostream>
#include <iterator>
#include <fstream>
#include <vector>
#include <algorithm>
#include <cstdio>   // printf
#include <cstdlib>  // rand

using namespace std;

typedef std::ostream_iterator<double> oi_t;
typedef std::istream_iterator<double> ii_t;

int main () {

    // generate some data to test
    std::vector<double> vd;
    for (int i = 0; i < 20; i++)
    {
        double d = rand() / 1000000.0;
        vd.push_back(d);
    }

    // perform output to a binary file
    ofstream output ("temp.bin", ios::binary);
    copy (vd.begin(), vd.end(), oi_t(output, (char *)NULL));
    output.close();

    // input from the binary file to a container
    std::vector<double> vi;
    ifstream input ("temp.bin", ios::binary);
    ii_t ii(input);
    copy (ii, ii_t(), back_inserter(vi));
    input.close();

    // output data to screen to verify/compare the results
    for (size_t i = 0; i < vd.size(); i++)
        printf ("%8.4f %8.4f\n", vd[i], vi[i]);

    printf ("vd.size() = %zu\tvi.size() = %zu\n", vd.size(), vi.size());
    return 0;
}
The resulting output is as follows and has two problems, afaik:
1804.2894 1804.2985
846.9309 0.9312
1681.6928 0.6917
1714.6369 0.6420
1957.7478 0.7542
424.2383 0.2387
719.8854 0.8852
1649.7605 0.7660
596.5166 0.5171
1189.6414 0.6410
1025.2024 0.2135
1350.4900 0.4978
783.3687 0.3691
1102.5201 0.5220
2044.8978 0.9197
1967.5139 0.5114
1365.1805 0.1815
1540.3834 0.3830
304.0892 0.0891
1303.4557 0.4600
vd.size() = 20 vi.size() = 20
1) Every double read from the binary data is missing the information before the decimal point.
2) The data is mangled at the 3rd decimal place (or earlier) and some arbitrary error is being introduced.
Any help would be appreciated. (I would love for someone to point me to a previous post about this, as I've come up short in my search.)
Comments (4)
For question 1): you need to specify a separator (for example, a space). The integer part of each number got stuck to the fractional part of the previous number. Casting and using NULL is generally wrong in C++; that should have been a hint ;)
For question 2)
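A small sketch of the separator fix, using a std::stringstream in place of temp.bin so that it stays self-contained. The helper name text_round_trip is made up for illustration. With an empty separator the text becomes one run of digits and dots, so the parser splits it in the wrong places; with a space, each number is read back whole. The values still come back rounded, because operator<< writes text using the stream's default precision of six significant digits, and opening the file with ios::binary does not change that.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <vector>

// Write the values as text with the given separator, then parse them back.
// Passing "" as the separator reproduces the problem from the question.
std::vector<double> text_round_trip(const std::vector<double>& values,
                                    const char* separator)
{
    std::stringstream stream;
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<double>(stream, separator));

    std::vector<double> result;
    std::copy(std::istream_iterator<double>(stream),
              std::istream_iterator<double>(),
              std::back_inserter(result));
    return result;
}
```

For example, 1804.289383 and 846.930886 are written as "1804.29" and "846.931". Without a separator the stream holds "1804.29846.931", which parses as 1804.29846 followed by .931, matching the pattern in the question's output.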
To write binary data using std::copy(), I would do this:
Thus the call to std::copy is:
The input iterator is slightly more complicated as we have to test for the end of the stream.
The call to read the input is now:
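The code that went with this answer is not shown here. What follows is a minimal sketch of the idea and not the answerer's original code: the class names binary_ostream_iterator and binary_istream_iterator and the helper binary_round_trip are hypothetical. The output iterator writes the raw bytes of each value; the input iterator turns itself into the end iterator when a read fails, which is the end-of-stream test mentioned above. It assumes T is trivially copyable and that the data is read back on a machine with the same byte order and type sizes.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical output iterator: assigning a T writes its raw bytes.
template <typename T>
class binary_ostream_iterator {
public:
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    explicit binary_ostream_iterator(std::ostream& out) : out_(&out) {}

    binary_ostream_iterator& operator=(const T& value) {
        out_->write(reinterpret_cast<const char*>(&value), sizeof(T));
        return *this;
    }
    // Dereference and increment are no-ops, as with std::ostream_iterator.
    binary_ostream_iterator& operator*() { return *this; }
    binary_ostream_iterator& operator++() { return *this; }
    binary_ostream_iterator& operator++(int) { return *this; }

private:
    std::ostream* out_;
};

// Hypothetical input iterator: reads sizeof(T) bytes per step.
template <typename T>
class binary_istream_iterator {
public:
    using iterator_category = std::input_iterator_tag;
    using value_type = T;
    using difference_type = std::ptrdiff_t;
    using pointer = const T*;
    using reference = const T&;

    binary_istream_iterator() : in_(nullptr), value_() {}  // end iterator
    explicit binary_istream_iterator(std::istream& in) : in_(&in), value_() {
        read();
    }

    const T& operator*() const { return value_; }
    binary_istream_iterator& operator++() { read(); return *this; }
    binary_istream_iterator operator++(int) {
        binary_istream_iterator old = *this;
        read();
        return old;
    }
    friend bool operator==(const binary_istream_iterator& a,
                           const binary_istream_iterator& b) {
        return a.in_ == b.in_;
    }
    friend bool operator!=(const binary_istream_iterator& a,
                           const binary_istream_iterator& b) {
        return !(a == b);
    }

private:
    void read() {
        // A short or failed read turns this iterator into the end iterator.
        if (in_ && !in_->read(reinterpret_cast<char*>(&value_), sizeof(T)))
            in_ = nullptr;
    }
    std::istream* in_;
    T value_;
};

// Write the values as raw bytes and read them back with std::copy.
std::vector<double> binary_round_trip(const std::vector<double>& values) {
    std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);
    std::copy(values.begin(), values.end(),
              binary_ostream_iterator<double>(stream));

    std::vector<double> result;
    std::copy(binary_istream_iterator<double>(stream),
              binary_istream_iterator<double>(),
              std::back_inserter(result));
    return result;
}
```

With files, the same two std::copy calls would take a std::ofstream and a std::ifstream opened with std::ios::binary in place of the stringstream.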
You could set the precision using setprecision, as Tristram pointed out, and you do need a delimiter. See cppreference for how operator= functions. There is no format set, so you will need to set it on output.
I would tend to favor using fixed to eliminate precision problems. There have been many cases where someone thought "we'll never need more than 5 digits", so they hardcoded a precision everywhere. Those are costly bugs to have to correct.
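A sketch of setting the format on the output stream before copying, under the assumption that the data should survive a text round trip unchanged. The function name precise_text_round_trip is hypothetical. It uses std::numeric_limits<double>::max_digits10 rather than a hardcoded digit count; in the default notation the precision counts significant digits, whereas after std::fixed it counts digits after the decimal point.

```cpp
#include <algorithm>
#include <iomanip>
#include <iterator>
#include <limits>
#include <sstream>
#include <vector>

// Write doubles as space-separated text with enough digits to recover them.
std::vector<double> precise_text_round_trip(const std::vector<double>& values)
{
    std::stringstream stream;
    // max_digits10 is 17 for an IEEE 754 double.
    stream << std::setprecision(std::numeric_limits<double>::max_digits10);
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<double>(stream, " "));

    std::vector<double> result;
    std::copy(std::istream_iterator<double>(stream),
              std::istream_iterator<double>(),
              std::back_inserter(result));
    return result;
}
```

The precision set with std::setprecision stays in effect for every later write to the same stream, so it only needs to be set once before the std::copy call.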
I have come up with a better design for binary I/O. The fundamental approach is to have three methods: size_on_stream, load_from_buffer, and store_to_buffer. These go into an interface class so that all classes that support binary I/O inherit it.
The size_on_stream method returns the size of the data as transmitted on the stream. Generally, this does not include padding bytes. This should be recursive, such that a class calls the method on all of its members.
The load_from_buffer method is passed a reference to a pointer to a buffer (unsigned char * &). The method loads the object's data members from the buffer, incrementing the pointer after every member (or incrementing once after all the members).
The store_to_buffer method stores data into the given buffer and increments the pointer.
The client calls size_on_stream to determine the size of all the data. A buffer of this size is dynamically allocated. Another pointer to this buffer is passed to store_to_buffer to store the object's members into the buffer. Finally, the client uses a binary write (fwrite or std::ostream::write) to transfer the buffer to the stream.
Some of the benefits of this technique are packing, abstraction and block I/O. The objects pack their members into the buffer. The process for writing into the buffer is hidden from the client. The client can use block I/O functions, which are usually more efficient than transferring individual members.
This design is also more portable, as the objects can take care of endianness. There is a simple method for this, which is left up to the reader.
I have expanded this concept to incorporate POD (Plain Old Data) types as well, which is left as an exercise for the reader.
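The design described above can be sketched roughly as follows. Only the three method names and the unsigned char * & parameter come from the answer; the interface name binary_io, the example class sample, the helpers write_object and read_object, and the exact signatures are assumptions made for illustration. The sketch copies members in host byte order, so it does not address endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Interface that every class supporting binary I/O would inherit.
class binary_io {
public:
    virtual ~binary_io() = default;
    virtual std::size_t size_on_stream() const = 0;
    virtual void store_to_buffer(unsigned char*& buffer) const = 0;
    virtual void load_from_buffer(unsigned char*& buffer) = 0;
};

// Hypothetical example class with two members and no padding on the stream.
struct sample : binary_io {
    int id = 0;
    double value = 0.0;

    std::size_t size_on_stream() const override {
        return sizeof id + sizeof value;
    }
    void store_to_buffer(unsigned char*& buffer) const override {
        std::memcpy(buffer, &id, sizeof id);
        buffer += sizeof id;
        std::memcpy(buffer, &value, sizeof value);
        buffer += sizeof value;
    }
    void load_from_buffer(unsigned char*& buffer) override {
        std::memcpy(&id, buffer, sizeof id);
        buffer += sizeof id;
        std::memcpy(&value, buffer, sizeof value);
        buffer += sizeof value;
    }
};

// Client side: size the buffer, let the object pack itself, write one block.
void write_object(std::ostream& out, const binary_io& object) {
    std::vector<unsigned char> storage(object.size_on_stream());
    unsigned char* cursor = storage.data();
    object.store_to_buffer(cursor);
    assert(cursor == storage.data() + storage.size());
    out.write(reinterpret_cast<const char*>(storage.data()),
              static_cast<std::streamsize>(storage.size()));
}

// Client side: read one block, then let the object unpack itself.
bool read_object(std::istream& in, binary_io& object) {
    std::vector<unsigned char> storage(object.size_on_stream());
    if (!in.read(reinterpret_cast<char*>(storage.data()),
                 static_cast<std::streamsize>(storage.size())))
        return false;
    unsigned char* cursor = storage.data();
    object.load_from_buffer(cursor);
    return true;
}
```

A class with members that are themselves binary_io objects would sum their size_on_stream results and forward the same pointer to each member's store_to_buffer and load_from_buffer in turn, which is the recursion the answer describes.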