C++ binary file I/O to/from containers (other than char *) using STL algorithms

Posted on 2024-08-13 20:53:54

I'm attempting a simple test of binary file I/O using the STL copy algorithm to copy data to/from containers and a binary file. See below:

#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <iterator>
#include <fstream>
#include <vector>
#include <algorithm>

using namespace std;

typedef std::ostream_iterator<double> oi_t;
typedef std::istream_iterator<double> ii_t;

int main () {

  // generate some data to test
  std::vector<double> vd;
  for (int i = 0; i < 20; i++)
  {
    double d = rand() / 1000000.0;
    vd.push_back(d);
  }

  // perform output to a binary file
  ofstream output ("temp.bin", ios::binary);
  copy (vd.begin(), vd.end(), oi_t(output, (char *)NULL));
  output.close();

  // input from the binary file to a container
  std::vector<double> vi;
  ifstream input ("temp.bin", ios::binary);
  ii_t ii(input);
  copy (ii, ii_t(), back_inserter(vi));
  input.close();

  // output data to screen to verify/compare the results
  for (size_t i = 0; i < vd.size(); i++)
    printf ("%8.4f  %8.4f\n", vd[i], vi[i]);

  printf ("vd.size() = %zu\tvi.size() = %zu\n", vd.size(), vi.size());
  return 0;
}

The resulting output is below and, as far as I can tell, has two problems:

1804.2894  1804.2985
846.9309    0.9312
1681.6928    0.6917
1714.6369    0.6420
1957.7478    0.7542
424.2383    0.2387
719.8854    0.8852
1649.7605    0.7660
596.5166    0.5171
1189.6414    0.6410
1025.2024    0.2135
1350.4900    0.4978
783.3687    0.3691
1102.5201    0.5220
2044.8978    0.9197
1967.5139    0.5114
1365.1805    0.1815
1540.3834    0.3830
304.0892    0.0891
1303.4557    0.4600
vd.size() = 20  vi.size() = 20

1) Every double read back from the file, except the first, is missing the digits before the decimal point.
2) The data is mangled at the 3rd decimal place (or earlier), so some seemingly arbitrary error is being introduced.

Any help would be appreciated. (I would love for someone to point me to a previous post about this, as my own search came up short.)

Comments (4)

猫腻 2024-08-20 20:53:54

For question 1), you need to specify a separator (for example, a space). Without one the numbers are written back to back, so the integer part of each number ends up glued to the fractional part of the previous one. Needing a cast together with NULL is generally a sign that something is wrong in C++. That should have been a hint ;)

copy (vd.begin(), vd.end(), oi_t(output, " ")); 

For question 2), raise the precision of the output stream before writing:

#include <iomanip>
output << setprecision(9);
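
To make the two fixes concrete, here is a minimal sketch of a text round trip through an in-memory string instead of temp.bin. The helper names to_text and from_text are made up for this illustration. It also assumes you want the values to come back bit-for-bit identical, which takes std::numeric_limits<double>::max_digits10 significant digits (17 for IEEE-754 doubles) rather than 9.

```cpp
#include <algorithm>
#include <cassert>
#include <iomanip>
#include <iterator>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: write the values as text, a space after each one,
// with enough significant digits to recover exactly the same double.
std::string to_text(const std::vector<double>& values)
{
    std::ostringstream out;
    out << std::setprecision(std::numeric_limits<double>::max_digits10);
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<double>(out, " "));
    return out.str();
}

// Hypothetical helper: parse whitespace-separated doubles back into a vector.
std::vector<double> from_text(const std::string& text)
{
    std::istringstream in(text);
    std::vector<double> values;
    std::copy(std::istream_iterator<double>(in),
              std::istream_iterator<double>(),
              std::back_inserter(values));
    // Debug check: parsing should stop at end of input, not at a bad token.
    assert(in.eof());
    return values;
}
```

With the separator in place each number is parsed on its own, so from_text(to_text(vd)) should compare equal to vd. Note that the file is still text, not binary, even when it is opened with ios::binary.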

累赘 2024-08-20 20:53:54

To write binary data using std::copy(), I would do this:

template<typename T>
struct oi_t
{
  // Iterator traits, spelled out by hand (std::iterator is deprecated since C++17)
  typedef std::output_iterator_tag iterator_category;
  typedef void value_type;
  typedef void difference_type;
  typedef void pointer;
  typedef void reference;

  explicit oi_t(std::ostream& str)
    :m_str(&str)
  {}
  oi_t& operator++()   {return *this;}  // increment does not do anything.
  oi_t& operator++(int){return *this;}
  oi_t& operator*()    {return *this;}  // Dereference returns a reference to this
                                       // So that when the assignment is done we
                                       // actually write the data from this class
  oi_t& operator=(T const& data)
  {
    // Write the data in a binary format
    m_str->write(reinterpret_cast<char const*>(&data),sizeof(T));
    return *this;
  }

  private:
    std::ostream*   m_str;  // a pointer, so the iterator stays copy-assignable
};

Thus the call to std::copy is:

copy (vd.begin(), vd.end(), oi_t<double>(output));

The input iterator is slightly more complicated as we have to test for the end of the stream.

template<typename T>
struct ii_t
{
  // Iterator traits, spelled out by hand (std::iterator is deprecated since C++17)
  typedef std::input_iterator_tag iterator_category;
  typedef T                       value_type;
  typedef std::ptrdiff_t          difference_type;
  typedef T const*                pointer;
  typedef T const&                reference;

  // Reading iterator: fetch the first value straight away.
  explicit ii_t(std::istream& str)
    :m_str(&str), m_value()
  { read(); }
  // Default-constructed iterator is the end-of-stream iterator.
  ii_t()
    :m_str(NULL), m_value()
  {}
  ii_t& operator++()   {read(); return *this;}  // increment reads the next value.
  ii_t  operator++(int){ii_t old(*this); read(); return old;}
  T const& operator*() const {return m_value;}  // the value read most recently
  // Once a read fails the iterator drops its stream pointer, which makes it
  // compare equal to the end-of-stream iterator.
  bool operator==(ii_t const& rhs) const {return m_str == rhs.m_str;}
  bool operator!=(ii_t const& rhs) const {return !(*this == rhs);}

  private:
    // Reading one value ahead means a failed read is noticed before a stale
    // value is handed out (testing good() after the fact copies one extra element).
    void read()
    {
      if (m_str && !m_str->read(reinterpret_cast<char*>(&m_value),sizeof(T)))
        m_str = NULL;
    }
    std::istream*   m_str;
    T               m_value;
};

The call to read the input is now:

ii_t<double> ii(input);
copy (ii, ii_t<double>(), back_inserter(vi));
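
As a rough end-to-end check of the iterator approach, the sketch below runs it against an in-memory std::stringstream standing in for temp.bin. The names bin_out_iter, bin_in_iter and binary_round_trip are made up for this illustration. This variant of the input iterator reads one value ahead, so the end of the stream is noticed before a value is handed out. The bytes are written in the machine's native layout, so treat the result as non-portable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <vector>

// Output iterator that writes the raw bytes of each assigned value.
template <typename T>
class bin_out_iter {
public:
    typedef std::output_iterator_tag iterator_category;
    typedef void value_type;
    typedef void difference_type;
    typedef void pointer;
    typedef void reference;

    explicit bin_out_iter(std::ostream& s) : str_(&s) {}
    bin_out_iter& operator*() { return *this; }
    bin_out_iter& operator++() { return *this; }
    bin_out_iter& operator++(int) { return *this; }
    bin_out_iter& operator=(const T& v) {
        str_->write(reinterpret_cast<const char*>(&v), sizeof(T));
        return *this;
    }
private:
    std::ostream* str_;
};

// Input iterator that reads sizeof(T) raw bytes per step, one value ahead.
template <typename T>
class bin_in_iter {
public:
    typedef std::input_iterator_tag iterator_category;
    typedef T value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const T* pointer;
    typedef const T& reference;

    bin_in_iter() : str_(0), value_() {}  // end-of-stream iterator
    explicit bin_in_iter(std::istream& s) : str_(&s), value_() { read(); }
    const T& operator*() const { return value_; }
    bin_in_iter& operator++() { read(); return *this; }
    bin_in_iter operator++(int) { bin_in_iter old(*this); read(); return old; }
    bool operator==(const bin_in_iter& rhs) const { return str_ == rhs.str_; }
    bool operator!=(const bin_in_iter& rhs) const { return !(*this == rhs); }
private:
    // A failed read turns this iterator into the end-of-stream iterator.
    void read() {
        if (str_ && !str_->read(reinterpret_cast<char*>(&value_), sizeof(T)))
            str_ = 0;
    }
    std::istream* str_;
    T value_;
};

// Write the values as raw bytes, then read them back.
std::vector<double> binary_round_trip(const std::vector<double>& values)
{
    std::stringstream buf(std::ios::in | std::ios::out | std::ios::binary);
    std::copy(values.begin(), values.end(), bin_out_iter<double>(buf));
    // Every value occupies exactly sizeof(double) bytes, with no separators.
    assert(buf.str().size() == values.size() * sizeof(double));

    std::vector<double> result;
    std::copy(bin_in_iter<double>(buf), bin_in_iter<double>(),
              std::back_inserter(result));
    return result;
}
```

Calling binary_round_trip(vd) should give back a vector equal to vd, with the same number of elements and no precision loss, since no text formatting is involved.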

别理我 2024-08-20 20:53:54

You could set the precision using setprecision, as Tristram pointed out, and you do need a delimiter. See cppreference for how the ostream_iterator operator= works. No floating-point format is set by default, so you will need to set it on the output stream:

ofstream output ("temp.bin", ios::binary);
output.flags(ios_base::fixed);  //or output << fixed;
copy(vd.begin(), vd.end(), oi_t(output, " "));
output.close();

I would tend to favor using fixed to eliminate precision problems. There have been many cases where someone thought "we'll never need more than 5 digits", so they hardcoded a precision everywhere. Those are costly bugs to have to correct.
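
For reference, a small sketch of how the format flag and the precision interact. The helper format_double is made up for this illustration. In the default floating-point format the precision counts significant digits, while under fixed it counts digits after the decimal point (6 unless changed), so fixed on its own does not guarantee an exact round trip.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format one double the way an ostream would,
// optionally in fixed notation, with the given precision.
std::string format_double(double value, bool use_fixed, int precision)
{
    assert(precision >= 0);  // a negative precision makes no sense here
    std::ostringstream out;
    if (use_fixed)
        out << std::fixed;   // precision now means digits after the point
    out << std::setprecision(precision) << value;
    return out.str();
}
```

For example, format_double(1804.2894, false, 6) gives "1804.29" (what the original program wrote), while format_double(1804.2894, true, 6) gives "1804.289400".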

自由如风 2024-08-20 20:53:54

I have come up with a better design for binary I/O. The fundamental approach is to have three methods: size_on_stream, load_from_buffer, and store_to_buffer. These go into an interface class so that all classes that support binary I/O inherit it.

The size_on_stream method returns the size of the data as transmitted on the stream. Generally, this does not include padding bytes. This should be recursive such that a class calls the method on all of its members.

The load_from_buffer method is passed a reference to a pointer to a buffer (unsigned char * &). The method loads the object's data members from the buffer, incrementing the pointer after every member (or incrementing once after all the members).

The store_to_buffer method stores data into the given buffer and increments the pointer.

The client calls size_on_stream to determine the size of all the data. A buffer of this size is dynamically allocated. A second pointer to this buffer is passed to store_to_buffer, which stores the object's members into the buffer and advances that pointer. Finally, the client uses a binary write (fwrite or std::ostream::write) to transfer the buffer to the stream.

Some of the benefits of this technique are packing, abstraction, and block I/O. The objects pack their members into the buffer. The process for writing into the buffer is hidden from the client. The client can use block I/O functions, which are generally more efficient than transferring individual members.

This design is also more portable, as the objects can take care of endianness. There is a simple method for this, which is left up to the reader.

I have expanded this concept to incorporate POD (Plain Old Data) types as well, which is left as an exercise for the reader.
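
A minimal sketch of what such an interface could look like. Only the three method names come from the description above. The class names Binary_Io and Sample, the exact signatures, and the helpers write_object, read_object and round_trip are assumptions made for illustration. Members are copied in native byte order here, so the endianness handling mentioned above is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <ios>
#include <sstream>
#include <vector>

// Hypothetical interface class for objects that support binary I/O.
class Binary_Io {
public:
    virtual ~Binary_Io() {}
    virtual std::size_t size_on_stream() const = 0;
    virtual void store_to_buffer(unsigned char*& p) const = 0;
    virtual void load_from_buffer(unsigned char*& p) = 0;
};

// Example class: packs its two members without padding bytes.
struct Sample : Binary_Io {
    int id;
    double value;
    Sample(int i = 0, double v = 0.0) : id(i), value(v) {}

    std::size_t size_on_stream() const { return sizeof id + sizeof value; }

    void store_to_buffer(unsigned char*& p) const {
        std::memcpy(p, &id, sizeof id);       p += sizeof id;
        std::memcpy(p, &value, sizeof value); p += sizeof value;
    }
    void load_from_buffer(unsigned char*& p) {
        std::memcpy(&id, p, sizeof id);       p += sizeof id;
        std::memcpy(&value, p, sizeof value); p += sizeof value;
    }
};

// Client side: size the buffer, let the object fill it, write it as one block.
void write_object(std::ostream& out, const Binary_Io& obj)
{
    std::vector<unsigned char> buffer(obj.size_on_stream());
    unsigned char* p = buffer.data();
    obj.store_to_buffer(p);
    assert(p == buffer.data() + buffer.size());  // object wrote what it promised
    out.write(reinterpret_cast<const char*>(buffer.data()),
              static_cast<std::streamsize>(buffer.size()));
}

// Client side: read one block, then let the object unpack itself.
bool read_object(std::istream& in, Binary_Io& obj)
{
    std::vector<unsigned char> buffer(obj.size_on_stream());
    if (!in.read(reinterpret_cast<char*>(buffer.data()),
                 static_cast<std::streamsize>(buffer.size())))
        return false;
    unsigned char* p = buffer.data();
    obj.load_from_buffer(p);
    return true;
}

// In-memory round trip, with a stringstream standing in for a real file.
Sample round_trip(const Sample& original)
{
    std::stringstream s(std::ios::in | std::ios::out | std::ios::binary);
    write_object(s, original);
    Sample copy;
    bool ok = read_object(s, copy);
    assert(ok);
    (void)ok;  // silence the unused-variable warning when NDEBUG is defined
    return copy;
}
```

A class with nested members would call size_on_stream, store_to_buffer and load_from_buffer on each member in turn, which is where the recursion described above comes in.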
