Reading and writing a vector of structs to a file

Posted 2024-10-04 11:43:58

I've read a few posts on Stack Overflow and a number of other sites about writing vectors to files. I've implemented what I feel should work, but I'm having some trouble. One of the data members in the struct is a std::string, and when the vector is read back in, that data is lost. Also, after the first write, subsequent runs cause a malloc error. How can I modify the code below so that I can save the vector to a file and read it back in when the program launches again? Currently the read is done in the constructor and the write in the destructor of a class whose only data member is the vector, but which has methods to manipulate that vector.

Here is the gist of my read / write methods. Assuming vector<element> elements...

Read:

ifstream infile;
infile.open("data.dat", ios::in | ios::binary);
infile.seekg (0, ios::end);
elements.resize(infile.tellg()/sizeof(element));
infile.seekg (0, ios::beg);
infile.read( (char *) &elements[0], elements.capacity()*sizeof(element));
infile.close();

Write:

ofstream outfile;
outfile.open("data.dat", ios::out | ios::binary | ios_base::trunc);
elements.resize(elements.size());
outfile.write( (char *) &elements[0], elements.size() * sizeof(element));
outfile.close();

Struct element:

struct element {
int id;
string test;
int other;        
};

Comments (2)

我最亲爱的 2024-10-11 11:43:58

In C++, objects generally cannot be dumped from memory to disk and read back byte-for-byte like that. In particular, your struct element contains a string, which is a non-POD data type, so its raw bytes are not a meaningful representation of its value.

A thought experiment might help clarify this. Your code assumes that all your element values are the same size. What would happen if one of the string test values was longer than what you've assumed? How would your code know what size to use when reading and writing to disk?

You will want to read about serialization for more information about how to handle this.
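To make the thought experiment concrete, here is a minimal sketch. The helper names raw_bytes_written, payload_bytes and check_sizes are made up for this illustration and are not part of the question's code. It compares what a raw write() of one element emits with what the fields actually need.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct element {
    int id;
    std::string test;
    int other;
};

// Bytes the question's raw write() emits per element. sizeof is a
// compile-time constant, so it cannot depend on how long the text is.
std::size_t raw_bytes_written(const element&) {
    return sizeof(element);
}

// Bytes needed to preserve the field contents themselves
// (hypothetical measure, ignoring any length prefix or delimiter).
std::size_t payload_bytes(const element& e) {
    return sizeof e.id + e.test.size() + sizeof e.other;
}

// Self-check: the raw size is the same for an empty and a long string,
// while the real payload grows with the text.
void check_sizes() {
    element a{1, "", 2};
    element b{1, std::string(1000, 'x'), 2};
    assert(raw_bytes_written(a) == raw_bytes_written(b));
    assert(payload_bytes(b) == payload_bytes(a) + 1000);
}
```

Since the raw size never changes, the 1000 characters in the second element cannot be inside the bytes that were written. They live elsewhere, and only the string's internal bookkeeping ends up in the file.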

许仙没带伞 2024-10-11 11:43:58

Your code assumes all the relevant data exists directly inside the vector, whereas strings are fixed-size objects holding pointers that address their variable-size content on the heap. You're basically saving the pointers and not the text. Reading those stale pointers back in is also what leads to the malloc error, since the string destructors then try to free memory they never owned. You should write some string serialisation code, for example:

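bool write_string(std::ostream& os, const std::string& s)
{
    // Length first, as ASCII digits ending in '|', then the raw characters.
    // This is the layout that read_string() expects.
    return (os << s.size() << '|') && os.write(s.data(), s.size());
}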

Then you can write serialisation routines for your struct. There are a few design options. Many people like to declare Binary_IStream / Binary_OStream types that house a std::istream / std::ostream; because they are distinct types, they can be used to create a separate set of serialisation routines, à la:

Binary_OStream& operator<<(Binary_OStream& os, const Some_Class&);

Or, you can just abandon the usual streaming notation when dealing with binary serialisation, and use function call notation instead. Obviously, it's nice to let the same code correctly output both binary serialisation and human-readable serialisation, so the operator-based approach is appealing.
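As a rough sketch of the wrapper idea, assuming a 32-bit length prefix and least-significant-byte-first order (both arbitrary choices made here, along with the demo function name), a Binary_OStream could look something like this:

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper: it only holds a reference to an ordinary stream,
// but being a distinct type it selects its own operator<< overloads.
struct Binary_OStream {
    std::ostream& os;
    explicit Binary_OStream(std::ostream& s) : os(s) {}
};

// Write a 32-bit value as four bytes, least significant byte first.
Binary_OStream& operator<<(Binary_OStream& b, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        b.os.put(static_cast<char>((v >> shift) & 0xFF));
    return b;
}

// Write a string as a 32-bit length followed by the raw characters.
Binary_OStream& operator<<(Binary_OStream& b, const std::string& s) {
    b << static_cast<std::uint32_t>(s.size());
    b.os.write(s.data(), static_cast<std::streamsize>(s.size()));
    return b;
}

// Returns the bytes produced for a two-character string.
std::string binary_demo() {
    std::ostringstream out;
    Binary_OStream b(out);
    b << std::string("hi");
    return out.str();
}
```

The same std::string sent to a plain std::ostream would produce just the two characters; sent through the wrapper it produces six bytes, the four-byte length and then the text. That difference in overload selection is the whole point of the distinct type.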

If you serialise numbers, you need to decide whether to do so in a binary format or in ASCII. With a pure binary format, where portability is required (even between 32-bit and 64-bit builds on the same OS), you may need to make some effort to fix the size of each type (e.g. int32_t or int64_t?) as well as its endianness (e.g. consider network byte order and the ntohl() family of functions). With ASCII you can avoid some of those considerations, but it's variable length and can be slower to write and read. The routines in this answer arbitrarily use ASCII with a '|' terminator for numbers.
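For comparison with the ASCII approach, a portable binary encoding might look roughly like the sketch below. The names write_i32, read_i32 and check_i32 are invented for this example. It fixes the width at 32 bits and the order at most-significant-byte-first (network byte order) by shifting, so it does not depend on the host's endianness or on ntohl().

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write v as exactly four bytes, most significant first.
bool write_i32(std::ostream& os, std::int32_t v) {
    const std::uint32_t u = static_cast<std::uint32_t>(v);
    for (int shift = 24; shift >= 0; shift -= 8)
        os.put(static_cast<char>((u >> shift) & 0xFF));
    return static_cast<bool>(os);
}

// Read four bytes written by write_i32 and rebuild the value.
// Returns false if the stream runs out of data first.
bool read_i32(std::istream& is, std::int32_t& v) {
    std::uint32_t u = 0;
    for (int i = 0; i < 4; ++i) {
        const std::istream::int_type byte = is.get();
        if (byte == std::istream::traits_type::eof())
            return false;
        u = (u << 8) | static_cast<std::uint32_t>(byte);
    }
    // Assumes a two's-complement platform (guaranteed from C++20 onwards).
    v = static_cast<std::int32_t>(u);
    return true;
}

// Self-check: a negative value survives a round trip.
void check_i32() {
    std::stringstream ss;
    std::int32_t back = 0;
    assert(write_i32(ss, -2));
    assert(read_i32(ss, back) && back == -2);
}
```

Every value takes four bytes regardless of magnitude, which is the trade-off against the variable-length ASCII form. When the target is a file, the stream would need to be opened with std::ios::binary so that no newline translation alters the bytes.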

bool write_element(std::ostream& os, const element& e)
{
    return (os << e.id << '|') && write_string(os, e.test) && (os << e.other << '|');
}

And then for your vector:

os << elements.size() << '|';
for (std::vector<element>::const_iterator i = elements.begin();
     i != elements.end(); ++i)
    write_element(os, *i);

To read this back:

std::vector<element> elements;
size_t n;
char c;
if ((is >> n >> c) && c == '|')   // also consume the '|' written after the count
    for (size_t i = 0; i < n; ++i)
    {
        element e;
        if (!read_element(is, e))
            return false; // fail
        elements.push_back(e);
    }

...which needs...

bool read_element(std::istream& is, element& e)
{
    char c;
    return (is >> e.id >> c) && c == '|' &&
           read_string(is, e.test) &&
           (is >> e.other >> c) && c == '|';
}

...and...

bool read_string(std::istream& is, std::string& s)
{
    size_t n;
    char c;
    if ((is >> n >> c) && c == '|')
    {
        s.resize(n);
        // &s[0] is writable in C++11; s.data() is const before C++17.
        return n == 0 || static_cast<bool>(is.read(&s[0], n));
    }
    return false;
}
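Putting the pieces together, the following is one possible self-contained version of the ASCII-with-'|' scheme that compiles as C++11. The names save_elements, load_elements and round_trip_check are assumptions introduced here to wrap the vector loops; the other routines follow the ones above, with the length prefix written as ASCII so that the reader and writer agree.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct element {
    int id;
    std::string test;
    int other;
};

// Layout: <length>|<raw characters>
bool write_string(std::ostream& os, const std::string& s) {
    os << s.size() << '|';
    os.write(s.data(), static_cast<std::streamsize>(s.size()));
    return static_cast<bool>(os);
}

bool read_string(std::istream& is, std::string& s) {
    std::size_t n;
    char c;
    if (!((is >> n >> c) && c == '|'))
        return false;
    s.resize(n);  // note: trusts n, so a corrupt file could request a huge size
    if (n > 0)
        is.read(&s[0], static_cast<std::streamsize>(n));
    return static_cast<bool>(is);
}

// Layout: <id>|<string><other>|
bool write_element(std::ostream& os, const element& e) {
    return (os << e.id << '|') && write_string(os, e.test) && (os << e.other << '|');
}

bool read_element(std::istream& is, element& e) {
    char c;
    return (is >> e.id >> c) && c == '|' &&
           read_string(is, e.test) &&
           (is >> e.other >> c) && c == '|';
}

// Layout: <count>| followed by that many elements.
bool save_elements(std::ostream& os, const std::vector<element>& v) {
    os << v.size() << '|';
    for (const element& e : v)
        if (!write_element(os, e))
            return false;
    return static_cast<bool>(os);
}

bool load_elements(std::istream& is, std::vector<element>& v) {
    std::size_t n;
    char c;
    if (!((is >> n >> c) && c == '|'))
        return false;
    v.clear();
    for (std::size_t i = 0; i < n; ++i) {
        element e;
        if (!read_element(is, e))
            return false;
        v.push_back(e);
    }
    return true;
}

// Self-check: text containing '|' and spaces survives a round trip,
// because the string is read by length rather than by delimiter.
void round_trip_check() {
    std::vector<element> in, out;
    in.push_back(element{1, "hello|world 42", -7});
    std::stringstream ss;
    assert(save_elements(ss, in) && load_elements(ss, out));
    assert(out.size() == 1 && out[0].test == "hello|world 42");
}
```

In the question's class, the constructor could call load_elements on a std::ifstream and the destructor save_elements on a std::ofstream, both opened with std::ios::binary so the raw string bytes are not subject to newline translation. A missing or empty file simply makes load_elements return false, which the constructor can treat as "start with an empty vector".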