Using reinterpret_cast to save a struct or class to a file

Posted on 2025-02-03 21:11:35

This is something the professor showed us in his scripts. I have not used this method in any code I have written.

Basically, we take a class, or struct, and reinterpret_cast it and save off the entire struct like so:

#include <fstream>
#include <string>

struct Account
{
    Account()
    {   }
    Account(std::string one, std::string two)
        : login_(one), pass_(two)
    {   }

private:
    std::string login_;
    std::string pass_;
};

int main()
{
    Account *acc = new Account("Christian", "abc123");

    std::ofstream out("File.txt", std::ios::binary);
    // Writes the raw bytes of the Account object, whatever they happen to be.
    out.write(reinterpret_cast<char*>(acc), sizeof(Account));
    out.close();

    delete acc;
}

This produces the output (in the file)

ÍÍÍÍChristian ÍÍÍÍÍÍ              ÍÍÍÍabc123 ÍÍÍÍÍÍÍÍÍ     

I'm confused. Does this method actually work, or does it cause UB because magical things happen within classes and structs that are at the whims of individual compilers?

Comments (7)

独自唱情﹋歌 2025-02-10 21:11:35

It doesn't actually work, but it also does not cause undefined behavior.

In C++ it is legal to reinterpret any object as an array of char, so there is no undefined behavior here.

The results, however, are usually only usable if the class is POD (effectively, if the class is a simple C-style struct) and self-contained (that is, the struct doesn't have pointer data members).

Here, Account is not POD because it has std::string members. The internals of std::string are implementation-defined, but it is not POD and it usually has pointers that refer to some heap-allocated block where the actual string is stored (in your specific example, the implementation is using a small-string optimization where the value of the string is stored in the std::string object itself).

There are a few issues:

  • You aren't always going to get the results you expect. If you had a longer string, the std::string would use a buffer allocated on the heap to store the string and so you will end up just serializing the pointer, not the pointed-to string.

  • You can't actually use the data you've serialized here. You can't just reinterpret the data as an Account and expect it to work, because the std::string constructors would not get called.

In short, you cannot use this approach for serializing complex data structures.
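
The point about longer strings can be checked with a small sketch. The helper below is hypothetical (the name `stored_inline` is not from the thread); it only asks whether the character buffer of a std::string lies inside the bytes of the string object itself. Whether a short string is stored inline depends on the standard library in use, so only the long-string case is predictable.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: returns true when the character buffer of `s` starts
// inside the bytes of the std::string object itself (small-string
// optimization), and false when it lives elsewhere, typically on the heap.
bool stored_inline(const std::string& s)
{
    const auto obj_begin = reinterpret_cast<std::uintptr_t>(&s);
    const auto obj_end   = obj_begin + sizeof(std::string);
    const auto data      = reinterpret_cast<std::uintptr_t>(s.data());
    return data >= obj_begin && data < obj_end;
}
```

For a 1000-character string, `stored_inline` returns false: writing `sizeof(std::string)` bytes of that object to a file saves a pointer and some bookkeeping, not the text.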

情栀口红 2025-02-10 21:11:35

It's not undefined. Rather, it's platform-dependent or implementation-defined behavior. This is, in general, bad code, because differing versions of the same compiler, or even different switches on the same compiler, can break your save file format.

花桑 2025-02-10 21:11:35

This could work depending on the contents of the struct, and the platform on which the data is read back. This is a risky, non-portable hack which your teacher should not be propagating.

Do you have pointers or ints in the struct? Pointers will be invalid in the new process when read back, and the size and byte order of int are not the same on all machines (to name but two show-stopping problems with this approach). Anything that's pointed to as part of an object graph will not be handled. Structure packing could be different on the target machine (32-bit vs 64-bit) or could even change with compiler options on the same hardware, making sizeof(Account) unreliable as the size of the data to read back.

For a better solution, look at a serialization library which handles those issues for you. Boost.Serialization is a good example.

"Here, we use the term "serialization" to mean the reversible deconstruction of an arbitrary set of C++ data structures to a sequence of bytes. Such a system can be used to reconstitute an equivalent structure in another program context. Depending on the context, this might [be] used [to] implement object persistence, remote parameter passing or other facility."

Google Protocol Buffers also works well for simple object hierarchies.
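
The integer-format problem mentioned above can be avoided without any library by fixing the on-disk width and byte order yourself. This is a minimal sketch, not how Boost.Serialization or Protocol Buffers do it; the function names `encode_u32_le` and `decode_u32_le` are made up for illustration.

```cpp
#include <array>
#include <cstdint>

// Encode a 32-bit value as four bytes, least significant byte first,
// regardless of the host's native byte order or the size of int.
std::array<unsigned char, 4> encode_u32_le(std::uint32_t value)
{
    return {
        static_cast<unsigned char>(value & 0xFFu),
        static_cast<unsigned char>((value >> 8) & 0xFFu),
        static_cast<unsigned char>((value >> 16) & 0xFFu),
        static_cast<unsigned char>((value >> 24) & 0xFFu),
    };
}

// Rebuild the value from the same four bytes; the casts keep the shifts
// in unsigned 32-bit arithmetic.
std::uint32_t decode_u32_le(const std::array<unsigned char, 4>& bytes)
{
    return  static_cast<std::uint32_t>(bytes[0])
         | (static_cast<std::uint32_t>(bytes[1]) << 8)
         | (static_cast<std::uint32_t>(bytes[2]) << 16)
         | (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

Because the byte order is spelled out with shifts rather than taken from memory, a file written on one machine decodes to the same value on another.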

一抹苦笑 2025-02-10 21:11:35

It's no substitute for proper serialization. Consider the case of any complex type that contains pointers - if you save the pointers to a file, when you load them up later, they're not going to point to anything meaningful.

Additionally, it's likely to break if the code changes, or even if it's recompiled with different compiler options.

So it's really only useful for short-term storage of simple types - and in doing so, it takes up way more space than necessary for that task.

岁月蹉跎了容颜 2025-02-10 21:11:35

This method, if it works at all, is far from robust. It is much better to decide on some "serialized" form, whether it is binary, text, XML, etc., and write that out.

The key here: You need a function/code to reliably convert your class or struct to/from a series of bytes. reinterpret_cast does not do this, as the exact bytes in memory used to represent the class or struct can change for things like padding, order of members, etc.
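
One possible shape for such a pair of functions is sketched below: each string is written as a 4-byte little-endian length followed by its characters. The format, the function names (`write_string`, `read_string`, `save`, `load`) and the public members of this simplified `Account` are assumptions for illustration, not something from the question.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Simplified stand-in for the Account in the question (public members).
struct Account
{
    std::string login;
    std::string pass;
};

// Write a string as a 4-byte little-endian length followed by its bytes.
// Assumes the string is shorter than 2^32 bytes.
void write_string(std::ostream& out, const std::string& s)
{
    const auto n = static_cast<std::uint32_t>(s.size());
    const unsigned char len[4] = {
        static_cast<unsigned char>(n & 0xFFu),
        static_cast<unsigned char>((n >> 8) & 0xFFu),
        static_cast<unsigned char>((n >> 16) & 0xFFu),
        static_cast<unsigned char>((n >> 24) & 0xFFu)
    };
    out.write(reinterpret_cast<const char*>(len), sizeof(len));
    out.write(s.data(), static_cast<std::streamsize>(s.size()));
}

// Read a string written by write_string; returns false on a short read.
// A real loader should also reject absurd lengths before resizing.
bool read_string(std::istream& in, std::string& s)
{
    unsigned char len[4];
    if (!in.read(reinterpret_cast<char*>(len), sizeof(len)))
        return false;
    const std::uint32_t n = static_cast<std::uint32_t>(len[0])
                          | (static_cast<std::uint32_t>(len[1]) << 8)
                          | (static_cast<std::uint32_t>(len[2]) << 16)
                          | (static_cast<std::uint32_t>(len[3]) << 24);
    s.resize(n);
    if (n == 0)
        return true;
    return static_cast<bool>(in.read(&s[0], static_cast<std::streamsize>(n)));
}

void save(std::ostream& out, const Account& acc)
{
    write_string(out, acc.login);
    write_string(out, acc.pass);
}

bool load(std::istream& in, Account& acc)
{
    return read_string(in, acc.login) && read_string(in, acc.pass);
}
```

Calling `save` on a binary `std::stringstream` (or a `std::ofstream`) and then `load` on the same bytes gives back equal strings whatever their length, because the characters themselves are written rather than the internals of std::string.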

终止放荡 2025-02-10 21:11:35

No.

In order for it to work, the structure must be a POD (plain old data: only simple data members and POD data members, no virtual functions... probably some other restrictions which I can't remember).

So if you wanted to do that, you'd need a struct like this:

struct Account {
    char login[20];
    char password[20];
};

Note that std::string's not a POD, so you'd need plain arrays.

Still, not a good approach for you. Keyword: "serialization" :).
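
For completeness, here is a sketch of the raw-dump technique applied to a struct of that shape. The struct is renamed `PodAccount` here, and the helpers `make_account`, `write_raw` and `read_raw` are illustrative names. The result is still only readable by a program built with the same layout.

```cpp
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

struct PodAccount
{
    char login[20];
    char password[20];
};

// Copying raw bytes in and out is only valid for trivially copyable types.
static_assert(std::is_trivially_copyable<PodAccount>::value,
              "PodAccount must be trivially copyable for a raw dump");

PodAccount make_account(const char* login, const char* password)
{
    PodAccount acc{};  // zero-initialize so unused bytes are deterministic
    // Copy at most 19 characters; the last byte stays '\0'.
    std::strncpy(acc.login, login, sizeof(acc.login) - 1);
    std::strncpy(acc.password, password, sizeof(acc.password) - 1);
    return acc;
}

void write_raw(std::ostream& out, const PodAccount& acc)
{
    out.write(reinterpret_cast<const char*>(&acc), sizeof(acc));
}

// Returns false if fewer than sizeof(PodAccount) bytes could be read.
bool read_raw(std::istream& in, PodAccount& acc)
{
    return static_cast<bool>(
        in.read(reinterpret_cast<char*>(&acc), sizeof(acc)));
}
```

Names longer than 19 characters are silently truncated by `make_account`, which is one of the costs of fixed-size arrays.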

Saygoodbye 2025-02-10 21:11:35

Some versions of std::string don't actually use dynamic memory when the string is small. Instead, they store the characters inside the string object itself.

Think of this:

 struct SimpleString
 {
     char*    begin;        // beginning of string
     char*    end;          // end of string
     char*    allocEnd;     // end of allocated buffer end <= allocEnd
     int*     shareCount;   // Strings were often copy-on-write before C++11,
                            // so the implementation has to track how many
                            // string objects share this buffer
 };

Now, on a 64-bit system each pointer is 8 bytes, so a string of fewer than 32 bytes could fit into the same structure without allocating a buffer.

 struct CompressedString
 {
     char buffer[sizeof(SimpleString)];
 };
 struct OptString
 {
     int      type;        // Normal / Compressed
     union
     {
         SimpleString     simple;
         CompressedString compressed;
     };
 };

So this is what I believe is happening above.
A very efficient string implementation is being used, which lets you dump the object to a file without worrying about pointers (because these short std::string objects hold their characters inline instead of pointing to a heap buffer).

Obviously this is not portable as it depends on the implementation details of std::string.

So it's an interesting trick, but not portable (and liable to break easily without some compile-time checks).
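
What such compile-time checks might look like is sketched below with standard type traits. The helper `raw_dump_ok` and the two example structs are hypothetical names. Note that these traits do not detect pointer members, so passing the check is necessary for a raw dump but not sufficient.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: true only for types whose bytes may be copied verbatim.
// It cannot tell whether the type contains raw pointers.
template <typename T>
constexpr bool raw_dump_ok()
{
    return std::is_trivially_copyable<T>::value
        && std::is_standard_layout<T>::value;
}

struct PlainAccount
{
    char login[20];
    char password[20];
};

struct StringAccount
{
    std::string login;
    std::string pass;
};

// The build fails here if someone later adds a std::string member.
static_assert(raw_dump_ok<PlainAccount>(),
              "PlainAccount can no longer be written as raw bytes");

// std::string has non-trivial copy operations, so this type is rejected.
static_assert(!raw_dump_ok<StringAccount>(),
              "std::string members rule out a raw dump");

// Pin the record size too, assuming the compiler adds no padding to a
// struct made only of char arrays.
static_assert(sizeof(PlainAccount) == 40,
              "on-disk record size changed");
```

With checks like these next to the struct definition, a change that silently breaks the file format becomes a compile error instead.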
