Tokyo Cabinet and variable-size C++ objects

Posted on 2024-08-01 22:12:16

I'm building a system, with C++, that uses Tokyo Cabinet (original API in C). The problem is I want to store a class such as:

    class Entity {
      public:
        string entityName;
        short type;
        vector<another_struct> x;
        vector<another_struct> y;
        vector<string> z;
    };

The problem is that vectors and strings have variable length. When I pass a void* (my object) to Tokyo Cabinet so it can store it, I also have to pass the size of the object in bytes. But that can't be trivially done.

What is the best way to determine the number of bytes in an object? Or what is the best way to store variable-length objects in Tokyo Cabinet?

I'm already considering looking for serialization libs.

Thanks

Comments (5)

感悟人生的甜 2024-08-08 22:12:16

Yes, you'd be better off using Boost.Serialization or protobuf to serialize the object and then put it into Tokyo Cabinet.

長街聽風 2024-08-08 22:12:16

I use Protocol Buffers to store my C++ objects as Tokyo Cabinet data values.

In Protocol Buffers, you specify the structure and then generate the marshalling/unmarshalling code for C++, Python, and Java. In your case the .proto file would look like:

    message Entity {
        optional string entityName = 1;
        optional int32 type = 2;  // protobuf has no short
        repeated AnotherStruct x = 3;
        repeated AnotherStruct y = 4;
        repeated string z = 5;
    }

Especially if the database exists over a long timespan, a format that can be updated, e.g. to cover new fields, is very valuable. Compared with XML and similar formats, protobuf is quite fast.

阳光的暖冬 2024-08-08 22:12:16

You cannot portably treat a non-POD C++ struct/class as a raw sequence of bytes - this is regardless of use of pointers or std::string and std::vector, though the latter virtually guarantee that it will break in practice. You need to serialize the object into a sequence of chars first - I'd suggest Boost.Serialization for a good, flexible cross-platform serialization framework.
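To make "serialize the object into a sequence of chars" concrete, here is a minimal hand-rolled sketch rather than Boost.Serialization itself. It assumes a hypothetical layout for `another_struct` (the question does not show it), uses 32-bit length prefixes, and writes values in host byte order, so the resulting bytes are only readable on machines with the same endianness and floating-point format. The helper names in `detail` are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical layout; the question does not show another_struct.
struct another_struct {
    std::int32_t id;
    double value;
};

struct Entity {
    std::string entityName;
    short type = 0;
    std::vector<another_struct> x;
    std::vector<another_struct> y;
    std::vector<std::string> z;
};

namespace detail {

// Appends the raw bytes of a trivially copyable value (host byte order).
template <typename T>
void put(std::string& out, const T& v) {
    out.append(reinterpret_cast<const char*>(&v), sizeof(T));
}

// Reads a trivially copyable value and advances pos; throws on truncated input.
template <typename T>
T get(const std::string& in, std::size_t& pos) {
    if (in.size() - pos < sizeof(T)) throw std::runtime_error("truncated buffer");
    T v;
    std::memcpy(&v, in.data() + pos, sizeof(T));
    pos += sizeof(T);
    return v;
}

// Strings are stored as a 32-bit length followed by the characters.
void put_string(std::string& out, const std::string& s) {
    put<std::uint32_t>(out, static_cast<std::uint32_t>(s.size()));
    out.append(s);
}

std::string get_string(const std::string& in, std::size_t& pos) {
    const std::uint32_t len = get<std::uint32_t>(in, pos);
    if (in.size() - pos < len) throw std::runtime_error("truncated buffer");
    std::string s(in, pos, len);
    pos += len;
    return s;
}

// Structs are written field by field so padding bytes never reach the buffer.
void put_structs(std::string& out, const std::vector<another_struct>& v) {
    put<std::uint32_t>(out, static_cast<std::uint32_t>(v.size()));
    for (const another_struct& a : v) {
        put<std::int32_t>(out, a.id);
        put<double>(out, a.value);
    }
}

std::vector<another_struct> get_structs(const std::string& in, std::size_t& pos) {
    const std::uint32_t n = get<std::uint32_t>(in, pos);
    std::vector<another_struct> v;
    for (std::uint32_t i = 0; i < n; ++i) {
        another_struct a;
        a.id = get<std::int32_t>(in, pos);
        a.value = get<double>(in, pos);
        v.push_back(a);
    }
    return v;
}

}  // namespace detail

// Flattens an Entity into one contiguous byte string.
std::string serialize(const Entity& e) {
    std::string out;
    detail::put_string(out, e.entityName);
    detail::put<std::int16_t>(out, static_cast<std::int16_t>(e.type));
    detail::put_structs(out, e.x);
    detail::put_structs(out, e.y);
    detail::put<std::uint32_t>(out, static_cast<std::uint32_t>(e.z.size()));
    for (const std::string& s : e.z) detail::put_string(out, s);
    return out;
}

// Rebuilds an Entity from bytes produced by serialize().
Entity deserialize(const std::string& in) {
    std::size_t pos = 0;
    Entity e;
    e.entityName = detail::get_string(in, pos);
    e.type = detail::get<std::int16_t>(in, pos);
    e.x = detail::get_structs(in, pos);
    e.y = detail::get_structs(in, pos);
    const std::uint32_t n = detail::get<std::uint32_t>(in, pos);
    for (std::uint32_t i = 0; i < n; ++i) e.z.push_back(detail::get_string(in, pos));
    return e;
}
```

The string returned by `serialize` answers the size question directly: its `data()` and `size()` are the pointer and byte count you would hand to a Tokyo Cabinet store call such as `tchdbput`. A library like Boost.Serialization or protobuf adds versioning and portability that this sketch does not attempt.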

泡沫很甜 2024-08-08 22:12:16

I think it is worse than that. The actual storage for the vectors is not contiguous with the rest of the object. A std::vector keeps its data in a separate allocation on the heap (so that it can grow when needed), and std::string generally does the same. You'll need an API that understands C++ and the STL.

In short, passing the raw object isn't going to work.
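A small sketch can illustrate the point about separate allocations. It uses a simplified stand-in for the question's `Entity` (the `another_struct` members are left out), and the two helper functions are hypothetical names introduced only for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the question's Entity (another_struct members omitted).
struct Entity {
    std::string entityName;
    short type = 0;
    std::vector<std::string> z;
};

// Size of the object itself; fixed at compile time, whatever the object holds.
std::size_t shallow_size(const Entity& e) {
    return sizeof(e);
}

// Character data the entity owns, ignoring unused capacity and allocator overhead.
std::size_t owned_payload_bytes(const Entity& e) {
    std::size_t total = e.entityName.size();
    for (const std::string& s : e.z) total += s.size();
    return total;
}
```

However much data is added to `entityName` or `z`, `shallow_size` returns the same number, so copying `sizeof(Entity)` bytes would capture internal pointers and bookkeeping rather than the strings and elements themselves.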

入画浅相思 2024-08-08 22:12:16

I've had a similar problem although I use HDF5. In my case there is an additional requirement that I can read sub-parts of the object and so serialization is not really an option.

HDF is very much like a large array where an index is used to access the data. The solution that I use is to add a "previous index" to the table that stores the another_struct type.

Taking your example, if 'x' and 'y' had 3 and 2 elements respectively, then the data would be stored as follows:

[ index ] [ another_struct data here ] [ previous_index ]
[   0   ] [       x data 0           ] [ -1 ]
[   1   ] [       x data 1           ] [  0 ]
[   2   ] [       x data 2           ] [  1 ]
[   3   ] [       y data 0           ] [ -1 ]
[   4   ] [       y data 1           ] [  3 ]

And then, in the main Entity table, the last index added is stored:

[ index ] [ Entity data here ] [ x ] [  y ]
[   0   ] [        ...       ] [ 2 ] [  4 ]

I'm not that familiar with how Tokyo Cabinet works, so although this approach should work, it may not be optimal for that data format. Ideally, if you can have pointers to real Tokyo Cabinet objects, then rather than using indexes as I have above, you could store those pointers.
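The "previous index" layout can be sketched in memory as follows. This is only an illustration of the chaining idea, not HDF5 or Tokyo Cabinet code: `Row`, `append`, and `read_list` are hypothetical names, and an `int` payload stands in for the `another_struct` data.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One row of the element table; payload stands in for the another_struct data.
struct Row {
    int payload;
    int previous_index;  // -1 marks the first element of a list
};

// Appends a value to the list whose current last row is `last` (-1 if empty)
// and returns the new last index, which the owning Entity row would store.
int append(std::vector<Row>& table, int last, int payload) {
    table.push_back(Row{payload, last});
    return static_cast<int>(table.size()) - 1;
}

// Walks the chain backwards from `last` and returns payloads in insertion order.
std::vector<int> read_list(const std::vector<Row>& table, int last) {
    std::vector<int> out;
    for (int i = last; i != -1; i = table[static_cast<std::size_t>(i)].previous_index) {
        out.push_back(table[static_cast<std::size_t>(i)].payload);
    }
    std::reverse(out.begin(), out.end());
    return out;
}
```

Appending three values for 'x' and then two for 'y' reproduces the tables above: the Entity row ends up holding 2 for 'x' and 4 for 'y', and each list is recovered by following `previous_index` until it reaches -1. The cost is that reading a list means one lookup per element.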
