Reference semantics in Google Protocol Buffers

Posted on 2024-11-09 01:02:07

I have a slightly peculiar program that deals with cases very similar to this
(in C#-like pseudocode):

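class CDataSet
{
    public int m_nID;
    public string m_sTag;
    public float m_fValue;

    public void PrintData()
    {
        // Blah blah (body omitted)
    }
}

class CDataItem
{
    public int m_nID;
    public string m_sTag;
    public CDataSet m_refData;    // set when this item adds a new data set
    public CDataSet m_refParent;  // set when this item points to an existing data set

    public void Print()
    {
        // Print the item's own data set if it has one, otherwise the existing one.
        if (m_refData == null)
        {
            m_refParent.PrintData();
        }
        else
        {
            m_refData.PrintData();
        }
    }
}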

Members m_refData and m_refParent are initialized to null and used as follows:
m_refData -> Used when a new data set is added
m_refParent -> Used to point to an existing data set.
A new data set is added only if the field m_nID doesn't match an existing one.

Currently this code manages around 500 objects with around 21 fields per object. The format of choice as of now is XML, which at 100k+ lines and 5MB+ is very unwieldy.

I am planning to modify the whole shebang to use ProtoBuf, but I'm not sure how to handle the reference semantics. Any thoughts would be much appreciated.

Comments (1)

沉鱼一梦 2024-11-16 01:02:07

Out of the box, Protocol Buffers does not have any reference semantics. You would need to cross-reference the objects manually, typically using an artificial key. Essentially, on the DTO layer you would add a key to CDataSet (one that you simply invent, perhaps just an increasing integer), store the key instead of the item in m_refData/m_refParent, and run the fixup manually during serialization/deserialization. You can also just store the index into the set of CDataSet, but that may make insertion etc. more difficult. It's up to you; since this is serialization, you could argue that you won't insert (etc.) outside of the initial population, and hence the raw index is fine and reliable.
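As a rough illustration of the manual approach described above, here is a minimal sketch in C#. The DTO types (DataSetDto, DataItemDto), the Fixup helper, and the convention that key 0 means "no reference" are all assumptions made up for this example; only CDataSet, CDataItem, and their members come from the question. The actual Protocol Buffers serialization of the two DTO lists is left out.

```csharp
using System;
using System.Collections.Generic;

// Domain classes from the question (PrintData/Print omitted for brevity).
class CDataSet { public int m_nID; public string m_sTag; public float m_fValue; }
class CDataItem
{
    public int m_nID; public string m_sTag;
    public CDataSet m_refData; public CDataSet m_refParent;
}

// Hypothetical DTOs: object references are replaced by integer keys.
class DataSetDto { public int Key; public int m_nID; public string m_sTag; public float m_fValue; }
class DataItemDto
{
    public int m_nID; public string m_sTag;
    public int DataKey;    // 0 = m_refData was null (assumed convention)
    public int ParentKey;  // 0 = m_refParent was null (assumed convention)
}

static class Fixup
{
    // Before serialization: give each distinct CDataSet instance one key
    // and store that key in every item that refers to it.
    public static (List<DataSetDto> Sets, List<DataItemDto> Items) ToDto(IEnumerable<CDataItem> items)
    {
        // CDataSet does not override Equals, so this dictionary compares by reference.
        var keys = new Dictionary<CDataSet, int>();
        var sets = new List<DataSetDto>();
        var dtos = new List<DataItemDto>();

        int KeyOf(CDataSet s)
        {
            if (s == null) return 0;
            if (!keys.TryGetValue(s, out int key))
            {
                key = keys.Count + 1; // simple increasing integer
                keys.Add(s, key);
                sets.Add(new DataSetDto { Key = key, m_nID = s.m_nID, m_sTag = s.m_sTag, m_fValue = s.m_fValue });
            }
            return key;
        }

        foreach (var item in items)
        {
            dtos.Add(new DataItemDto
            {
                m_nID = item.m_nID, m_sTag = item.m_sTag,
                DataKey = KeyOf(item.m_refData), ParentKey = KeyOf(item.m_refParent)
            });
        }
        return (sets, dtos);
    }

    // After deserialization: rebuild each CDataSet once, then turn keys back into references.
    public static List<CDataItem> FromDto(List<DataSetDto> sets, List<DataItemDto> dtos)
    {
        var byKey = new Dictionary<int, CDataSet>();
        foreach (var s in sets)
            byKey[s.Key] = new CDataSet { m_nID = s.m_nID, m_sTag = s.m_sTag, m_fValue = s.m_fValue };

        CDataSet Resolve(int key) => key == 0 ? null : byKey[key];

        var items = new List<CDataItem>();
        foreach (var d in dtos)
        {
            items.Add(new CDataItem
            {
                m_nID = d.m_nID, m_sTag = d.m_sTag,
                m_refData = Resolve(d.DataKey), m_refParent = Resolve(d.ParentKey)
            });
        }
        return items;
    }
}
```

With this shape, two items that shared one CDataSet before ToDto share one rebuilt CDataSet after FromDto, and the data set itself is written only once.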

This is, however, a very common scenario - so as an implementation-specific feature I've added optional (opt-in) reference tracking to my implementation (protobuf-net), which essentially automates the above under the covers (so you don't need to change your objects or expose the key outside of the binary stream).
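For reference, the opt-in tracking might look roughly like the sketch below. It assumes the protobuf-net 2.x NuGet package, where the ProtoMember attribute exposes an AsReference option; later major versions may not support that option, so check the version you are using. The DataFile root type and the ReferenceTrackingDemo helper are hypothetical names invented for this example.

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf; // protobuf-net (assumed 2.x)

[ProtoContract]
class CDataSet
{
    [ProtoMember(1)] public int m_nID;
    [ProtoMember(2)] public string m_sTag;
    [ProtoMember(3)] public float m_fValue;
}

[ProtoContract]
class CDataItem
{
    [ProtoMember(1)] public int m_nID;
    [ProtoMember(2)] public string m_sTag;

    // AsReference asks protobuf-net to track object identity for these members,
    // so a CDataSet reached from several items is written once per serialization.
    [ProtoMember(3, AsReference = true)] public CDataSet m_refData;
    [ProtoMember(4, AsReference = true)] public CDataSet m_refParent;
}

// Hypothetical root object: tracking applies within one serialized graph.
[ProtoContract]
class DataFile
{
    [ProtoMember(1)] public List<CDataItem> Items = new List<CDataItem>();
}

static class ReferenceTrackingDemo
{
    // Serializes the graph to memory and reads it back.
    public static DataFile RoundTrip(DataFile file)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, file);
            ms.Position = 0;
            return Serializer.Deserialize<DataFile>(ms);
        }
    }
}
```

Under those assumptions, an item whose m_refParent points at another item's m_refData should still point at the same instance after RoundTrip, without any key fields on the classes themselves. The resulting stream is specific to protobuf-net and is not expected to be readable as-is by other Protocol Buffers implementations.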
