How do Google Protocol Buffers compare with ASN.1?

Posted on 2024-07-14 10:50:29

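What are the most significant differences between Google Protocol Buffers and ASN.1 (with PER encoding)? For my project, the most important issue is the size of the serialized data. Has anyone compared the data sizes produced by the two?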

三生路 2024-07-21 10:50:29

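If you use ASN.1 with Unaligned PER and define your data types with appropriate constraints (e.g., lower/upper bounds for integers, upper bounds on list lengths, etc.), your encodings will be very compact. No bits are wasted on alignment or padding between fields, and each field is encoded in the minimum number of bits needed to hold its permitted range of values. For example, a field of type INTEGER (1..8) is encoded in 3 bits (1='000', 2='001', ..., 8='111'), and a CHOICE with four alternatives occupies 2 bits (indicating the chosen alternative) plus the bits occupied by the chosen alternative. ASN.1 has many other interesting features that have been used successfully in many published standards. One example is the extension marker ("..."), which, when applied to SEQUENCE, CHOICE, ENUMERATED, and other types, enables backward and forward compatibility between endpoints that implement different versions of a specification.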
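The bit counts above can be checked with a few lines of Python. This is only a minimal sketch of how Unaligned PER sizes a constrained integer (the offset from the lower bound, written in the fewest bits that cover the range); the helper names `constrained_bits` and `encode_constrained` are made up for illustration and are not part of any ASN.1 toolkit.

```python
def constrained_bits(lower: int, upper: int) -> int:
    """Bits needed for a constrained whole number in the range lower..upper."""
    span = upper - lower + 1          # number of distinct permitted values
    return (span - 1).bit_length()    # ceil(log2(span)); 0 bits if only one value


def encode_constrained(value: int, lower: int, upper: int) -> str:
    """Bit string for value, encoded as its offset from the lower bound."""
    if not lower <= value <= upper:
        raise ValueError("value outside constraint")
    width = constrained_bits(lower, upper)
    return format(value - lower, "b").zfill(width) if width else ""


# INTEGER (1..8): 3 bits per value
print(encode_constrained(1, 1, 8), encode_constrained(8, 1, 8))
# Index of a CHOICE with four alternatives (0..3): 2 bits
print(constrained_bits(0, 3))
```

A real encoder also has to handle extension bits, OPTIONAL bitmaps, and length determinants, none of which are modelled here.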

甚是思念 2024-07-21 10:50:29

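It's been a long time since I've done any ASN.1 work, but the size is very likely to depend on the details of your types and actual data.

I would strongly recommend that you prototype both and feed in some real data to compare.

If your protocol buffer contains repeated primitive types, you should look at the latest Protocol Buffers source in Subversion: they can now be represented in a "packed" format, which is much more space efficient. (My C# port only caught up with this feature some time last week.)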

谷夏 2024-07-21 10:50:29

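When the size of the packed/encoded message matters, you should also be aware that protobuf cannot pack repeated fields that are not of a primitive numeric type; the protobuf encoding documentation has more information.

This is a problem if, for example, you have messages like these (the comments give the actual range of the values):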

message P{
    required sint32 x = 1; // -0x1ffff  to  0x20000
    required sint32 y = 2; // -0x1ffff  to  0x20000
    required sint32 z = 3; // -0x319c  to   0x3200
}
message Array{
    repeated P ps = 1;
    optional uint32 somemoredata = 2;
}

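With an array length of, say, 32, protobuf produces an encoded message of roughly 250 to 450 bytes, depending on the values the array actually contains. This grows to about 640 bytes if the values span the full 32-bit range, and to over 1000 bytes if you use int32 instead of sint32 and have negative values, because a negative int32 is always encoded as a 10-byte varint.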
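These figures can be reproduced without a protobuf library by counting bytes by hand. The sketch below assumes the standard wire format (one tag byte per field for field numbers 1 to 15, base-128 varints, ZigZag for sint32, a length prefix on each embedded message) and ignores the optional `somemoredata` field; all function names are hypothetical helpers, not protobuf APIs.

```python
def varint_size(n: int) -> int:
    """Bytes needed to encode the unsigned integer n as a base-128 varint."""
    size = 1
    while n >= 0x80:
        n >>= 7
        size += 1
    return size


def zigzag32(v: int) -> int:
    """ZigZag mapping used by sint32: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return ((v << 1) ^ (v >> 31)) & 0xFFFFFFFF


def int32_as_unsigned(v: int) -> int:
    """int32 is sign-extended to 64 bits on the wire."""
    return v & 0xFFFFFFFFFFFFFFFF


def p_size(x, y, z, to_unsigned=zigzag32):
    # Each field: 1 tag byte + varint payload.
    return sum(1 + varint_size(to_unsigned(v)) for v in (x, y, z))


def array_size(points, to_unsigned=zigzag32):
    total = 0
    for p in points:
        body = p_size(*p, to_unsigned)
        total += 1 + varint_size(body) + body  # tag + length prefix + P message
    return total


print(array_size([(0, 0, 0)] * 32))                             # all zeros
print(array_size([(0x20000, 0x20000, 0x3200)] * 32))            # top of stated ranges
print(array_size([(2**31 - 1,) * 3] * 32))                      # full 32-bit sint32
print(array_size([(-1, -1, -1)] * 32, int32_as_unsigned))       # negative int32
```

Under these assumptions the four cases come to 256, 448, 640, and 1120 bytes respectively.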

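A raw data blob (assuming z can be stored as an int16) takes only 320 bytes, so the ASN.1 message is always smaller than 320 bytes, since the values actually need not 32 bits but 19 bits (x, y) and 15 bits (z).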
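As a rough check of the comparison, the following sketch computes the raw blob size and an estimate of the Unaligned PER payload for 32 points. It assumes the ranges from the comments in message P are declared as ASN.1 constraints, that the list has a fixed size of 32 (so no length determinant is needed), and that there are no extension markers or OPTIONAL fields; the names are illustrative only. Because PER encodes the offset from the lower bound, x and y come out at 18 bits here rather than the 19 bits a signed representation would need.

```python
def constrained_bits(lower: int, upper: int) -> int:
    """Minimum bits that cover every value in lower..upper."""
    return (upper - lower).bit_length()


COUNT = 32
X_RANGE = (-0x1FFFF, 0x20000)
Y_RANGE = (-0x1FFFF, 0x20000)
Z_RANGE = (-0x319C, 0x3200)

# Raw blob: two int32 values and one int16 value per point.
raw_bytes = COUNT * (4 + 4 + 2)

# Unaligned PER estimate: constrained fields packed back to back.
bits_per_point = sum(constrained_bits(lo, hi) for lo, hi in (X_RANGE, Y_RANGE, Z_RANGE))
per_bits = COUNT * bits_per_point
per_bytes = (per_bits + 7) // 8  # round up to whole bytes

print(raw_bytes, bits_per_point, per_bits, per_bytes)
```

With these assumptions the estimate is 51 bits per point, or 204 bytes in total, against 320 bytes for the raw blob.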

The protobuf message size can be optimized with the following message definition:

message Ps{
    repeated sint32 xs = 1 [packed=true];
    repeated sint32 ys = 2 [packed=true];
    repeated sint32 zs = 3 [packed=true];
}
message Array{
    required Ps ps = 1;
    optional uint32 somemoredata = 2;
}

This yields message sizes of approximately 100 bytes (all values zero), 300 bytes (values at the top of their range), and 500 bytes (all values high 32-bit values).
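The packed layout can be sized the same way, by hand. This sketch assumes the standard wire format for packed repeated fields (one tag byte, a varint length prefix, then the ZigZag varints back to back) and again leaves out `somemoredata`; the helper names are hypothetical.

```python
def varint_size(n: int) -> int:
    """Bytes needed to encode the unsigned integer n as a base-128 varint."""
    size = 1
    while n >= 0x80:
        n >>= 7
        size += 1
    return size


def zigzag32(v: int) -> int:
    """ZigZag mapping used by sint32."""
    return ((v << 1) ^ (v >> 31)) & 0xFFFFFFFF


def packed_field_size(values) -> int:
    payload = sum(varint_size(zigzag32(v)) for v in values)
    return 1 + varint_size(payload) + payload  # tag + length prefix + varints


def packed_array_size(xs, ys, zs) -> int:
    body = sum(packed_field_size(v) for v in (xs, ys, zs))  # the Ps message
    return 1 + varint_size(body) + body                     # wrapped in Array.ps


n = 32
print(packed_array_size([0] * n, [0] * n, [0] * n))                  # all zeros
print(packed_array_size([0x20000] * n, [0x20000] * n, [0x3200] * n)) # top of stated ranges
print(packed_array_size(*[[2**31 - 1] * n] * 3))                     # high 32-bit values
```

Under these assumptions the three cases come to 104, 297, and 492 bytes, in line with the approximate figures quoted above.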
