How do Google Protocol Buffers compare to ASN.1?
What are the most noticeable differences between Google Protocol Buffers and ASN.1 (with PER encoding)? For my project the most important issue is the size of the serialized data. Has anyone done any data-size comparisons between the two?
If you use ASN.1 with Unaligned PER, and define your data types using the appropriate constraints (e.g., specifying lower/upper bounds for integers, upper bounds for the length of lists, etc.), your encodings will be very compact. No bits are wasted on things like alignment or padding between the fields, and each field is encoded in the minimum number of bits necessary to hold its permitted range of values. For example, a field of type INTEGER (1..8) will be encoded in 3 bits (1='000', 2='001', ..., 8='111'), and a CHOICE with four alternatives will occupy 2 bits (indicating the chosen alternative) plus the bits occupied by the chosen alternative. ASN.1 has many other interesting features that have been used successfully in many published standards. An example is the extension marker ("..."), which, when applied to SEQUENCE, CHOICE, ENUMERATED, and other types, enables backward and forward compatibility between endpoints implementing different versions of the specification.
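The bit counts quoted above follow directly from the size of the permitted range. A rough sketch of that arithmetic is below; the helper names are hypothetical, and it covers only a constrained integer with a finite range and no extension marker, so it is an illustration rather than a PER encoder.

```python
def constrained_bits(lower: int, upper: int) -> int:
    """Bits needed to tell apart every value in lower..upper."""
    span = upper - lower + 1          # number of permitted values
    return (span - 1).bit_length()    # 0 bits when only one value is allowed


def encode_constrained(value: int, lower: int, upper: int) -> str:
    """Return the field as a bit string: the offset from the lower bound."""
    if not lower <= value <= upper:
        raise ValueError("value outside constraint")
    width = constrained_bits(lower, upper)
    return format(value - lower, "b").zfill(width) if width else ""


print(constrained_bits(1, 8))          # INTEGER (1..8)
print(encode_constrained(2, 1, 8))     # the value 2 within 1..8
print(constrained_bits(0, 3))          # index of a CHOICE with four alternatives
```

Under these assumptions the three calls print 3, 001, and 2, matching the figures in the answer.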
It's a long time since I've done any ASN.1 work, but the size is very likely to depend on the details of your types and actual data.
I would strongly recommend that you prototype both and put some real data in to compare.
If your protocol buffer would contain repeated primitive types, you should look at the latest source in Subversion for Protocol Buffers - they can be represented in a "packed" format now which is much more space-efficient. (My C# port has just caught up with this feature, some time last week.)
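To get a feel for why the packed format helps, here is a small sketch that hand-encodes a repeated varint field both ways, following the published protobuf wire format. It does not use the protobuf library; the function names, the field number, and the sample values are made up for illustration.

```python
def varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def tag(field_number: int, wire_type: int) -> bytes:
    """Field key: field number and wire type combined into one varint."""
    return varint((field_number << 3) | wire_type)


def unpacked(field_number: int, values: list) -> bytes:
    """One tag per element (wire type 0 = varint)."""
    return b"".join(tag(field_number, 0) + varint(v) for v in values)


def packed(field_number: int, values: list) -> bytes:
    """One tag and one length prefix for the whole list (wire type 2)."""
    payload = b"".join(varint(v) for v in values)
    return tag(field_number, 2) + varint(len(payload)) + payload


values = list(range(32))   # hypothetical data: 32 small non-negative ints
print(len(unpacked(1, values)), len(packed(1, values)))   # prints "64 34"
```

The saving comes from dropping the per-element tag, so it matters most for long lists of small values; real data may behave differently, which is why prototyping with your own messages is still the better test.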
When the size of the packed/encoded message is important, you should also note that protobuf is not able to pack repeated fields that are not of a primitive numeric type; read this for more information. This is an issue, e.g., if you have messages of that type: (comment defines actual range of values)
If you have an array length of, e.g., 32, then you end up with a packed message size of approximately 250 to 450 bytes with protobuf, depending on what values the array actually contains. This can even increase to over 1000 bytes if you use the full 32-bit range, or if you use int32 instead of sint32 and have negative values. The raw data blob (assuming that z can be defined as an int16 value) would consume only 320 bytes, and the ASN.1 message is therefore always smaller than 320 bytes, since the maximum values are actually not 32-bit but 19-bit (x, y) and 15-bit (z). The protobuf message size can be optimized with this message definition:
which results in message sizes between approximately 100 bytes (all values are zero), 300 bytes (values at range max), and 500 bytes (all values are high 32-bit values).
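The 320-byte figure and the claim that a constrained ASN.1 encoding stays below it can be checked with simple arithmetic. The sketch below assumes each array element carries exactly the three fields x, y, and z with the bit widths stated above, and it ignores the few extra bits an unaligned PER encoding would spend on the list length.

```python
import math

N = 32                    # array length used in the example
X_BITS = Y_BITS = 19      # value range of x and y, per the answer
Z_BITS = 15               # value range of z, per the answer

raw_bytes = N * (4 + 4 + 2)                    # int32 x, int32 y, int16 z
per_bits = N * (X_BITS + Y_BITS + Z_BITS)      # constrained fields, no padding
per_bytes = math.ceil(per_bits / 8)

print(raw_bytes, per_bits, per_bytes)          # prints "320 1696 212"
```

So, under these assumptions, the bit-packed payload is about two thirds the size of the raw blob.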
Protocol Buffers does not guarantee preservation of the order of fields in the binary encoding but ASN.1 does. It is not related to size so might not be the most noticeable in your use case but it is an important difference for comparison, for digital signatures, for simplified parsing, and possibly other applications.
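A small sketch of why field order matters for signatures: the two byte strings below are hand-encoded protobuf messages carrying the same two fields in different orders. A protobuf parser accepts fields in any order, so both decode to the same message, yet they hash differently. The field numbers and values are hypothetical.

```python
import hashlib

# Hand-encoded protobuf fields (tag byte, then a one-byte varint value):
field_1 = bytes([0x08, 0x01])   # field number 1, wire type 0, value 1
field_2 = bytes([0x10, 0x02])   # field number 2, wire type 0, value 2

order_a = field_1 + field_2
order_b = field_2 + field_1     # same field values, different byte order

digest_a = hashlib.sha256(order_a).hexdigest()
digest_b = hashlib.sha256(order_b).hexdigest()
print(order_a != order_b, digest_a != digest_b)   # prints "True True"
```

A signature computed over one serialization would therefore not verify against the other, even though the logical content is identical.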