How do Google Protocol Buffers compare to ASN.1?

Posted on 2024-07-14 10:50:29

What are the most noticeable differences between Google Protocol Buffers and ASN.1 (with PER encoding)? For my project, the most important issue is the size of the serialized data. Has anyone done any data-size comparisons between the two?

Comments (4)

三生路 2024-07-21 10:50:29

If you use ASN.1 with Unaligned PER, and define your data types using the appropriate constraints (e.g., specifying lower/upper bounds for integers, upper bounds for the length of lists, etc.), your encodings will be very compact. No bits are wasted on alignment or padding between fields, and each field is encoded in the minimum number of bits necessary to hold its permitted range of values. For example, a field of type INTEGER (1..8) is encoded in 3 bits (1='000', 2='001', ..., 8='111'), and a CHOICE with four alternatives occupies 2 bits (indicating the chosen alternative) plus the bits occupied by the chosen alternative. ASN.1 has many other interesting features that have been used successfully in many published standards. One example is the extension marker ("..."), which, when applied to SEQUENCE, CHOICE, ENUMERATED, and other types, enables backward and forward compatibility between endpoints implementing different versions of the specification.
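The bit counts above can be reproduced with a small sketch. It only models the Unaligned PER rule for a constrained INTEGER with no extension marker (encode the offset from the lower bound in the fewest bits that cover the range); the function names are made up for illustration and this is not a full PER encoder.

```python
def uper_constrained_int_bits(lower: int, upper: int) -> int:
    """Bits needed for INTEGER (lower..upper) under the rule sketched above."""
    # The largest encoded offset is (upper - lower); a single-value range needs 0 bits.
    return (upper - lower).bit_length()


def uper_encode_constrained_int(value: int, lower: int, upper: int) -> str:
    """Return the encoded bit-field as a string of '0'/'1' characters."""
    if not lower <= value <= upper:
        raise ValueError("value outside the constrained range")
    width = uper_constrained_int_bits(lower, upper)
    # Encode the offset from the lower bound, zero-padded to the field width.
    return format(value - lower, f"0{width}b") if width else ""


print(uper_encode_constrained_int(1, 1, 8))  # 000
print(uper_encode_constrained_int(8, 1, 8))  # 111
print(uper_constrained_int_bits(0, 3))       # 2 (index of a 4-alternative CHOICE)
```

The same width calculation applies to the index of a CHOICE, which is why four alternatives cost 2 bits before the chosen alternative itself is encoded.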

甚是思念 2024-07-21 10:50:29

It's been a long time since I did any ASN.1 work, but the size is very likely to depend on the details of your types and actual data.

I would strongly recommend that you prototype both and put some real data in to compare.

If your protocol buffer would contain repeated primitive types, you should look at the latest source in Subversion for Protocol Buffers: they can now be represented in a "packed" format, which is much more space-efficient. (My C# port has just caught up with this feature, some time last week.)

谷夏 2024-07-21 10:50:29

When the size of the packed/encoded message is important, you should also note that protobuf cannot pack repeated fields that are not of a primitive numeric type.

This is an issue if, for example, you have messages like the following (the comments give the actual range of values):

message P{
    required sint32 x = 1; // -0x1ffff  to  0x20000
    required sint32 y = 2; // -0x1ffff  to  0x20000
    required sint32 z = 3; // -0x319c  to   0x3200
}
message Array{
    repeated P ps = 1;
    optional uint32 somemoredata = 2;
}

If you have an array length of, e.g., 32, you end up with a packed message size of approximately 250 to 450 bytes with protobuf, depending on the values the array actually contains. This can even increase to over 1000 bytes if you use the full 32-bit range, or if you use int32 instead of sint32 and have negative values.
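These figures can be checked with a rough size estimator. The sketch below assumes the standard protobuf wire format (one tag byte for field numbers 1 to 15, base-128 varints, ZigZag mapping for sint32, and a tag plus length prefix for each embedded message) and ignores the optional somemoredata field. The helper names are made up for illustration.

```python
def zigzag32(n: int) -> int:
    """ZigZag-map a signed 32-bit value to an unsigned one (used by sint32)."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def varint_len(u: int) -> int:
    """Number of bytes a non-negative integer occupies as a base-128 varint."""
    return max(1, (u.bit_length() + 6) // 7)


def sint32_field_size(value: int) -> int:
    # 1 tag byte (field numbers 1..15) + ZigZag varint payload
    return 1 + varint_len(zigzag32(value))


def int32_field_size(value: int) -> int:
    # int32 sign-extends negatives to 64 bits, so they take 10 payload bytes
    return 1 + varint_len(value & 0xFFFFFFFFFFFFFFFF)


def nested_array_size(points, field_size=sint32_field_size) -> int:
    """Size of `repeated P ps = 1` for a list of (x, y, z) tuples."""
    total = 0
    for x, y, z in points:
        body = field_size(x) + field_size(y) + field_size(z)
        total += 1 + varint_len(body) + body  # tag + length prefix + embedded P
    return total


print(nested_array_size([(0, 0, 0)] * 32))                              # 256
print(nested_array_size([(0x20000, 0x20000, 0x3200)] * 32))             # 448
print(nested_array_size([(0x7FFFFFFF,) * 3] * 32))                      # 640
print(nested_array_size([(-1, -1, -1)] * 32, field_size=int32_field_size))  # 1120
```

Under these assumptions, the 256 and 448 byte results bracket the "250 to 450" estimate, and the int32 case with negative values exceeds 1000 bytes because each negative value costs 10 varint bytes.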

The raw data blob (assuming that z can be stored as an int16 value) would consume only 320 bytes, so the ASN.1 message is always smaller than 320 bytes, since the values actually need not 32 bits but at most 19 bits (x, y) and 15 bits (z).
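As a back-of-the-envelope check, the sketch below compares the raw blob with a constrained Unaligned PER encoding. It assumes hypothetical ASN.1 definitions in which x, y and z carry exactly the ranges from the comments, P has no optional fields or extension marker, and the list has a fixed size of 32 (so no length determinant is needed). Because PER encodes the offset from the lower bound rather than a sign bit, x and y come out at 18 bits under this rule instead of 19.

```python
def range_bits(lower: int, upper: int) -> int:
    """Minimum bits to encode any offset in 0..(upper - lower)."""
    return (upper - lower).bit_length()


X_BITS = range_bits(-0x1FFFF, 0x20000)  # same range for x and y
Z_BITS = range_bits(-0x319C, 0x3200)
PER_POINT_BITS = 2 * X_BITS + Z_BITS


def per_array_bytes(count: int) -> int:
    """Bytes for `count` points, rounding the final bit string up to whole bytes."""
    return (count * PER_POINT_BITS + 7) // 8


def raw_array_bytes(count: int) -> int:
    """Raw blob: two int32 values and one int16 value per point."""
    return count * (4 + 4 + 2)


print(X_BITS, Z_BITS, PER_POINT_BITS)  # 18 15 51
print(raw_array_bytes(32))             # 320
print(per_array_bytes(32))             # 204
```

With a variable-length list, a short length determinant would be added on top, so the total would still stay well under the 320-byte raw size.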

The protobuf message size can be optimized with this message definition:

message Ps{
    repeated sint32 xs = 1 [packed=true];
    repeated sint32 ys = 2 [packed=true];
    repeated sint32 zs = 3 [packed=true];
}
message Array{
    required Ps ps = 1;
    optional uint32 somemoredata = 2;
}

This results in message sizes of approximately 100 bytes (all values are zero), 300 bytes (values at the range maximum), and 500 bytes (all values are high 32-bit values).
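To see where these numbers come from, here is a minimal hand-rolled encoder for the packed layout. It is a sketch of the wire format only (varints, ZigZag, length-delimited packed fields), not the protobuf library API. The function names are made up for illustration, and the optional somemoredata field is left out.

```python
def encode_varint(u: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = u & 0x7F
        u >>= 7
        if u:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def zigzag32(n: int) -> int:
    """ZigZag-map a signed 32-bit value to an unsigned one (used by sint32)."""
    return ((n << 1) ^ (n >> 31)) & 0xFFFFFFFF


def encode_packed_sint32(field_number: int, values) -> bytes:
    """One tag and one length prefix, followed by all values back to back."""
    payload = b"".join(encode_varint(zigzag32(v)) for v in values)
    tag = encode_varint((field_number << 3) | 2)  # wire type 2: length-delimited
    return tag + encode_varint(len(payload)) + payload


def encode_array(xs, ys, zs) -> bytes:
    """Encode Array { required Ps ps = 1; } with the three packed lists."""
    ps = (encode_packed_sint32(1, xs)
          + encode_packed_sint32(2, ys)
          + encode_packed_sint32(3, zs))
    return encode_varint((1 << 3) | 2) + encode_varint(len(ps)) + ps


n = 32
print(len(encode_array([0] * n, [0] * n, [0] * n)))                 # 104
print(len(encode_array([0x20000] * n, [0x20000] * n, [0x3200] * n)))  # 297
print(len(encode_array([0x7FFFFFFF] * n, [0x7FFFFFFF] * n, [0x7FFFFFFF] * n)))  # 492
```

The saving over the nested layout comes from paying the tag and length overhead once per list instead of once per value and once per element.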

兔姬 2024-07-21 10:50:29

Protocol Buffers does not guarantee that the order of fields is preserved in the binary encoding, but ASN.1 does. This is not related to size, so it might not be the most noticeable difference in your use case, but it is an important one for comparison, for digital signatures, for simplified parsing, and possibly for other applications.
