I was able to get better performance with a text-based protocol than with protobuf in Python. However, it had none of the type checking or other niceties, such as UTF-8 conversion, that protobuf offers.
So if serialization/deserialization is all you need, you can probably use something else.
I think most of these points have missed the basic fact that Thrift is an RPC framework, which happens to have the ability to serialize data using a variety of methods (binary, XML, etc).
Protocol Buffers is designed purely for serialization; it is not a framework like Thrift.
One obvious thing not yet mentioned, which can be either a pro or a con (and applies equally to both), is that they are binary protocols. This allows a more compact representation and possibly better performance (pros), at the cost of reduced readability, or rather debuggability (a con).
Also, both have a bit less tool support than standard formats like XML (and maybe even JSON).
(EDIT) Here's an interesting comparison that tackles both size and performance differences, and includes numbers for some other formats (XML, JSON) as well.
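To make the compactness versus readability trade-off concrete, here is a minimal sketch using only the Python standard library. The record and its fixed binary layout are made up for illustration; neither Thrift nor Protocol Buffers encodes data exactly this way.

```python
import json
import struct

# Hypothetical record, used only for illustration.
record = {"id": 12345, "score": 98.5, "active": True}

# Text encoding: readable in any editor or network trace, but larger.
text_bytes = json.dumps(record).encode("utf-8")

# Binary encoding with an assumed fixed layout:
# unsigned 32-bit id, 64-bit float score, 1-byte boolean flag.
binary_bytes = struct.pack("<Id?", record["id"], record["score"], record["active"])

print(text_bytes)                           # human-readable JSON
print(binary_bytes.hex())                   # opaque without knowing the layout
print(len(text_bytes), len(binary_bytes))   # text is several times larger here
```

The binary form cannot be interpreted without knowing the layout, which is exactly the debuggability cost mentioned above.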
It's also important to note that not all supported languages compare consistently between Thrift and protobuf. At this point it's a matter of each module's implementation in addition to the underlying serialization. Take care to check benchmarks for whatever language you plan to use.
There are some excellent points here, and I'm going to add another one in case someone's path crosses here.
Thrift lets you choose between the thrift-binary and thrift-compact (de)serializers. thrift-binary offers excellent performance but a bigger packet size, while thrift-compact gives you good compression but needs more processing power. This is handy because you can switch between the two modes by changing a single line of code (heck, you can even make it configurable). So if you are not sure how far your application should be optimized for packet size versus processing power, Thrift can be an interesting choice.
PS: See this excellent benchmark project by thekvs which compares many serializers including thrift-binary, thrift-compact, and protobuf: https://github.com/thekvs/cpp-serializers
PS: There is another serializer named YAS which offers this option too, but it is schema-less; see the link above.
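As a rough illustration of the size versus CPU trade-off, and of how the choice can be made configurable, here is a standard-library sketch. It does not use the Thrift library: the function names and the `ENCODERS` table are hypothetical, and the two encoders only mimic the general idea (fixed-width integers versus zigzag varints), not Thrift's full wire formats.

```python
import struct


def encode_fixed_i32(values):
    """Fixed-width style: every 32-bit integer takes 4 big-endian bytes."""
    return b"".join(struct.pack(">i", v) for v in values)


def _zigzag32(v):
    """Map a signed 32-bit integer to unsigned so small magnitudes stay small."""
    return ((v << 1) ^ (v >> 31)) & 0xFFFFFFFF


def encode_compact_i32(values):
    """Compact style: zigzag, then 7 bits per byte with a continuation bit."""
    out = bytearray()
    for v in values:
        n = _zigzag32(v)
        while n >= 0x80:
            out.append((n & 0x7F) | 0x80)  # more bytes follow
            n >>= 7
        out.append(n)  # final byte, continuation bit clear
    return bytes(out)


# Hypothetical registry: switching modes is a one-word configuration change.
ENCODERS = {"binary": encode_fixed_i32, "compact": encode_compact_i32}


def serialize(values, mode="binary"):
    return ENCODERS[mode](values)


values = [1, -1, 300]
print(len(serialize(values, "binary")))   # 4 bytes per value
print(len(serialize(values, "compact")))  # fewer bytes, more work per value
```

In real Thrift code the equivalent switch is made by picking a different protocol class when constructing the client or server; check the documentation for your language binding for the exact names.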
Protocol Buffers uses variable-length integers (varints), an encoding that turns a fixed-length number into a variable-length one to save space.
Thrift offers several different serialization formats (called "protocols").
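A minimal sketch of base-128 varint encoding, written from scratch for illustration rather than taken from the protobuf library (the function names are my own). Each byte carries 7 bits of the number, and the high bit signals whether more bytes follow.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)         # last byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


print(encode_varint(1).hex())    # 01
print(encode_varint(300).hex())  # ac02
print(decode_varint(encode_varint(300)))  # 300
```

Small values fit in one byte instead of a fixed four or eight, which is where the space saving comes from.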
In fact, Thrift has two different JSON encodings, and no less than three different binary encoding methods.
In conclusion, these two libraries are completely different. Thrift is like a one-stop shop, giving you an entire integrated RPC framework and many options (with cross-language support), while Protocol Buffers is more inclined to "just do one thing and do it well".
They both offer many of the same features; however, there are some differences:
Thrift supports 'exceptions'
Protocol Buffers have much better documentation/examples
Thrift has a builtin Set type
Protocol Buffers allow "extensions" - you can extend an external proto to add extra fields, while still allowing external code to operate on the values. There is no way to do this in Thrift
I find Protocol Buffers much easier to read
Basically, they are fairly equivalent (with Protocol Buffers slightly more efficient from what I have read).
RPC is another key difference. Thrift generates code to implement RPC clients and servers, whereas Protocol Buffers seems mostly designed as a data-interchange format alone.
Additionally, there are plenty of interesting additional tools available for those solutions, which might tip the decision. Here are examples for Protobuf: Protobuf-wireshark, protobufeditor.
Protocol Buffers seems to have a more compact representation, but that's only an impression I get from reading the Thrift whitepaper. In their own words:
We decided against some extreme storage optimizations (i.e. packing
small integers into ASCII or using a 7-bit continuation format)
for the sake of simplicity and clarity in the code. These alterations
can easily be made if and when we encounter a performance-critical
use case that demands them.
Also, it may just be my impression, but Protocol Buffers seems to have some thicker abstractions around struct versioning. Thrift does have some versioning support, but it takes a bit of effort to make it happen.
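Both formats identify struct fields by numeric tags, which is what makes versioning possible at all. The following is a deliberately simplified tag-length-value sketch of that idea. The layout (1-byte tag, 2-byte length) and the function names are assumptions for illustration, not the actual wire format of either library.

```python
import struct


def encode_fields(fields):
    """Encode {tag: payload_bytes} as repeated (tag, length, payload) records."""
    out = bytearray()
    for tag, payload in fields.items():
        out += struct.pack(">BH", tag, len(payload)) + payload
    return bytes(out)


def decode_known(data, known_tags):
    """Return only the fields this reader knows about; skip everything else."""
    result = {}
    pos = 0
    while pos < len(data):
        tag, length = struct.unpack_from(">BH", data, pos)
        pos += 3
        if tag in known_tags:
            result[known_tags[tag]] = data[pos:pos + length]
        pos += length  # unknown tags are skipped using their length prefix
    return result


# A newer writer adds field 3; an older reader only knows fields 1 and 2.
message = encode_fields({1: b"alice", 2: b"42", 3: b"new-field"})
old_reader_schema = {1: "name", 2: "age"}
print(decode_known(message, old_reader_schema))
```

Because the old reader can step over tags it does not recognize, new fields can be added without breaking deployed code, as long as existing tag numbers are never reused for a different meaning.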
http://dhruvbird.blogspot.com/2010/05/protocol-buffers-vs-http.html
ProtocolBuffers is FASTER.
There is a nice benchmark here:
https://github.com/eishay/jvm-serializers/wiki (last updated 2016, but there are forks that contain faster serializers as of 2020, e.g. ActiveJ created a fork to demonstrate their speed on the JVM: https://github.com/activej/jvm-serializers).
You might also want to look into Avro, which can be faster. There are two libraries for Avro in .NET:
By the way, the fastest I've ever seen is Cap'n Proto;
a C# implementation can be found in Marc Gravell's GitHub repository.
And according to the wiki the Thrift runtime doesn't run on Windows.
I think the basic data structure is different
For one, protobuf isn't a full RPC implementation. It requires something like gRPC to go with it.
gRPC is very slow compared to Thrift:
http://szelei.me/rpc-benchmark-part1/
Another important difference is the set of languages supported by default.
Both could be extended to other platforms, but these are the language bindings available out of the box.
option optimize_for = SPEED. For a closer look at the differences, check out the source code diffs at this open source project.
As I've said in the "Thrift vs Protocol buffers" topic:
Referring to the Thrift vs Protobuf vs JSON comparison: