Performance comparison of Thrift, Protocol Buffers, JSON, EJB, and others?

Posted on 2024-07-08 16:20:34

We're looking into transport/protocol solutions and were about to do various performance tests, so I thought I'd check with the community if they've already done this:

Has anyone done server performance tests for simple echo services, as well as serialization/deserialization tests for various message sizes, comparing EJB3, Thrift, and Protocol Buffers on Linux?

The primary languages will be Java, C/C++, Python, and PHP.

Update: I'm still very interested in this; if anyone has done any further benchmarks, please let me know. Also, I came across a very interesting benchmark showing compressed JSON performing similarly to or better than Thrift / Protocol Buffers, so I'm adding JSON to this question as well.
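
For anyone who wants to run this kind of comparison themselves, here is a minimal sketch of a serialization/deserialization micro-benchmark over several message sizes. It uses only the Python standard library, so it does not measure Thrift or Protocol Buffers: `pickle` is merely a stand-in for "some binary codec", and `make_message`, `CODECS` and `bench` are hypothetical names chosen for illustration. Real codecs could be added to `CODECS` as further (encode, decode) pairs.

```python
import json
import pickle
import timeit
import zlib


def make_message(n_items):
    """Build a hypothetical payload made of n_items small records."""
    return {"items": [{"id": i, "name": f"item-{i}", "score": i * 0.5}
                      for i in range(n_items)]}


# Each codec is an (encode, decode) pair that produces and consumes bytes.
CODECS = {
    "json": (lambda m: json.dumps(m).encode("utf-8"),
             lambda b: json.loads(b.decode("utf-8"))),
    "json+zlib": (lambda m: zlib.compress(json.dumps(m).encode("utf-8")),
                  lambda b: json.loads(zlib.decompress(b).decode("utf-8"))),
    # pickle is only a stand-in for a binary codec, NOT Thrift or Protobuf.
    "pickle": (lambda m: pickle.dumps(m, protocol=pickle.HIGHEST_PROTOCOL),
               pickle.loads),
}


def bench(n_items, repeat=50):
    """Return {codec: (size_bytes, encode_us, decode_us)} for one message size."""
    msg = make_message(n_items)
    results = {}
    for name, (enc, dec) in CODECS.items():
        blob = enc(msg)
        assert dec(blob) == msg  # the round trip must be lossless
        enc_s = timeit.timeit(lambda: enc(msg), number=repeat) / repeat
        dec_s = timeit.timeit(lambda: dec(blob), number=repeat) / repeat
        results[name] = (len(blob), enc_s * 1e6, dec_s * 1e6)
    return results


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        for name, (size, enc_us, dec_us) in bench(n).items():
            print(f"{n:5d} items  {name:10s} {size:8d} B  "
                  f"enc {enc_us:9.1f} us  dec {dec_us:9.1f} us")
```

Reporting size and time separately matters here: compression usually trades CPU time for fewer bytes on the wire, so which codec "wins" depends on whether the network or the CPU is the bottleneck.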

Comments (8)

说好的呢 2024-07-15 16:20:35

The latest comparison is available at the thrift-protobuf-compare project wiki. It includes many other serialization libraries.

红尘作伴 2024-07-15 16:20:35

I'm in the process of writing some code in an open source project named thrift-protobuf-compare, which compares protobuf and thrift. For now it covers only a few serialization aspects, but I intend to cover more. The results (for Thrift and Protobuf) are discussed in my blog; I'll add more when I get to it.
You may look at the code to compare the APIs, description languages and generated code. I'd be happy to receive contributions toward a more rounded comparison.

哎呦我呸! 2024-07-15 16:20:35

You may be interested in this question: "Biggest differences of Thrift vs Protocol Buffers?"

久光 2024-07-15 16:20:35

I did test the performance of PB against a number of other data formats (xml, json, default object serialization, hessian, one proprietary one) and libraries (jaxb, fast infoset, hand-written) for a data binding task (both reading and writing), but thrift's format(s) were not included. Performance for formats with multiple converters (like xml) had very high variance, from very slow to pretty-darn-fast. The correlation between authors' claims and measured performance was rather weak, especially for the packages that made the wildest claims.

For what it is worth, I found PB performance to be a bit overhyped (usually not by its authors, but by others who only know who wrote it). With default settings it did not beat the fastest textual xml alternative. With optimized mode (why is this not the default?), it was a bit faster, comparable with the fastest JSON package. Hessian was rather fast, as was textual json. The proprietary binary format (no name here, it was company internal) was the slowest. Java object serialization was fast for larger messages, less so for small objects (i.e. high fixed per-operation overhead).
With PB, message size was compact, but given all the trade-offs you have to make (data is not self-descriptive: if you lose the schema, you lose the data; there are field indexes and value types, of course, from which you could reverse-engineer field names if you had to), I personally would only choose it for specific use cases: size-sensitive, closely coupled systems where the interface/format never (or very, very rarely) changes.
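
To make the "indexes and value types, but no field names" point concrete, here is a minimal sketch of how Protocol Buffers lays out integer fields on the wire: each field is a key varint, `(field_number << 3) | wire_type`, followed by the value. Only the varint wire type (0) is handled, and the field name `user_id` is a hypothetical example; this is an illustration, not a replacement for the real protobuf library.

```python
import json

WIRE_VARINT = 0  # wire type used for int32/int64/bool/enum fields


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint (low 7 bits first)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(data, pos):
    """Decode one varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_int_field(field_number, value):
    """Encode one integer field as key varint followed by value varint."""
    key = (field_number << 3) | WIRE_VARINT
    return encode_varint(key) + encode_varint(value)


def decode_int_fields(data):
    """Decode a buffer of varint fields into (field_number, value) pairs."""
    pos = 0
    fields = []
    while pos < len(data):
        key, pos = read_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != WIRE_VARINT:
            raise ValueError("this sketch only decodes varint fields")
        value, pos = read_varint(data, pos)
        fields.append((field_number, value))
    return fields


if __name__ == "__main__":
    blob = encode_int_field(1, 150)
    print(blob.hex(), len(blob), "bytes")        # binary form: number and type only
    print(decode_int_fields(blob))               # the name "user_id" is not recoverable
    print(len(json.dumps({"user_id": 150})), "bytes as JSON, name included")
```

Decoding without the schema yields field number 1 and the value 150, but nothing says the field was called `user_id`; that mapping lives only in the `.proto` file, which is exactly the trade-off described above.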

My opinion on this is that (a) the implementation often matters more than the specification (of the data format), and (b) end-to-end, the differences between best-of-breed implementations (for different formats) are usually not big enough to dictate the choice.
That is, you may be better off choosing the format + API/lib/framework you most like using (or that has the best tool support), finding its best implementation, and seeing whether that is fast enough.
If (and only if!) it is not, consider the next best alternative.

P.S. Not sure what EJB3 would be here. Maybe just plain Java serialization?

一世旳自豪 2024-07-15 16:20:35

If the raw net performance is the target, then nothing beats IIOP (see RMI/IIOP).
Smallest possible footprint -- only binary data, no markup at all. Serialization/deserialization is very fast too.
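
To illustrate what "only binary data, no markup" buys in size, here is a minimal sketch that packs a record into a fixed binary layout and compares it with the same record as JSON. This is not CORBA's actual CDR encoding (which has its own alignment and byte-order rules); it only shows the general idea that when both sides agree on the layout in advance, the role an IDL plays, no field names or delimiters travel on the wire. The record fields and the `RECORD` layout are hypothetical.

```python
import json
import struct

# Hypothetical layout agreed on by both sides in advance (big-endian, no padding):
# unsigned 32-bit id, 64-bit float price, unsigned 16-bit quantity.
RECORD = struct.Struct(">IdH")


def pack_record(rec):
    """Serialize a record as raw values in the agreed order."""
    return RECORD.pack(rec["id"], rec["price"], rec["quantity"])


def unpack_record(blob):
    """Rebuild the record; field names come from the agreed layout, not the data."""
    rec_id, price, quantity = RECORD.unpack(blob)
    return {"id": rec_id, "price": price, "quantity": quantity}


if __name__ == "__main__":
    record = {"id": 42, "price": 19.99, "quantity": 3}
    binary = pack_record(record)
    text = json.dumps(record).encode("utf-8")
    assert unpack_record(binary) == record
    print(f"binary: {len(binary)} bytes, JSON: {len(text)} bytes")
```

The flip side is the coupling: if the two ends ever disagree about the layout, the bytes still decode, just into the wrong values.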

Since it's IIOP (that is CORBA), almost all languages have bindings.

But I presume the performance is not the only requirement, right?

我为君王 2024-07-15 16:20:35

One of the things near the top of my "to-do" list for PBs is to port Google's internal Protocol Buffer performance benchmark - it's mostly a case of taking confidential message formats and turning them into entirely bland ones, and then doing the same for the data.

When that's been done, I'd imagine you could build the same messages in Thrift and then compare the performance.

In other words, I don't have the data for you yet - but hopefully in the next couple of weeks...

不顾 2024-07-15 16:20:35

To back up Vladimir's point about IIOP, here's an interesting performance test that should give some additional information beyond the Google benchmarks, since it compares Thrift and CORBA. (Performance_TIDorb_vs_Thrift_morfeo.pdf // link no longer valid)
To quote from the study:

  • Thrift is very efficient with small data (basic types as operation arguments)
  • Thrifts transports are not so efficient as CORBA with medium and large data (struct and complex types > 1 kilobytes).

Another odd limitation, unrelated to performance, is that Thrift can return several values only by wrapping them in a struct, although this, like performance, can presumably be improved.

It is interesting, and nice, that the Thrift IDL closely matches the CORBA IDL. I haven't used Thrift, but it looks interesting, especially for smaller messages, and one of its design goals was a less cumbersome install, so those are further advantages of Thrift. That said, although CORBA has a bad rap, there are many excellent implementations out there, such as omniORB, which has Python bindings and is easy to install and use.

Edited: The Thrift and CORBA link is no longer valid, but I did find another useful paper from CERN. They evaluated replacements for their CORBA system and, while they evaluated Thrift, they eventually went with ZeroMQ. Thrift was the fastest in their performance tests, at 9000 msg/sec vs. 8000 msg/sec for ZeroMQ and 7000+ msg/sec for RDA (CORBA-based), but they chose not to test Thrift further because of other issues, notably:

It is still an immature product with a buggy implementation
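
Figures in msg/sec like the ones quoted above typically come from a request/reply loop. As a rough way to get a baseline on your own hardware, here is a minimal sketch of a loopback TCP echo test using only the Python standard library. It measures raw socket round trips with one client, not Thrift, ZeroMQ or CORBA; `EchoHandler`, `recv_exact` and `measure_echo` are hypothetical names for this illustration.

```python
import socket
import socketserver
import threading
import time


class EchoHandler(socketserver.BaseRequestHandler):
    """Echo every received chunk back until the peer disconnects."""

    def handle(self):
        while True:
            data = self.request.recv(65536)
            if not data:
                break
            self.request.sendall(data)


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the connection closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed before full reply")
        buf.extend(chunk)
    return bytes(buf)


def measure_echo(payload_size, n_messages=1000):
    """Return round trips per second for one client over the loopback interface.

    Keep payload_size well below the socket buffer size: the client sends the
    whole message before reading, so very large payloads could stall.
    """
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    payload = b"x" * payload_size
    try:
        with socket.create_connection(server.server_address) as sock:
            # Disable Nagle's algorithm so small messages are sent immediately.
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            start = time.perf_counter()
            for _ in range(n_messages):
                sock.sendall(payload)
                assert recv_exact(sock, payload_size) == payload
            elapsed = time.perf_counter() - start
    finally:
        server.shutdown()
        server.server_close()
    return n_messages / elapsed


if __name__ == "__main__":
    for size in (16, 1024, 16384):
        print(f"{size:6d} B payload: {measure_echo(size):10.0f} msg/sec")
```

A number from this loop is an upper bound for the transport alone; any RPC framework layered on top adds serialization and dispatch cost, which is the part the benchmarks in this thread are trying to compare.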

蔚蓝源自深海 2024-07-15 16:20:35

For my job, I have done a study of spring-boot, mappers (manual, Dozer and MapStruct), Thrift, REST, SOAP and Protocol Buffers integration.

The server side: https://github.com/vlachenal/webservices-bench

The client side: https://github.com/vlachenal/webservices-bench-client

It is not finished and has been run on my personal computers (I have to ask for servers to complete the tests) ... but results can be consulted on:

In conclusion:

  • Thrift offers the best performance and is easy to use
  • RESTful webservice with JSON content type is pretty close to Thrift performance, is "browser ready to use" and is quite elegant (from my point of view)
  • SOAP has very poor performance but offers the best data control
  • Protocol Buffers has good performance ... until there are 3 simultaneous calls ... and I don't know why. It is very difficult to use: I have given up (for now) on making it work with MapStruct, and I have not tried it with Dozer.
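
A drop-off at a particular number of simultaneous calls, like the one in the last point, is easiest to see by sweeping the concurrency level and recording throughput at each step. Here is a minimal sketch of such a sweep; `call_once` is a hypothetical stand-in that only serializes and parses a small JSON message, and it would have to be replaced by a real client call to test an actual service. Note that with a CPU-bound stand-in like this one, Python threads will not scale, because of the interpreter lock; the harness is the point, not the numbers.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor


def call_once():
    """Stand-in for one remote call: serialize and parse a small message."""
    msg = {"id": 1, "tags": ["a", "b", "c"], "value": 3.14}
    return json.loads(json.dumps(msg))


def throughput(call, concurrency, calls_per_worker=2000):
    """Run `call` from `concurrency` threads; return total calls per second."""
    def worker():
        for _ in range(calls_per_worker):
            call()

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(worker) for _ in range(concurrency)]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    elapsed = time.perf_counter() - start
    return concurrency * calls_per_worker / elapsed


def sweep(call, levels=(1, 2, 3, 4, 8)):
    """Return {concurrency_level: calls_per_second} for each level."""
    return {n: throughput(call, n) for n in levels}


if __name__ == "__main__":
    for level, rate in sweep(call_once).items():
        print(f"{level:2d} simultaneous callers: {rate:10.0f} calls/sec")
```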

The projects can be extended through pull requests (either fixes or additional results).
