Kryo serialization library: is it used in production?
Kryo is a very new and interesting Java serialization library, and one of the fastest in the thrift-protobuf benchmark. If you've used Kryo, has it already reached enough maturity to try it out in production code?
Update (10/27/2010): We're using Kryo, though not yet in production. See my answer below for details.
Update (3/9/2011): Updating to the latest Jackson and Kryo libraries shows that Jackson's binary Smile serialization is pretty competitive.
Comments (9)
I'll try to answer my own question (Kryo is still very new!).
We have a set of about 120 different web services implemented using the Restlet framework. These are consumed by web service clients generally built on top of a Restlet-based client library. The representations sent back and forth between server and client include XML (using the XStream serialization library), JSON (using Jackson), XHTML, Java Object Serialization, and as of yesterday, Kryo. So we're in a position to do some solid side-by-side comparisons.
Kryo 1.0.1 seems reasonably stable. Once I actually read up on how to use the API, the only real problem I found was that the default java.util.Date serializer seemed to warp dates a few months into the past. I just had to provide my own override:
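The override itself does not appear above. As a minimal sketch of the idea, assuming the fix amounts to writing the date as its epoch-millisecond long and reading it back, something like the following would round-trip a date without shifting it. The class and method names are hypothetical, and the code uses only the JDK; it mirrors the write/read shape of a serializer but is not Kryo's serializer API.

```java
import java.nio.ByteBuffer;
import java.util.Date;

// Hypothetical helper, JDK only. Not Kryo's API.
class DateAsLong {
    // Write the date as its epoch-millisecond value (8 bytes).
    static void write(ByteBuffer buffer, Date date) {
        buffer.putLong(date.getTime());
    }

    // Rebuild the date from the stored epoch-millisecond value.
    static Date read(ByteBuffer buffer) {
        return new Date(buffer.getLong());
    }

    // Serialize and immediately deserialize, returning the copy.
    static Date roundTrip(Date original) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);
        write(buffer, original);
        buffer.flip(); // switch the buffer from writing to reading
        return read(buffer);
    }
}
```

A round-trip check such as `DateAsLong.roundTrip(date).equals(date)` is a cheap way to catch the kind of date warping described above.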
But that was the only possible issue I've found so far. We have a set of JavaBeans that have String, Float, Integer, Long, Date, Boolean and List fields.
Here are some rough benchmarks. First, I did 100,000 serializations and deserializations of an object hierarchy that describes one TV program (i.e., made 100,000 deep copies of it). The speeds were:
Next, I also serialized a catalog of 2,000 TV program descriptions and counted bytes:
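The measured figures are not reproduced here. For anyone wanting to repeat this kind of comparison, a rough harness might look like the sketch below. It uses built-in Java object serialization as the baseline, and the `Program` bean and all method names are hypothetical stand-ins for the real TV program classes. Swapping in another library would mean replacing `serialize` and `deserialize`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class DeepCopyBench {
    // Hypothetical stand-in for one TV program description.
    static class Program implements Serializable {
        private static final long serialVersionUID = 1L;
        String title;
        Integer durationMinutes;
        List<String> cast = new ArrayList<>();

        Program(String title, Integer durationMinutes) {
            this.title = title;
            this.durationMinutes = durationMinutes;
        }
    }

    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Time a number of serialize/deserialize round trips (deep copies).
    static long timeDeepCopiesNanos(Object value, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            deserialize(serialize(value));
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Program program = new Program("Example Show", 30);
        program.cast.add("Example Actor");
        long nanos = timeDeepCopiesNanos(program, 100_000);
        System.out.println("100,000 deep copies: " + nanos / 1_000_000 + " ms");

        // Size test: serialize a catalog of 2,000 programs and count bytes.
        List<Program> catalog = new ArrayList<>();
        for (int i = 0; i < 2_000; i++) {
            catalog.add(new Program("Show " + i, 30));
        }
        System.out.println("Catalog bytes: " + serialize(catalog).length);
    }
}
```

A loop like this gives only rough numbers: it does no JIT warm-up and takes a single measurement. That is in keeping with the "rough benchmarks" caveat above.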
I also found that registering serializers was very important:
If I didn't do that, the serializations were almost double the size, and the speed was maybe 40% slower.
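One plausible reason for the size difference, offered as an assumption and not something confirmed above: a serializer that has not been told about a class in advance has to write something like the full class name with each object, whereas a registered class can be identified by a small integer. The sketch below uses only the JDK and hypothetical names to compare the two header styles. It is not Kryo's wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RegistrationOverhead {
    // "Unregistered" style: identify the type by its fully qualified name.
    static byte[] writeWithClassName(String className, int payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(className); // 2-byte length prefix plus the name bytes
            out.writeInt(payload);   // 4-byte payload
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // "Registered" style: identify the type by a one-byte ID agreed in advance.
    static byte[] writeWithId(int classId, int payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(classId);  // 1-byte type tag
            out.writeInt(payload);   // 4-byte payload
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String name = "com.example.tv.ProgramDescription"; // hypothetical class name
        System.out.println("by name: " + writeWithClassName(name, 42).length + " bytes");
        System.out.println("by id:   " + writeWithId(7, 42).length + " bytes");
    }
}
```

For small objects the type header can outweigh the data itself, which would be consistent with output that nearly doubles in size when nothing is registered.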
We also ran complete end-to-end tests of several web services using each of these four serialization methods, and they also showed that Kryo was running faster than the others.
So in summary, Kryo seems reasonably robust. I'm going to keep support for it in our code base and as we gain experience with it I hope to use it in more places. Kudos to the Kryo team!
Update (3/9/2011): I finally got around to @StaxMan's suggestion to try Jackson 1.6's binary "Smile" serializer. Using Jackson 1.6 and Kryo 1.04, I did 100,000 deep copies (serializations/deserializations) of a somewhat different TV program object hierarchy:
This test didn't mesh with a macro-level test, where I tried different serializers in a REST web service that delivers many of these objects. There the overall system throughput supports @StaxMan's intuition about performance:
There is a bug report and a discussion thread. The DateSerializer that comes with Kryo is slightly more efficient size-wise than the SimpleSerializer implementation posted on SO because it uses LongSerializer optimized for positive values.
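To illustrate why an encoding optimized for positive values can beat a fixed 8-byte long, here is a minimal sketch of a generic variable-length encoding (7 data bits per byte, with the high bit marking that more bytes follow). This is an assumed, textbook-style scheme with hypothetical names, not necessarily the exact format LongSerializer uses.

```java
import java.io.ByteArrayOutputStream;

class VarLong {
    // Encode a non-negative long, least significant 7-bit group first.
    static byte[] encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("only non-negative values are supported");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // set the continuation bit
            value >>>= 7;
        }
        out.write((int) value); // final group, continuation bit clear
        return out.toByteArray();
    }

    // Decode a value produced by encode().
    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        long millis = System.currentTimeMillis();
        System.out.println("fixed: " + Long.BYTES + " bytes, variable: "
                + encode(millis).length + " bytes");
    }
}
```

Present-day epoch-millisecond timestamps fit in 41 bits, so a scheme like this stores them in 6 bytes where a fixed-width long takes 8.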
Edit: I forgot to answer the original question. I believe Kryo is used in at least a few production systems. There is mention of it in this article, Jive SBS cache redesign: Part 3. In the Destroy All Humans project, Kryo is used to communicate with an Android phone that serves as a robot brain (video here).
Not a direct answer, but you might browse the Kryo source and/or javadocs. Check out the read* and write* methods on the Kryo class, then look at the Serializer class. This is really the core of the library.
Kryo is part of Yahoo's S4 (Simple Scalable Streaming System) project. S4 isn't in production yet, as far as I know.
With the help of Jim Ferrans's responses and comments above, I found a more detailed explanation of the date serialization issue with Kryo on this page: http://groups.google.com/group/kryo-users/browse_thread/thread/91969c6f48a45bdf/
and also how to use Kryo's DateSerializer:
kryo.register(Date.class, new DateSerializer());
I hope this helps others.
The latest version of Kryo shows a few race conditions in some extreme cases; I hit them while running it on a Java simulator interface to ns-3. I might ask the developer to merge some of my changes if they prove problem-free.
Apache Storm uses it for serialization before passing messages from one task to another.
So yes, it must be quite stable, since Storm is used by several huge companies, e.g., Twitter and Spotify.
Kryo 2.x is also used by Mule ESB, and so is widely used in production.
2017 update:
Kryo is used by Flink.
So practically anything that uses the Flink framework relies on Kryo.
Reference: https://ci.apache.org/projects/flink/flink-docs-release-0.8/programming_guide.html#specifying-keys
The Kryo site has a section on projects that use Kryo in production.