Fastest JavaScript object serialization with Google V8

Posted on 2024-11-11 14:59:33

I need to serialize moderately complex objects with anywhere from 1 to 100s of mixed-type properties.

JSON was used originally, then I switched to BSON which is marginally faster.

Encoding 10000 sample objects

JSON:        1807 ms
BSON:        1687 ms
MessagePack: 2644 ms (JS, modified for BinaryF)
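
For anyone wanting to reproduce this kind of measurement, here is a minimal timing sketch for Node.js. The sample object shape and the `makeSample` and `timeEncode` helpers are made up for illustration; the objects behind the figures above are not shown, so the absolute numbers will differ.

```javascript
// Hypothetical sample object; the objects used in the original benchmark are not shown.
function makeSample(i) {
  return {
    id: i,
    name: 'object-' + i,
    active: i % 2 === 0,
    score: i * 1.5,
    tags: ['a', 'b', 'c'],
    nested: { created: 1700000000000 + i, label: 'sample' },
  };
}

// Run `encode` over `count` samples and return elapsed milliseconds
// plus the total output length.
function timeEncode(encode, count) {
  const samples = Array.from({ length: count }, (_, i) => makeSample(i));
  let size = 0; // summing output sizes keeps the encoded results in use
  const start = process.hrtime.bigint();
  for (const sample of samples) {
    size += encode(sample).length;
  }
  const elapsedNs = process.hrtime.bigint() - start;
  return { ms: Number(elapsedNs) / 1e6, size };
}

const result = timeEncode((obj) => JSON.stringify(obj), 10000);
console.log(`JSON: ${result.ms.toFixed(1)} ms, ${result.size} chars`);
```

Any other encoder that returns a string or Buffer can be passed in place of `JSON.stringify` to compare on the same samples.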

I want an order-of-magnitude speedup; serialization is having a ridiculously bad impact on the rest of the system.

Part of the motivation for moving to BSON is the requirement to encode binary data, so JSON is (now) unsuitable. And because JSON simply skips the binary data present in the objects, it is "cheating" in those benchmarks.

Profiled BSON performance hot-spots

  • (unavoidable?) conversion of UTF16 V8 JS strings to UTF8.
  • malloc and string ops inside the BSON library
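
To make the first hot-spot concrete, here is a small Node.js sketch of the size side of the trade-off. It uses Node's `Buffer`, not the V8 C++ API the encoder would actually call, so it only illustrates the two encodings: UTF-8 needs a transcoding pass, while UTF-16 copies code units as they are but takes two bytes per code unit.

```javascript
// 11 UTF-16 code units, two of them outside ASCII (e-acute and o-umlaut).
const text = 'h\u00e9llo w\u00f6rld';

const utf8 = Buffer.from(text, 'utf8');      // transcoded: 1 byte per ASCII char, 2 for each accented char
const utf16 = Buffer.from(text, 'utf16le');  // copied as-is: 2 bytes per UTF-16 code unit

console.log(text.length, utf8.length, utf16.length);
```

For mostly-ASCII data the UTF-16 output is close to twice the size, so skipping the conversion trades CPU time for bytes.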

The BSON encoder is based on the Mongo BSON library.

A native V8 binary serializer might be wonderful, yet as JSON is native and quick to serialize, I fear even that might not provide the answer. Perhaps my best bet is to optimize the heck out of the BSON library, or write my own, and figure out a far more efficient way to pull strings out of V8. One tactic might be to add UTF16 support to BSON.
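
On the idea of a native V8 binary serializer: current Node.js versions expose one through the built-in `v8` module (`v8.serialize` and `v8.deserialize`). The sketch below assumes a Node.js runtime and only shows a round trip that preserves binary data; it makes no claim about speed relative to JSON or BSON, and the output is a V8-specific format rather than a cross-language one.

```javascript
const v8 = require('v8');

const original = {
  id: 42,
  name: 'sample',
  payload: new Uint8Array([1, 2, 3, 255]), // binary data that JSON.stringify would not preserve
};

const encoded = v8.serialize(original);  // returns a Buffer
const decoded = v8.deserialize(encoded); // rebuilds the object, typed array included

console.log(encoded.length, decoded.payload);
```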

So I'm here for ideas, and perhaps a sanity check.

Edit

Added MessagePack benchmark. This was modified from the original JS to use BinaryF.

The C++ MessagePack library may offer further improvements; I may benchmark it in isolation to compare directly with the BSON library.

Comments (4)

小姐丶请自重 2024-11-18 14:59:33

I recently (2020) published an article and benchmark comparing binary serialization libraries in JavaScript.

The following formats and libraries are compared:

  • Protocol Buffer: protobuf-js, pbf, protons, google-protobuf
  • Avro: avsc
  • BSON: bson
  • BSER: bser
  • JSBinary: js-binary

Based on the current benchmark results I would rank the top libraries in the following order (higher values are better, measurements are given as x times faster than JSON):

  1. avsc: 10x encoding, 3-10x decoding
  2. js-binary: 2x encoding, 2-8x decoding
  3. protobuf-js: 0.5-1x encoding, 2-6x decoding
  4. pbf: 1.2x encoding, 1.0x decoding
  5. bser: 0.5x encoding, 0.5x decoding
  6. bson: 0.5x encoding, 0.7x decoding
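
One general reason schema-based formats can beat JSON on encoding is that writer and reader agree on the field layout up front, so no property names are written. The sketch below uses a made-up fixed layout to show the idea; `encodeRecord`, `decodeRecord`, and the schema are hypothetical and this is not the Avro or js-binary wire format.

```javascript
// Hypothetical fixed schema: { id: uint32, score: float64, name: string }.
// The byte layout is invented for illustration only.
function encodeRecord(record) {
  const nameBytes = Buffer.from(record.name, 'utf8');
  const out = Buffer.allocUnsafe(4 + 8 + 2 + nameBytes.length);
  let offset = 0;
  offset = out.writeUInt32LE(record.id, offset);        // 4 bytes
  offset = out.writeDoubleLE(record.score, offset);     // 8 bytes
  offset = out.writeUInt16LE(nameBytes.length, offset); // 2-byte length prefix
  nameBytes.copy(out, offset);                          // UTF-8 string bytes
  return out;
}

function decodeRecord(buf) {
  const id = buf.readUInt32LE(0);
  const score = buf.readDoubleLE(4);
  const nameLength = buf.readUInt16LE(12);
  const name = buf.toString('utf8', 14, 14 + nameLength);
  return { id, score, name };
}

const record = { id: 7, score: 0.25, name: 'avsc' };
const packed = encodeRecord(record);
console.log(packed.length, JSON.stringify(record).length);
```

The price is that both sides must share the schema, which is the same trade-off the schema-based libraries in the list make.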

I did not include msgpack in the benchmark because, according to its NPM description, it is currently slower than the built-in JSON library.

For details, see the full article.

著墨染雨君画夕 2024-11-18 14:59:33

For serialization / deserialization, protobuf is pretty tough to beat. I don't know if you can switch out the transport protocol, but if you can, protobuf should definitely be considered.

Take a look at all the answers to Protocol Buffers versus JSON or BSON.

The accepted answer chooses Thrift. It is, however, slower than protobuf. I suspect it was chosen for ease of use (with Java), not speed. These Java benchmarks are very telling. Of note:

  • MongoDB-BSON 45042
  • protobuf 6539
  • protostuff/protobuf 3318

The benchmarks are Java, but I'd imagine that you can achieve speeds near the protostuff implementation of protobuf, i.e. 13.5 times faster. Worst case (if for some reason Java is just better for serialization), you can do no worse than the plain unoptimized protobuf implementation, which runs 6.8 times faster.
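
The speedup figures follow from dividing the BSON number by each protobuf number, assuming the three values are timings where lower is better (the units are not given above). A quick check:

```javascript
// Figures quoted above from the Java benchmarks; assumed to be timings (lower is better).
const timings = {
  'MongoDB-BSON': 45042,
  protobuf: 6539,
  'protostuff/protobuf': 3318,
};

// How many times faster than MongoDB-BSON the named implementation is.
function speedupOverBson(name) {
  return timings['MongoDB-BSON'] / timings[name];
}

console.log(speedupOverBson('protostuff/protobuf').toFixed(3));
console.log(speedupOverBson('protobuf').toFixed(3));
```

The ratios come out a little under 13.6 and 6.9, which matches the 13.5 and 6.8 quoted when truncated to one decimal place.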

走野 2024-11-18 14:59:33

Take a look at MessagePack. It's compatible with JSON. From the docs:

Fast and Compact Serialization

MessagePack is a binary-based efficient object serialization library. It enables to exchange structured objects between many languages like JSON. But unlike JSON, it is very fast and small.

Typical small integer (like flags or error code) is saved only in 1 byte, and typical short string only needs 1 byte except the length of the string itself. [1,2,3] (3 elements array) is serialized in 4 bytes using MessagePack as follows:
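
The byte counts in the quote can be checked with a tiny hand-written encoder. This is a sketch covering only three MessagePack types (positive fixint, fixstr, fixarray); `encodeTiny` is a made-up helper, not part of any MessagePack library.

```javascript
// Minimal encoder for a small subset of MessagePack:
// positive fixint (0..127), fixstr (up to 31 UTF-8 bytes), fixarray (up to 15 items).
function encodeTiny(value) {
  if (Number.isInteger(value) && value >= 0 && value <= 0x7f) {
    return Buffer.from([value]); // positive fixint: the single byte is the value itself
  }
  if (typeof value === 'string') {
    const bytes = Buffer.from(value, 'utf8');
    if (bytes.length > 31) throw new RangeError('string too long for fixstr');
    return Buffer.concat([Buffer.from([0xa0 | bytes.length]), bytes]); // 101xxxxx header + data
  }
  if (Array.isArray(value)) {
    if (value.length > 15) throw new RangeError('array too long for fixarray');
    const header = Buffer.from([0x90 | value.length]); // 1001xxxx header
    return Buffer.concat([header, ...value.map(encodeTiny)]);
  }
  throw new TypeError('type not covered by this sketch');
}

console.log(encodeTiny([1, 2, 3]).toString('hex')); // 93010203
console.log(encodeTiny('hi').toString('hex'));      // a26869
```

The array costs one header byte plus one byte per small integer, which is where the 4 bytes for [1,2,3] come from.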

抽个烟儿 2024-11-18 14:59:33

If you are more interested in deserialization speed, take a look at the JBB (Javascript Binary Bundles) library. It is faster than BSON or MsgPack.

From the Wiki, page JBB vs BSON vs MsgPack:

...

  • JBB is about 70% faster than Binary-JSON (BSON) and about 30% faster than MsgPack on decoding speed, even with one negative test-case (#3).
  • JBB creates files that (even their compressed versions) are about 61% smaller than Binary-JSON (BSON) and about 55% smaller than MsgPack.

...

Unfortunately, it's not a streaming format, meaning that you must pre-process your data offline. However, there is a plan to convert it into a streaming format (check the milestones).
