How to speed up MongoDB deserialization in C#

Posted on 2024-12-29 13:11:57

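When a query returns many results, the code takes a long time to convert the data into .NET objects. These are basic objects with a few strings as fields. I'm not sure, but I think it uses reflection to create the instances, which is slow. Is there a way to speed this up?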

蓝咒 2025-01-05 13:11:57

The 10gen driver doesn't use reflection on a per-object basis. It uses reflection once per type to generate a serializer with Reflection.Emit, so serializing or deserializing the first object of a type may be slow, but every object after that is (relatively) fast.
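
To make the "once per type" idea concrete, here is a minimal sketch in plain .NET. It is not the driver's code: it swaps Reflection.Emit for System.Linq.Expressions, and the `Person` class, `Materializer<T>`, and the string dictionary standing in for a document are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical document type: a basic object with a few string properties.
public sealed class Person
{
    public string Name { get; set; }
    public string City { get; set; }
}

public static class Materializer<T> where T : new()
{
    // Counts how often the reflective build ran (demonstration only).
    public static int BuildCount;

    // The static initializer runs once per closed type T; the delegate is then reused.
    public static readonly Func<IReadOnlyDictionary<string, string>, T> Create = Build();

    private static Func<IReadOnlyDictionary<string, string>, T> Build()
    {
        BuildCount++;
        var source = Expression.Parameter(typeof(IReadOnlyDictionary<string, string>), "source");
        var target = Expression.Variable(typeof(T), "target");
        var body = new List<Expression> { Expression.Assign(target, Expression.New(typeof(T))) };
        PropertyInfo indexer = typeof(IReadOnlyDictionary<string, string>).GetProperty("Item");

        // Reflection is used here only: once per type, not once per object.
        foreach (PropertyInfo property in typeof(T).GetProperties())
        {
            bool isPlainString = property.PropertyType == typeof(string)
                && property.CanWrite
                && property.GetIndexParameters().Length == 0;
            if (!isPlainString) continue;

            // target.Property = source["Property"]; throws if the key is missing.
            var value = Expression.Property(source, indexer, Expression.Constant(property.Name));
            body.Add(Expression.Assign(Expression.Property(target, property), value));
        }

        body.Add(target); // the block evaluates to the populated object
        var block = Expression.Block(new[] { target }, body);
        return Expression.Lambda<Func<IReadOnlyDictionary<string, string>, T>>(block, source).Compile();
    }
}
```

Calling `Materializer<Person>.Create(row)` for every row pays the reflection and compilation cost on the first call only; later calls run the compiled delegate directly.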

As for your question: is there any way to speed this up?

If your objects are simple (no nested documents, a few public fields, etc.), there probably isn't much you can do. You could implement a custom serializer for the class to eke out a little performance, but I doubt the gain would be more than a few percent.
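
For reference, a hand-written serializer might look roughly like the sketch below. It assumes the current `MongoDB.Bson` NuGet package (2.x or later), whose serializer interface differs from the driver version this answer was written against. The `Person` class and its field names are hypothetical, and the code assumes the string fields are never null.

```csharp
using MongoDB.Bson;
using MongoDB.Bson.IO;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Serializers;

// Hypothetical document type.
public sealed class Person
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }
    public string City { get; set; }
}

public sealed class PersonSerializer : SerializerBase<Person>
{
    public override Person Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
    {
        IBsonReader reader = context.Reader;
        var person = new Person();

        reader.ReadStartDocument();
        while (reader.ReadBsonType() != BsonType.EndOfDocument)
        {
            // Read each element by name; skip anything this class does not know about.
            switch (reader.ReadName())
            {
                case "_id": person.Id = reader.ReadObjectId(); break;
                case "Name": person.Name = reader.ReadString(); break;
                case "City": person.City = reader.ReadString(); break;
                default: reader.SkipValue(); break;
            }
        }
        reader.ReadEndDocument();
        return person;
    }

    public override void Serialize(BsonSerializationContext context, BsonSerializationArgs args, Person value)
    {
        IBsonWriter writer = context.Writer;
        writer.WriteStartDocument();
        writer.WriteName("_id");
        writer.WriteObjectId(value.Id);
        writer.WriteName("Name");
        writer.WriteString(value.Name);
        writer.WriteName("City");
        writer.WriteString(value.City);
        writer.WriteEndDocument();
    }
}
```

The serializer would be registered once at startup, before the type is first used, with `BsonSerializer.RegisterSerializer(new PersonSerializer());`. Measure before and after; as noted above, the gain for flat objects is likely to be small.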

There may be some performance to be gained on multicore or multiprocessor systems by parallelizing deserialization in the driver. I haven't looked at the driver code from that perspective yet. Robert Stam (who answered this question as well) would be the authority on it, and it may be something he has already pursued.
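
Outside the driver, the same idea can be sketched with PLINQ over data that has already been fetched in some raw form. Everything here is an assumption for illustration: the `"name|city"` string format stands in for a raw document, and `Parse` stands in for the per-document conversion work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document type.
public sealed class Person
{
    public string Name { get; set; }
    public string City { get; set; }
}

public static class ParallelMaterializer
{
    // Stand-in for per-document deserialization (hypothetical "name|city" format).
    public static Person Parse(string raw)
    {
        string[] parts = raw.Split('|');
        return new Person { Name = parts[0], City = parts[1] };
    }

    // Converts all raw documents on multiple cores; AsOrdered keeps the input order.
    public static List<Person> ParseAll(IEnumerable<string> rawDocuments)
    {
        return rawDocuments
            .AsParallel()
            .AsOrdered()
            .Select(raw => Parse(raw))
            .ToList();
    }
}
```

For objects as small as a few strings, the cost of partitioning and merging can outweigh the gain, so this is only worth trying when the per-object conversion is measurably expensive.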

On a general note, I think 30,000 objects in 10 seconds (about 3,000 objects per second) is pretty standard for just about any platform (SQL, Mongo, XML, etc.) that isn't storing objects directly as memory blobs, as you could in a language like C++.

EDIT:

It looks like the 10gen driver performs deserialization before it returns a cursor for you to enumerate. So if your query returns 30,000 results, all 30,000 objects have to be deserialized before the driver makes a cursor available for enumeration. I haven't looked at the jmongo driver, but I expect that it does the opposite, and defers deserialization until after an object is enumerated in the cursor.

The net result is that while both probably take the same total time to enumerate and deserialize 30,000 objects, deserialization in the jmongo driver is spread across the entire enumeration, whereas in the C# driver it is front-loaded.

The difference is subtle, but likely to explain what you are seeing.
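
The two behaviors can be modeled in a few lines of plain C#. This is a sketch of the pattern only, not code from either driver; `CursorSketch`, its counter, and the integer "raw documents" are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class CursorSketch
{
    // Counts how many conversions have run so far (demonstration only).
    public static int Deserialized;

    // Stand-in for converting one raw document into an object.
    private static string Deserialize(int raw)
    {
        Deserialized++;
        return "doc-" + raw;
    }

    // Front-loaded: every document is converted before the caller sees the first one.
    public static IEnumerable<string> Eager(IEnumerable<int> rawDocuments)
    {
        var results = new List<string>();
        foreach (int raw in rawDocuments) results.Add(Deserialize(raw));
        return results;
    }

    // Deferred: each document is converted only when the caller advances to it.
    public static IEnumerable<string> Deferred(IEnumerable<int> rawDocuments)
    {
        foreach (int raw in rawDocuments) yield return Deserialize(raw);
    }
}
```

With 30,000 inputs, `Eager` has done all 30,000 conversions by the time it returns, while `Deferred` has done none until the first `foreach` step. The total work is the same; only when it happens differs.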

The bad news is that the "fix" is a driver change. One thing you could do is break your query up into chunks, querying for 10 or 100 objects at a time.
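
A minimal sketch of that chunking workaround follows. The `fetchPage(skip, limit)` callback is hypothetical: it stands in for whatever query call you use with your driver's skip and limit options, and should return at most `limit` results starting at offset `skip`, in a stable sort order.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkedQuery
{
    // Validates eagerly, then hands off to the lazy iterator below.
    public static IEnumerable<T> ReadInChunks<T>(Func<int, int, IReadOnlyList<T>> fetchPage, int chunkSize)
    {
        if (fetchPage == null) throw new ArgumentNullException(nameof(fetchPage));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        return Iterate(fetchPage, chunkSize);
    }

    private static IEnumerable<T> Iterate<T>(Func<int, int, IReadOnlyList<T>> fetchPage, int chunkSize)
    {
        int skip = 0;
        while (true)
        {
            // Only one chunk is fetched (and deserialized) at a time.
            IReadOnlyList<T> page = fetchPage(skip, chunkSize);
            foreach (T item in page) yield return item;

            // A short page means there is nothing left to fetch.
            if (page.Count < chunkSize) yield break;
            skip += page.Count;
        }
    }
}
```

The first results become available after one small query instead of after the whole result set. The trade-off is more round trips, and large skip offsets tend to get slower on the server, so paging on an indexed field may be a better fit for very large result sets.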
