How do I bulk-upload relational data into RavenDB and convert it into aggregates?

Posted 2024-11-09 13:25:03

I'm trying to get my head around how to do efficient bulk inserts of relational data into RavenDB, particularly when converting from relational data to aggregates.

Let's say we have two dump files of two tables: Orders and OrderItems. They're too big to load into memory, so I read them as streams. I can read through each table and create a document in RavenDB corresponding to each row. I can do this as bulk operations using batched requests. Easy and efficient so far.

Then I want to transform this on the server, getting rid of the OrderItems and integrating them into their parent Order documents. How can I do this without thousands of round trips?

The answer seems to lie somewhere between set-based updates, live projections and denormalized updates, but I don't know where.

Comments (1)

紫竹語嫣☆ 2024-11-16 13:25:03

You're going to need to do this with denormalised updates and set-based updates. Take a look at the PATCH API to see what it offers. You only need set-based updates if you plan on updating several docs at once, though; to change a single known doc, you can patch it directly using the PATCH API.

Live projections will only help when you are reading the results of a query/index. They don't change the docs themselves, only what is returned from the server to the client.

However, I'd recommend that, if possible, you combine an Order and the corresponding OrderItems in memory before you send them to RavenDB. You can still stream the data from the dump files; just use some caching if needed. This will be the simplest option.
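As a rough sketch of that in-memory approach, the Python below merges two streamed dumps into aggregate documents without loading either file fully. It assumes both dumps are CSV and already sorted by order id; the column names (`Id`, `Customer`, `OrderId`, `Product`, `Quantity`), the `orders/<id>` key format, and the helper names `aggregate_orders` and `batches` are all hypothetical. Sending each batch to RavenDB is left out.

```python
import csv
import io
from itertools import groupby
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, str]
Doc = Dict[str, Any]


def aggregate_orders(orders: Iterable[Row], items: Iterable[Row]) -> Iterator[Doc]:
    """Merge-join two streams sorted by order id into Order aggregates.

    Only the items of one order are held in memory at a time.
    """
    # Materialise each group as it is produced; groupby invalidates a
    # group's iterator as soon as the next group is requested.
    groups = (
        (order_id, list(rows))
        for order_id, rows in groupby(items, key=lambda r: int(r["OrderId"]))
    )
    pending = next(groups, None)
    for order in orders:
        order_id = int(order["Id"])
        # Skip item groups whose parent order is missing from the dump.
        while pending is not None and pending[0] < order_id:
            pending = next(groups, None)
        doc: Doc = {"Id": f"orders/{order_id}", "Customer": order["Customer"], "Items": []}
        if pending is not None and pending[0] == order_id:
            doc["Items"] = [
                {"Product": r["Product"], "Quantity": int(r["Quantity"])}
                for r in pending[1]
            ]
            pending = next(groups, None)
        yield doc


def batches(docs: Iterable[Doc], size: int) -> Iterator[List[Doc]]:
    """Group documents into lists of at most `size` for batched requests."""
    batch: List[Doc] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


# Tiny in-memory stand-ins for the two dump files (hypothetical data).
ORDERS_CSV = "Id,Customer\n1,alice\n2,bob\n3,carol\n"
ITEMS_CSV = "OrderId,Product,Quantity\n1,apple,2\n1,pear,1\n3,plum,5\n"

docs = list(
    aggregate_orders(
        csv.DictReader(io.StringIO(ORDERS_CSV)),
        csv.DictReader(io.StringIO(ITEMS_CSV)),
    )
)
doc_batches = list(batches(docs, 2))
```

If the dumps are not sorted by order id, this merge-join does not apply directly; that is where the caching mentioned above (for example, a bounded lookup of pending items keyed by order id) would come in.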

Updated
I've made some sample code that shows how to do this. This patches the Comments array/list within a particular Post doc, in this case "Posts/1".
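To illustrate the idea of patching an array inside a known document, here is a small local model in Python. It is not RavenDB client code and not the sample referred to above: the `apply_patch` helper and the `Type`/`Name`/`Value` field names are assumptions chosen for illustration. The point it shows is that the client sends only a small "add this value to that array" instruction rather than the whole document.

```python
from copy import deepcopy
from typing import Any, Dict, List

Doc = Dict[str, Any]


def apply_patch(doc: Doc, ops: List[Dict[str, Any]]) -> Doc:
    """Apply a list of patch operations to a copy of `doc` (local model only)."""
    patched = deepcopy(doc)
    for op in ops:
        if op["Type"] == "Add":
            # Append the value to the named array, creating it if missing.
            patched.setdefault(op["Name"], []).append(op["Value"])
        elif op["Type"] == "Set":
            # Overwrite the named property.
            patched[op["Name"]] = op["Value"]
        else:
            raise ValueError(f"unsupported patch type: {op['Type']}")
    return patched


# Hypothetical document and patch; only the patch would travel to the server.
post = {"Id": "Posts/1", "Title": "Hello", "Comments": []}
patch = [
    {"Type": "Add", "Name": "Comments", "Value": {"Author": "sam", "Text": "Nice post"}}
]
patched_post = apply_patch(post, patch)
```

In the Orders scenario from the question, the same shape would apply with one "add to `Items`" operation per OrderItem, addressed to the parent Order's document key.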
