What's the most efficient way to iterate through an entire table using Datamapper?

Posted 2024-11-07 22:44:42

What's the most efficient way to iterate through an entire table using Datamapper?

If I do this, does Datamapper try to pull the entire result set into memory before performing the iteration? Assume, for the sake of argument, that I have millions of records and that this is infeasible:

Author.all.each do |a|
  puts a.title
end

Is there a way that I can tell Datamapper to load the results in chunks? Is it smart enough to know to do this automatically?

Comments (3)

ま昔日黯然 2024-11-14 22:44:42

Thanks, Nicolas, I actually came up with a similar solution. I've accepted your answer since it makes use of Datamapper's dm-pagination system, but I'm wondering if this would do equally well (or worse):

while authors = Author.slice(offset, CHUNK) do
  authors.each do |a|
    # do something with a
  end
  offset += CHUNK
end
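
One thing to watch in a loop of this shape is the stopping condition: a `while` loop only ends once the fetch returns nil or false, so an empty but truthy result would keep it running. Below is a minimal, library-free sketch of the same offset/limit pattern with an explicit empty-chunk check. `fetch_chunk`, `each_in_chunks`, and the `ROWS` array are hypothetical stand-ins for the real query and table, not DataMapper calls.

```ruby
# Stand-in for a table; in real code the rows would come from the database.
ROWS = (1..10).map { |i| "author #{i}" }.freeze

# Hypothetical fetcher: returns at most `limit` rows starting at `offset`,
# or an empty array once the offset runs past the end.
def fetch_chunk(offset, limit)
  ROWS[offset, limit] || []
end

# Walks the whole data set one chunk at a time and yields each row.
# Stops explicitly when a chunk comes back empty.
def each_in_chunks(chunk_size)
  offset = 0
  loop do
    chunk = fetch_chunk(offset, chunk_size)
    break if chunk.empty?
    chunk.each { |row| yield row }
    offset += chunk_size
  end
end

seen = []
each_in_chunks(4) { |row| seen << row }
puts seen.length
```

Only one chunk is referenced at a time, so peak memory is bounded by the chunk size rather than the table size, at the cost of one fetch per chunk plus a final empty one.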
梦罢 2024-11-14 22:44:42

Datamapper will run just one SQL query for the example above, so it will have to keep the whole result set in memory.

I think you should use some sort of pagination if your collection is big.
Using dm-pagination you could do something like:

PAGE_SIZE = 20
pager = Author.page(:per_page => PAGE_SIZE).pager # This will run a count query
(1..pager.total_pages).each do |page_number|
  Author.page(:per_page => PAGE_SIZE, :page => page_number).each do |a|
    puts a.title
  end
end

You can play around with different values for PAGE_SIZE to find a good trade-off between the number of SQL queries and memory usage.
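
To make that trade-off concrete: with N records and a page size of P, the loop needs ceil(N / P) page queries plus the one count query, assuming each page costs exactly one query, and each page query loads at most P records. Here is a small sketch of that arithmetic. The one-million record count is an assumed figure for illustration, and `queries_for` is a hypothetical helper, not part of dm-pagination.

```ruby
# Hypothetical helper: number of queries needed to page through
# `total` records with `page_size` records per page.
# One count query plus one query per page (integer ceiling division).
def queries_for(total, page_size)
  pages = (total + page_size - 1) / page_size
  1 + pages
end

TOTAL = 1_000_000 # assumed table size, for illustration only

[20, 1_000, 50_000].each do |page_size|
  puts "page size #{page_size}: #{queries_for(TOTAL, page_size)} queries"
end
```

Larger pages mean fewer round trips but more records held in memory per page, so the right value depends on how heavy each record is.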

愁以何悠 2024-11-14 22:44:42

What you want is the dm-chunked_query plugin: (example from the docs)

require 'dm-chunked_query'

MyModel.each_chunk(20) do |chunk|
  chunk.each do |resource|
    # ...
  end
end

This will allow you to iterate over all the records in the model, in chunks of 20 records at a time.

EDIT: the example above had an extra #each after #each_chunk, and it was unnecessary. The gem author updated the README example, and I changed the above code to match.
