Long-running-to-death migration / find_each

Posted on 2024-11-04 14:09:29 · 3019 characters · 3 views · 0 comments

I'm running Rails 3 with PostgreSQL.

I have a migration that updates millions of small records:

Record.find_each do |r|
  r.doing_incredibly_complex_stuff
  r.save!
  puts "#{r.id} updated"
end

Since I think ActiveRecord wraps such updates in a transaction, the "commit" time is very long and the memory used is huge, even though every record has already been printed on screen in the example above.

So, could I run this find_each outside a transaction (it would be quite safe), saving a lot of "commit" time and memory?

Something like ActiveRecord::Base.without_transaction do ... ; end, I guess :-)

Or am I wrong: migrations are not wrapped in transactions, and the time I see is just the SQL UPDATE statements being applied?

EDIT: It seems there is no link with transactions. Here's the stack trace I got when I interrupted the migration, after everything had been printed on screen and free RAM had dropped from 500MB to ~30MB:

IRB::Abort: abort then interrupt!!
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activesupport-3.0.4/lib/active_support/whiny_nil.rb:46:in `call'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activesupport-3.0.4/lib/active_support/whiny_nil.rb:46:in `method_missing'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:978:in `flatten'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:978:in `block in select'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:977:in `map'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:977:in `select'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/abstract/database_statements.rb:7:in `select_all'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/abstract/query_cache.rb:56:in `select_all'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/base.rb:467:in `find_by_sql'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/relation.rb:64:in `to_a'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/relation.rb:356:in `inspect'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/railties-3.0.4/lib/rails/commands/console.rb:44:in `start'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/railties-3.0.4/lib/rails/commands/console.rb:8:in `start'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/railties-3.0.4/lib/rails/commands.rb:23:in `<top (required)>'
from script/rails:6:in `require'
from script/rails:6:in `<main>'

EDIT(2): Wow. It turned out it took so long because find_each returns all elements after it has iterated.

I tried:

Record.tap do |record_class|
  record_class.find_each do |r|
    r.doing_incredibly_complex_stuff
    r.save!
    puts "#{r.id} updated"
  end
end
=> Record(id: integer, ...)

So it gave back the console instantly as expected. :)

But I still see strange behavior: the RAM isn't freed. Instead, even once I exited the terminal, free RAM still plunges...

Maybe my solution with tap is not satisfactory? Is it still doing a mass select? How would I avoid the mass select after the find_each?

Thanks!

Comments (2)

只有一腔孤勇 2024-11-11 14:09:29

Neither ActiveRecord::Migration nor find_each does anything to wrap your code in a database transaction. Each r.save! will be wrapped in its own transaction, which covers any cascading effects of the save.

As in the comments above, using update_all or a raw execute will be faster for mass updating. I have no way to tell if that would be appropriate for what you're doing. Also, if you're having memory issues, you should be able to tweak the batch size on find_each and see if it has an effect. If not, you may be holding onto those objects somewhere.
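To make the batch-size suggestion concrete, here is a minimal pure-Ruby sketch of what batching buys you. FakeTable and max_held are hypothetical names invented for this illustration and are not ActiveRecord; the stand-in only shows that the number of rows handled at one time is bounded by the batch size. With ActiveRecord in Rails 3, the corresponding call would be Record.find_each(:batch_size => 500) (the default batch size is 1000).

```ruby
# Hypothetical in-memory stand-in for a table; this is NOT ActiveRecord.
class FakeTable
  attr_reader :max_held

  def initialize(rows)
    @rows = rows
    @max_held = 0 # largest number of rows handled in a single batch
  end

  # Yields rows one at a time, but works through them batch_size at a time.
  def find_each(batch_size = 1000)
    @rows.each_slice(batch_size) do |batch|
      @max_held = [@max_held, batch.size].max
      batch.each { |row| yield row }
    end
    nil # return nil rather than the collection
  end
end

table = FakeTable.new((1..2500).to_a)
processed = 0
table.find_each(500) { |_row| processed += 1 }
```

If memory still grows with a smaller batch size, that points to something else keeping references to the records, as noted above.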

凹づ凸ル 2024-11-11 14:09:29

Perhaps you could structure the method to add a last statement to serve as a return value, rather than returning the return value of find_each. You could replace the last "end" with

end ; nil
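
Why the trailing nil helps can be sketched in plain Ruby. The stack trace in the question suggests the console called inspect on a relation after the loop, and that inspect went through to_a and loaded every record. FakeRelation below is a hypothetical stand-in written for this illustration, not Rails code; it assumes find_each hands back the object it was called on, and it counts how often inspect is invoked.

```ruby
# Hypothetical stand-in for a relation; this is NOT Rails code.
class FakeRelation
  attr_reader :loads

  def initialize(ids)
    @ids = ids
    @loads = 0 # how many times everything was "loaded" for printing
  end

  # Assumption for this sketch: iterate, then return self.
  def find_each
    @ids.each { |id| yield id }
    self
  end

  # A console prints the last expression by calling inspect on it.
  # Here inspect materializes every element, like inspect -> to_a in the trace.
  def inspect
    @loads += 1
    @ids.to_a.inspect
  end
end

relation = FakeRelation.new(1..5)

# Without the trailing nil, the last expression is the relation itself.
returned = relation.find_each { |id| id }

# With the trailing nil, the last expression is nil, so there is nothing big to print.
result = (relation.find_each { |id| id }; nil)
```

In a console session, only the first form would trigger the expensive inspect; the second just prints nil.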