Long-running-to-death migration / find_each
Running Rails 3 with PostgreSQL,
I have a migration that updates millions of small records.
Record.find_each do |r|
  r.doing_incredibly_complex_stuff
  r.save!
  puts "#{r.id} updated"
end
Since I think ActiveRecord wraps such updates in a transaction, the "commit" time is very long and the memory taken is HUGE, even though every record has already been "printed" on screen in the above example.
So, could I run this find_each outside a transaction (which would be quite safe here), saving a lot of "commit" time and memory?
A kind of ActiveRecord::Base.without_transaction do ... ; end I guess :-)
OR:
I'm wrong, migrations are not wrapped in transactions, and the time I see is just the SQL update statements being applied?
EDIT: It seems there is no link with transactions. Here's the stack trace I got when I interrupted the migration, after everything had been printed on screen and free RAM had dropped from 500MB to ~30MB:
IRB::Abort: abort then interrupt!!
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activesupport-3.0.4/lib/active_support/whiny_nil.rb:46:in `call'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activesupport-3.0.4/lib/active_support/whiny_nil.rb:46:in `method_missing'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:978:in `flatten'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:978:in `block in select'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:977:in `map'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:977:in `select'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/abstract/database_statements.rb:7:in `select_all'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/connection_adapters/abstract/query_cache.rb:56:in `select_all'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/base.rb:467:in `find_by_sql'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/relation.rb:64:in `to_a'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/activerecord-3.0.4/lib/active_record/relation.rb:356:in `inspect'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/railties-3.0.4/lib/rails/commands/console.rb:44:in `start'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/railties-3.0.4/lib/rails/commands/console.rb:8:in `start'
from /Users/clement/.rvm/gems/ruby-1.9.2-p136@gemset/gems/railties-3.0.4/lib/rails/commands.rb:23:in `<top (required)>'
from script/rails:6:in `require'
from script/rails:6:in `<main>'
EDIT(2): Wow. It turned out it took so long because find_each returns all the elements after it has iterated.
I tried:
Record.tap do |record_class|
  record_class.find_each do |r|
    r.doing_incredibly_complex_stuff
    r.save!
    puts "#{r.id} updated"
  end
end
=> Record(id: integer, ...)
So it gave back the console instantly, as expected. :)
But I still see strange behavior: the RAM isn't freed. Instead, once I exited the terminal, RAM still plunges...
Maybe my solution with tap is not satisfactory? Is it still mass selecting? How would I avoid the mass select after the find_each?
Thanks!
Comments (2)
Neither ActiveRecord::Migration nor find_each does anything to wrap your code in a database transaction. Each r.save! will be wrapped in an individual transaction that covers any cascading effects of the save.

As in the comments above, using update_all or a raw execute will be faster for mass updating. I have no way to tell if that would be appropriate for what you're doing. Also, if you're having memory issues, you should be able to tweak the batch size on find_each and see if it has an effect. If not, you may be holding onto those objects somewhere.

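To make the batch-size trade-off concrete, here is a minimal sketch in plain Ruby. FakeTable is a hypothetical stand-in, not an ActiveRecord class; it only imitates the idea of fetching rows in primary-key order one batch at a time and counts how many fetches that takes. In Rails itself the option is passed as Record.find_each(:batch_size => 500), with a default of 1000.

```ruby
# FakeTable is a hypothetical stand-in for a model; it needs no Rails or database.
class FakeTable
  attr_reader :queries

  def initialize(row_count)
    @row_count = row_count
    @queries = 0
  end

  # Imitates the idea behind find_each: walk the ids in order,
  # one batch per "query", yielding rows one at a time.
  def find_each(options = {})
    batch_size = options.fetch(:batch_size, 1000)
    last_id = 0
    while last_id < @row_count
      @queries += 1
      first_id = last_id + 1
      last_id = [last_id + batch_size, @row_count].min
      (first_id..last_id).each { |id| yield({ :id => id }) }
    end
    nil
  end
end

big_batches = FakeTable.new(2500)
seen = 0
big_batches.find_each(:batch_size => 1000) { |row| seen += 1 }

small_batches = FakeTable.new(2500)
small_batches.find_each(:batch_size => 100) { |row| row }

puts "rows seen: #{seen}"
puts "queries with 1000 per batch: #{big_batches.queries}"
puts "queries with 100 per batch: #{small_batches.queries}"
```

A smaller batch means fewer rows alive at once but more round trips. If memory still grows with a small batch, something outside the loop is probably keeping references to the rows.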
Perhaps you could structure the method to add a last statement to serve as a return value, rather than returning the return value of find_each. You could replace the last "end" with
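
The snippet that followed is not shown here. A minimal sketch of the idea, assuming the intended change is to end the expression with nil so that the console has nothing large to print. The stack trace in the question shows the console calling inspect, and then to_a, on a relation, and that is where the mass select comes from. FakeRelation and console_echo below are hypothetical stand-ins, not Rails APIs.

```ruby
# FakeRelation is a hypothetical stand-in for a lazy query result.
class FakeRelation
  attr_reader :loads

  def initialize
    @loads = 0
  end

  # Yields a few rows, then returns self, as the relation in the question appears to.
  def find_each
    (1..3).each { |id| yield id }
    self
  end

  # Inspecting the relation stands for "load every row in order to print it".
  def inspect
    @loads += 1
    "#<FakeRelation fully loaded>"
  end
end

# What a console does with the value of the last expression.
def console_echo(value)
  "=> #{value.inspect}"
end

relation = FakeRelation.new

# Returning the relation: the echo triggers a full load.
result = relation.find_each { |id| id }
console_echo(result)
loads_without_nil = relation.loads

# Returning nil instead: the echo only has nil to print.
result = begin
  relation.find_each { |id| id }
  nil
end
echoed = console_echo(result)
loads_with_nil = relation.loads - loads_without_nil

puts "full loads when returning the relation: #{loads_without_nil}"
puts "full loads when returning nil: #{loads_with_nil}"
puts echoed
```

In a real console the equivalent would presumably be writing the final line of the loop as end; nil. This only avoids the extra select when the return value is printed and does not change how the loop itself uses memory.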