Fetch the first 50,000 records, then the next 50,000, and so on, saving each batch in a separate file
I have the following Rake file. Using RoR 2.3.8.
desc "Create shops sitemap"
task(:shops => :environment) do
sitemap = Sitemap.new
#add every item
for i in shop.find(:all, :select => 'id, updated_at', :order => 'updated_at DESC', :limit => 50000)
sitemap.add_url("http://abc.com/shops/#{i.id}",w3c_date(i.updated_at),'daily','1.0')
end
puts "#{sitemap.urls.length} total urls"
#delete the file
FileUtils.rm(File.join(RAILS_ROOT, "public/sitemap_shops_1.xml.gz"), :force => true)
f =File.new(File.join(RAILS_ROOT, "public/sitemap_shops_1.xml"), 'w')
sitemap.write(f,2)
f.close
system("gzip #{File.join(RAILS_ROOT, 'public/sitemap_shops_1.xml')}")
end
The task above fetches the first 50,000 records ordered by last update and saves them in a file numbered 1.
How do I modify the code so that it fetches the next 50,000 records and saves them in a file numbered 2, then the next 50,000 in a file numbered 3, and so on?
Thanks.
Instead of find, you can use find_in_batches, which returns records in groups of 1,000 at a time (you can override this to 50,000 with the :batch_size option). Throw in a counter variable (since I don't think find_in_batches has anything like an each_with_index) and you can handle all the files you need.
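
As a rough illustration, here is a minimal sketch of that idea, reusing the Sitemap class, w3c_date helper, and file layout from the question. Note that find_in_batches does not accept an :order option, so records are walked in primary-key order rather than by updated_at:

desc "Create shops sitemap"
task(:shops => :environment) do
  file_number = 1
  # :batch_size overrides the default of 1,000 records per batch
  Shop.find_in_batches(:select => 'id, updated_at', :batch_size => 50000) do |shops|
    sitemap = Sitemap.new
    shops.each do |shop|
      sitemap.add_url("http://abc.com/shops/#{shop.id}", w3c_date(shop.updated_at), 'daily', '1.0')
    end
    puts "file #{file_number}: #{sitemap.urls.length} urls"

    # remove any stale gzipped copy, then write and compress this batch's file
    path = File.join(RAILS_ROOT, "public/sitemap_shops_#{file_number}.xml")
    FileUtils.rm("#{path}.gz", :force => true)
    File.open(path, 'w') { |f| sitemap.write(f, 2) }
    system("gzip #{path}")

    file_number += 1
  end
end

Each pass through the block writes one numbered file, so for example 150,000 shops would produce sitemap_shops_1.xml.gz through sitemap_shops_3.xml.gz.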