Batch insert with Mongoid
I have a web app currently on Heroku that takes plain text of mostly comma-separated values (or other delimiter-separated values) that a user copies and pastes into a web form; the app then takes the data from each line and saves it to a MongoDB database.

For instance:
45nm, 180, 3
44nm, 180, 3.5
45nm, 90, 7
...
@project = Project.first # project embeds_many :simulations
@array_of_array_of_data_from_csv.each do |line|
  @project.simulations.create(:thick => line[0], :ang => line[1], :pro => line[2])
  # e.g. line[0] => 45nm, line[1] => 180, line[2] => 3
end
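For reference, a minimal sketch of how @array_of_array_of_data_from_csv could be built from the textarea input; the param name params[:data] and the comma delimiter are assumptions, not the app's actual code:

raw = params[:data] # the pasted textarea contents (hypothetical param name)
@array_of_array_of_data_from_csv = raw.split(/\r?\n/).map do |row|
  row.split(',').map(&:strip) # swap ',' for whichever delimiter the user chose
end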
For this app's purposes, I can't let the user do any kind of file import; we have to get the data from a textarea. And each time, the user can paste up to 30,000 lines. I tried doing that (30,000 data points) on Heroku with some fake data in a console, and it terminated, saying long processes are not supported in the console and to try a rake task instead.

So I was wondering if anyone knows either why it takes so long to insert 30,000 documents (of course, it may just be the way it is), or another way to speedily insert 30,000 documents?
Thanks for your help
1 Answer
If you are inserting that many documents, you should be doing it as a batch ... I routinely insert 200,000-document batches and they get created in a snap!

So, instead of a loop that "creates" / inserts a new document each time, just have your loop append each document to an array and then insert that array into MongoDB as one big batch.
An example of how to do that with Mongoid can be found in this question.
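A minimal sketch of the idea, assuming the Project / embeds_many :simulations schema from the question. Because the simulations are embedded, one batch-friendly route is a single atomic update that pushes the whole array at once; the $pushAll operator and the collection.update call shown here match the older MongoDB/driver versions this question dates from (newer drivers spell it update_one with $push / $each):

# Build every embedded document in memory first ...
docs = @array_of_array_of_data_from_csv.map do |line|
  { 'thick' => line[0], 'ang' => line[1], 'pro' => line[2] }
end

# ... then write them all in one round trip instead of 30,000.
# Note: raw writes like this bypass Mongoid validations/callbacks and will
# not auto-generate _id fields for the embedded documents.
@project.collection.update(
  { '_id' => @project.id },
  { '$pushAll' => { 'simulations' => docs } }
)

# If Simulation were a top-level collection rather than embedded, the batch
# form would simply be: Simulation.collection.insert(docs)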
However, you should keep in mind that this might end up being fairly memory-intensive (the whole array of hashes/documents will sit in memory as you build it).

Just be careful :)
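If that memory use becomes a problem, one variant (my addition, not part of the answer above) is to write the batch in fixed-size slices, so only one slice is buffered per round trip:

# Sketch: push the embedded documents 1,000 at a time to cap per-call size.
docs.each_slice(1000) do |chunk|
  @project.collection.update(
    { '_id' => @project.id },
    { '$pushAll' => { 'simulations' => chunk } }
  )
end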