Optimizing a large import in PHP
I have a simple importer: it goes through each line of a rather big CSV and imports it into the database.
My question is: should I call another method to insert each object (generating a DO and telling its mapper to insert it), or should I hardcode the insert process in the import method, duplicating the code?
I know the elegant thing to do is to call the second method, but I keep hearing in my head that function calls are expensive.
What do you think?
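For concreteness, here is a minimal sketch of the two options being weighed. The `Product` data object, the `ProductMapper`, and the `products(name, price)` table are all hypothetical stand-ins for whatever the real importer uses.

```php
<?php
// Option A: build a data object per row and let its mapper insert it.
// Product, ProductMapper, and the products(name, price) table are hypothetical.
class Product
{
    public function __construct(public string $name, public float $price) {}
}

class ProductMapper
{
    public function __construct(private PDO $pdo) {}

    public function insert(Product $p): void
    {
        $stmt = $this->pdo->prepare('INSERT INTO products (name, price) VALUES (?, ?)');
        $stmt->execute([$p->name, $p->price]);
    }
}

function importWithMapper(PDO $pdo, string $csvPath): void
{
    $mapper = new ProductMapper($pdo);
    $fh = fopen($csvPath, 'r');
    while (($row = fgetcsv($fh)) !== false) {
        $mapper->insert(new Product($row[0], (float) $row[1])); // one extra call per row
    }
    fclose($fh);
}

// Option B: duplicate the insert SQL directly inside the import loop.
function importInline(PDO $pdo, string $csvPath): void
{
    $stmt = $pdo->prepare('INSERT INTO products (name, price) VALUES (?, ?)');
    $fh = fopen($csvPath, 'r');
    while (($row = fgetcsv($fh)) !== false) {
        $stmt->execute([$row[0], (float) $row[1]]);
    }
    fclose($fh);
}
```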
3 Answers
Many RDBMS brands support a special command to do bulk imports. For example:
LOAD DATA INFILE
COPY
BULK INSERT
SQL*Loader
Using these commands is preferred over inserting one row at a time from a CSV data source because the bulk-loading command usually runs at least an order of magnitude faster.
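For example, with MySQL's LOAD DATA INFILE the whole file is handed to the server in one statement. A minimal sketch via PDO, assuming a `products(name, price)` table, a CSV at `/tmp/import.csv` with a header row, and a server that allows `local_infile`:

```php
<?php
// Bulk-load a CSV with LOAD DATA LOCAL INFILE (MySQL).
// DSN, credentials, file path, table name, and column list are assumptions.
$pdo = new PDO(
    'mysql:host=localhost;dbname=test;charset=utf8mb4',
    'user',
    'secret',
    [PDO::MYSQL_ATTR_LOCAL_INFILE => true] // allow the client to send the file
);

$sql = "LOAD DATA LOCAL INFILE " . $pdo->quote('/tmp/import.csv') . "
        INTO TABLE products
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        IGNORE 1 LINES
        (name, price)";

$rows = $pdo->exec($sql); // exec() returns the number of rows loaded
echo "Imported $rows rows\n";
```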
I don't think this matters too much. Consider a bulk insert. At the very least, make sure you're using a transaction, and consider disabling your indices before inserting.
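A rough sketch of that advice with PDO and MySQL, assuming a `products(name, price)` table and a CSV at `/tmp/import.csv`. Note that `DISABLE KEYS` only skips non-unique index maintenance on MyISAM tables, so it may be a no-op on InnoDB:

```php
<?php
// Wrap the whole load in one transaction and relax index/constraint checks.
// Table, columns, and the CSV path are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'secret', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

$pdo->exec('ALTER TABLE products DISABLE KEYS'); // skip per-row index maintenance (MyISAM)
$pdo->exec('SET foreign_key_checks = 0');        // optional, if FK checks slow the load

$pdo->beginTransaction();
try {
    $stmt = $pdo->prepare('INSERT INTO products (name, price) VALUES (?, ?)');
    $fh = fopen('/tmp/import.csv', 'r');
    while (($row = fgetcsv($fh)) !== false) {
        $stmt->execute([$row[0], (float) $row[1]]);
    }
    fclose($fh);
    $pdo->commit();                              // one commit instead of one per row
} catch (Throwable $e) {
    $pdo->rollBack();
    throw $e;
}

$pdo->exec('SET foreign_key_checks = 1');
$pdo->exec('ALTER TABLE products ENABLE KEYS');  // rebuild the indexes once at the end
```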
It shouldn't matter, as the insertion will probably take orders of magnitude longer than the PHP code.
As others have stated, bulk insert will give you much more benefit.
Line-level optimizations like this will only blind you to the more valuable higher-level optimizations.
If you are unsure, do a simple timing of both approaches; it shouldn't take more than a couple of minutes to find out.
Consider combining both approaches and doing batch inserts if loading everything at once hits memory/time/... limits, as sketched below.
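A rough sketch of both suggestions, assuming a `products(name, price)` table, a CSV at `/tmp/import.csv`, and an arbitrary batch size of 500: time the run with `microtime()`, and group rows into multi-row INSERT statements instead of executing one statement per row.

```php
<?php
// Batched inserts with a simple timing around the whole import.
// Table, columns, batch size, and CSV path are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'secret', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

$start = microtime(true);

$batchSize = 500;
$batch = [];

// Build and execute "INSERT ... VALUES (?, ?), (?, ?), ..." for one batch.
$flush = function (array $batch) use ($pdo): void {
    if ($batch === []) {
        return;
    }
    $placeholders = implode(', ', array_fill(0, count($batch), '(?, ?)'));
    $stmt = $pdo->prepare("INSERT INTO products (name, price) VALUES $placeholders");
    $stmt->execute(array_merge(...$batch));
};

$fh = fopen('/tmp/import.csv', 'r');
while (($row = fgetcsv($fh)) !== false) {
    $batch[] = [$row[0], (float) $row[1]];
    if (count($batch) >= $batchSize) {
        $flush($batch);
        $batch = [];
    }
}
$flush($batch); // flush the final partial batch
fclose($fh);

printf("Import took %.2f seconds\n", microtime(true) - $start);
```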