MSSQL/Perl DBI performance problem
I'm writing an application that does a lot of large inserts to a remote MSSQL server with Perl DBI. I'm not sure yet whether it will be SQL Server 2005 or 2008, but I've timed it on both so far and performance is similar. Basically, the large number of rows that need to be inserted is turning out to be the bottleneck by far. I have tried multi-row inserts in 2008 (and the sub-select UNION ALL trick in 2005), changing whether the inserts are fired off during the previous fetch or after it, and using execute_array() on single-row inserts, all with and without bound parameters.
Pseudocode:

    select data query
    while fetchrow {
        do lots of calculations
        construct insert
        1) do inserts here
    }
    2) or do inserts here
The activity monitor on the SQL Server shows the multi-row inserts averaging 70 ms apiece. The queries themselves have been limited to about 58 rows apiece because there are 36 fields in the insert, which easily hits the 2100-parameter limit.
Is there anything obvious I am overlooking? Is there any other method I could try to improve the times? Ignoring issues like latency or hardware, I feel there has to be another improvement to my Perl workflow or to the queries themselves. (I was looking into SQL Server bcp, bulk inserts, etc.)
Thanks for any advice.
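For reference, a minimal sketch of how the multi-row batching described above might look with DBI, keeping each statement under the 2100-parameter limit. The DSN, credentials, table name (`my_table`), column names, and the `get_next_computed_row()` helper are all placeholders standing in for the real fetch/calculation loop:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Hypothetical connection details -- replace with your own DSN/credentials.
my $dbh = DBI->connect('dbi:ODBC:DSN=mssql_remote', 'user', 'pass',
                       { RaiseError => 1, AutoCommit => 0 });

my @cols            = map { "col$_" } 1 .. 36;      # 36 fields per row
my $max_params      = 2100;                         # SQL Server's bind limit
my $rows_per_insert = int($max_params / @cols);     # 58 rows -> 2088 params

sub flush_batch {
    my ($batch) = @_;
    return unless @$batch;
    my $row_ph = '(' . join(',', ('?') x @cols) . ')';
    my $sql = sprintf 'INSERT INTO my_table (%s) VALUES %s',
                      join(',', @cols),
                      join(',', ($row_ph) x @$batch);
    my $sth = $dbh->prepare($sql);
    $sth->execute(map { @$_ } @$batch);   # flatten rows into one bind list
    @$batch = ();
}

my @batch;
while (my $row = get_next_computed_row()) {   # placeholder for the calc loop
    push @batch, $row;                        # $row: arrayref of 36 values
    flush_batch(\@batch) if @batch >= $rows_per_insert;
}
flush_batch(\@batch);                         # flush the final partial batch
$dbh->commit;
```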
2 Answers
execute_array will not make much difference because it is not implemented in DBD::ODBC (I've never got around to it). As such it is really just DBI calling execute repeatedly for you.
I don't understand the 58-ish rows, 36 parameters and hitting the 2100-parameter limit, although I can see 58 * 36 is pretty much 2100. If your insert needs 36 parameters, just prepare it before the select and execute it in the while loop - that should be faster since the SQL does not need to be passed/parsed on each insert.
Apart from that, you can disable AutoCommit and commit at the end, or regularly but less than once per insert; it should be a lot faster. You may also have indexes/triggers etc. on the table which can slow inserts down, so you could look at disabling them until afterwards.
I guess Perl/DBI is a requirement (because this isn't running on Windows, perhaps)?
Is there any possibility of using BCP (you would write an intermediate text file), SSIS (you might be able to do everything in SSIS), or even ADO.NET SqlBulkCopy (in some kind of .NET environment)? All of these are designed specifically for large loads.
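The BCP route mentioned above could be driven from Perl roughly like this. The file path, server name, credentials, and database/table name are placeholders; the `-c` and `-t` flags put bcp in character mode with a tab field terminator:

```perl
use strict;
use warnings;

# Write the computed rows to a tab-delimited intermediate file...
open my $fh, '>', '/tmp/rows.tsv' or die "open: $!";
while (my @row = compute_next_row()) {    # placeholder for the calc loop
    print {$fh} join("\t", @row), "\n";
}
close $fh;

# ...then hand the whole file to bcp for a single bulk load.
system('bcp', 'MyDb.dbo.my_table', 'in', '/tmp/rows.tsv',
       '-S', 'remote_server', '-U', 'user', '-P', 'pass',
       '-c', '-t', "\t") == 0
    or die "bcp failed: $?";
```

The trade-off is the extra disk round-trip for the intermediate file, but bcp bypasses per-statement parsing and parameter limits entirely.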