使用 SSIS 导入数百万条记录
有什么加快导入流程的技巧吗?数据库中有很多连接。
另外,当 SSIS 任务完成时,通过代码或使用电子邮件通知 SSIS 处理下一个功能的最佳方法是......?
Any tips for speeding up the import processes? Theres alot of Joins in the db.
Also, when a SSIS task is completed, would the best way to handle the next functions by code or using the Emailing notification SSIS has..?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
下面是我用来说明在 3 分钟内将 100 万行从文本文件加载到 SQL Server 数据库的示例。示例中的包是使用 SSIS 208 R2 创建的,并在 Xeon 单核 CPU 2.5GHz 和 3.00 GB RAM 上执行。
查找后在 SSIS 上导入记录
导入大量行的主要瓶颈之一是目标组件。目标组件插入行的速度越快,前面的源或转换组件处理行的速度就越快。同样,如果您碰巧有像排序转换这样的组件,那么情况会有所不同,因为排序转换会在将数据发送到管道之前保留所有要排序的数据。
发送电子邮件取决于您想要做什么。
如果您需要简单的成功或失败,您可以简单地使用发送邮件任务。其他选项是,您还可以启用 SQL 作业的警报通知,您可以从中安排包定期运行。
如果您需要将更多信息添加到电子邮件中,那么您可能需要使用脚本任务来制定邮件正文。创建邮件正文后,您可以从脚本任务中发送邮件或使用发送邮件任务。
希望该示例以及@Nikhil S 提供的文章能够帮助您微调您的软件包。
Here is a sample that I have used to illustrate loading 1 million rows in under 3 minutes from text file to SQL Server database. The package in the sample was created using SSIS 208 R2 and was executed on Xeon single core CPU 2.5GHz and 3.00 GB RAM.
Import records on SSIS after lookup
One of the main bottlenecks in importing large number of rows will be the destination component. Faster the destination component can insert the rows, the faster the preceding source or transformation components can process the rows. Again if you happen to have components like Sort transformation that will be different because Sort transformation would hold up all the data to sort before sending it down the pipeline.
Sending email depends on what you would like to do.
If you need simple success or failure, you could simply use Send Mail task. Other option is that you could also enable the Alert notification on SQL jobs from where you might schedule the package to run on regular basis.
If you need more information to be added to the email, then you might need to use a Script Task to formulate the message body. After creating the message body, you can send the mail from within Script Task or use Send Mail task.
Hope that example along with the article @Nikhil S provided should help you fine tune your package.
这篇 SimpleTalk 文章 讨论优化数据流任务的方法
将要传输的数据水平划分为 N 个数据流。其中 N 是安装 ssis 的服务器上可用的 cpu 核心数。
使用 ssis 缓冲区大小属性来确定我们的设置最适合您的数据类型。
This SimpleTalk article discusses ways to optimize your data flow task
Horizontally partition your data-to-be transferred into N data flows. Where N is the number of cpu cores available on your server where ssis is installed.
Play with the ssis buffer size property to figure our setting optimal for your kind of data.