SqlBulkCopy is slow and not utilizing full network speed



For the past couple of weeks I have been creating a generic script that can copy databases. The goal is to be able to specify any database on some server and copy it to some other location, copying only the specified content. The exact content to copy is specified in a configuration file. The script will be used on some 10 different databases and run weekly. In the end we are copying only about 3%-20% of databases that are as large as 500GB. I have been using the SMO assemblies to achieve this. This is my first time working with SMO, and it took a while to create a generic way to copy the schema objects, filegroups, etc. (it actually helped find some bad stored procedures).

Overall I have a working script that is lacking in performance (and at times it times out), and I was hoping you could help. When executing the WriteToServer command to copy a large amount of data (> 6GB), it hits my timeout period of 1 hour. Here is the core code for copying table data. The script is written in PowerShell.

# Build the SELECT, applying any per-table filter from the config file.
$query = ("SELECT * FROM $selectedTable " + $global:selectiveTables.Get_Item($selectedTable)).Trim()
Write-LogOutput "Copying $selectedTable : '$query'"
$cmd = New-Object Data.SqlClient.SqlCommand -argumentList $query, $source
$cmd.CommandTimeout = 120
$bulkData = ([Data.SqlClient.SqlBulkCopy]$destination)  # cast builds a SqlBulkCopy from the connection string
$bulkData.DestinationTableName = $selectedTable
$bulkData.BulkCopyTimeout = $global:tableCopyDataTimeout # = 3600
$reader = $cmd.ExecuteReader()
$bulkData.WriteToServer($reader)  # takes forever here on large tables

The source and target databases are located on different servers, so I kept track of the network speed as well. Network utilization never went over 1%, which was quite surprising to me; when I simply transfer some large files between the servers, utilization spikes up to 10%. I have tried setting $bulkData.BatchSize to 5000, but nothing really changed. Increasing the BulkCopyTimeout to an even greater value only works around the timeout. I would really like to know why the network is not being fully used.
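One way to tell whether rows are trickling through slowly or the copy is stalling outright is to hook SqlBulkCopy's progress event. A minimal sketch, reusing $cmd, $destination, and $selectedTable from above (the 50,000-row interval is an arbitrary choice):

# Sketch: log progress periodically to estimate rows/sec during the copy.
$bulkData = New-Object Data.SqlClient.SqlBulkCopy $destination
$bulkData.DestinationTableName = $selectedTable
$bulkData.NotifyAfter = 50000   # raise SqlRowsCopied every 50,000 rows
$sw = [Diagnostics.Stopwatch]::StartNew()
$bulkData.add_SqlRowsCopied({
    param($sender, $e)
    $rate = [int]($e.RowsCopied / $sw.Elapsed.TotalSeconds)
    Write-LogOutput "$($e.RowsCopied) rows copied (~$rate rows/sec)"
})
$reader = $cmd.ExecuteReader()
$bulkData.WriteToServer($reader)

If the per-interval rate is steady but low, the bottleneck is throughput; if the event stops firing long before the exception, something is blocking on the server side.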

Anyone else had this problem? Any suggestions on networking or bulk copy will be appreciated. And please let me know if you need more information.

Thanks.

UPDATE

I have tweaked several options that increase SqlBulkCopy performance, such as setting the target's recovery model to SIMPLE and having SqlBulkCopy take a table lock instead of the default row locks. Some tables also do better with particular batch sizes. Overall, the duration of the copy decreased by some 15%. What we will do is execute the copy of each database simultaneously on different servers. But I am still having a timeout issue when copying one of the databases.
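For reference, the two tweaks look roughly like this. A sketch, where $destDb and $destConn are hypothetical placeholders (the target database name and an open SqlConnection to it):

# 1) Put the target database in the SIMPLE recovery model so the bulk load
#    writes far less to the transaction log.
$sql = "ALTER DATABASE [$destDb] SET RECOVERY SIMPLE"
$recoveryCmd = New-Object Data.SqlClient.SqlCommand($sql, $destConn)
[void]$recoveryCmd.ExecuteNonQuery()

# 2) Take a bulk-update table lock instead of the default row locks.
$options = [Data.SqlClient.SqlBulkCopyOptions]::TableLock
$bulkData = New-Object Data.SqlClient.SqlBulkCopy($destination, $options)
$bulkData.DestinationTableName = $selectedTable
$bulkData.BatchSize = 5000   # worth tuning per table, as noted above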

When copying one of the larger databases, there is a table for which I consistently get the following exception:

System.Data.SqlClient.SqlException: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. 

It is thrown about 16 minutes after the table copy starts, which is nowhere near my BulkCopyTimeout. Even though I get the exception, the table is fully copied in the end. Also, if I truncate that table and restart my process for that table only, it is copied over without any issues. But going through the process of copying the entire database always fails on that one table.

I have tried executing the entire process and resetting the connection before copying that faulty table, but it still errored out. My SqlBulkCopy and reader are closed after each table. Any suggestions as to what else could be causing the script to fail at that point each time?
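For completeness, this is roughly how each table copy is isolated; a sketch with try/finally for deterministic cleanup, where $sourceConnStr is a hypothetical name for the source connection string:

# Sketch: give each table its own connections and guaranteed cleanup.
$srcConn = New-Object Data.SqlClient.SqlConnection $sourceConnStr
$srcConn.Open()
$cmd = New-Object Data.SqlClient.SqlCommand($query, $srcConn)
$reader = $cmd.ExecuteReader()
$bulkData = New-Object Data.SqlClient.SqlBulkCopy $destination
$bulkData.DestinationTableName = $selectedTable
$bulkData.BulkCopyTimeout = $global:tableCopyDataTimeout
try {
    $bulkData.WriteToServer($reader)
}
finally {
    $bulkData.Close()   # releases the bulk copy's own connection
    $reader.Close()
    $srcConn.Close()    # next table starts from a completely fresh connection
}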

CREATE TABLE [dbo].[badTable](
    [someGUID] [uniqueidentifier] NOT NULL,
    [xxx] [uniqueidentifier] NULL,
    [xxx] [int] NULL,
    [xxx] [tinyint] NOT NULL,
    [xxx] [datetime] NOT NULL,
    [xxx] [datetime] NOT NULL,
    [xxx] [datetime] NOT NULL,
    [xxx] [datetime] NULL,
    [xxx] [uniqueidentifier] NOT NULL,
    [xxx] [uniqueidentifier] NULL,
    CONSTRAINT [PK_badTable] PRIMARY KEY NONCLUSTERED
    (
        [someGUID] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

No indexes exist for this table on the target DB.


Comments (3)

执手闯天涯 2024-10-14 16:05:10


Have you considered removing indexes, doing the insert, and then reindexing?
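A sketch of that idea against this script's variables, where $destConn is a hypothetical open SqlConnection to the target database:

# Sketch: disable the target table's indexes, bulk load, then rebuild.
# Caution: disabling a clustered index makes the table inaccessible until
# it is rebuilt; this pattern suits nonclustered indexes.
$disable = New-Object Data.SqlClient.SqlCommand("ALTER INDEX ALL ON [dbo].[$selectedTable] DISABLE", $destConn)
[void]$disable.ExecuteNonQuery()

# ... run the SqlBulkCopy for this table as before ...

$rebuild = New-Object Data.SqlClient.SqlCommand("ALTER INDEX ALL ON [dbo].[$selectedTable] REBUILD", $destConn)
[void]$rebuild.ExecuteNonQuery()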

南冥有猫 2024-10-14 16:05:10

I've used a dataset and wonder if this would be faster:

$ds = New-Object System.Data.DataSet
$da = New-Object System.Data.SqlClient.SqlDataAdapter($cmd)
[void]$da.Fill($ds)
$bulkData.WriteToServer($ds.Tables[0])  # note: this buffers the whole result set in memory
淡看悲欢离合 2024-10-14 16:05:10


SqlBulkCopy is by far the fastest way of copying data into SQL tables.
You should be getting speeds in excess of 10,000 rows per second.
To test the bulk copy functionality, try DBSourceTools (http://dbsourcetools.codeplex.com).
This utility is designed to script databases to disk and then re-create them on a target server.
When copying data, DBSourceTools will first export all data to a local .xml file, and then do a bulk copy to the target database.
This helps to further identify where your bottleneck is by breaking the process into two passes: one for reading and one for writing.
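The same two-pass split can be sketched in a few lines of PowerShell, staging the data in a local file and bulk copying from it. This only illustrates the idea, not DBSourceTools' actual implementation; C:\staging is a placeholder path, and the table is buffered in memory, so it suits testing on a subset:

# Pass 1 - read: pull the rows and stage them locally as XML.
$da = New-Object Data.SqlClient.SqlDataAdapter($cmd)
$dt = New-Object Data.DataTable($selectedTable)
[void]$da.Fill($dt)
$dt.WriteXml("C:\staging\$selectedTable.xml", [Data.XmlWriteMode]::WriteSchema)

# Pass 2 - write: reload the staged file and bulk copy it to the target.
$dt2 = New-Object Data.DataTable
[void]$dt2.ReadXml("C:\staging\$selectedTable.xml")
$bulkData.WriteToServer($dt2)

Timing each pass separately shows whether reading from the source or writing to the target is the slow half.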
