如何加快更新大量行的速度
我有一个包含 1.400.000 个条目的表。它是一个简单的文档列表
表 - 文档
- ID int
- DocumentPath nvarchar
- DocumentValid 位
我扫描一个目录并将在该目录中找到的任何文档设置为有效。
public void SetReportsToValidated(List<int> validatedReports)
{
SqlConnection myCon = null;
try
{
myCon = new SqlConnection(_conn);
myCon.Open();
foreach (int id in validatedReports)
{
SqlDataAdapter myAdap = new SqlDataAdapter("update_DocumentValidated", myCon);
myAdap.SelectCommand.CommandType = CommandType.StoredProcedure;
SqlParameter pId = new SqlParameter("@Id", SqlDbType.Int);
pId.Value = id;
myAdap.SelectCommand.Parameters.Add(pId);
myAdap.SelectCommand.ExecuteNonQuery();
}
}
catch (SystemException ex)
{
_log.Error(ex);
throw;
}
finally
{
if (myCon != null)
{
myCon.Close();
}
}
}
更新的性能还可以,但我想要更多。 100万份文档更新生效需要1个多小时。有什么好的办法可以加快更新速度吗?我正在考虑使用某种批处理(例如表值参数)。 在 SQLServer 上进行分析时,每次更新大约需要 5-10 毫秒。
I have a table that has 1.400.000 entries. Its is a simple list of documents
Table - Document
- ID int
- DocumentPath nvarchar
- DocumentValid
bit
I scan a directory and set any document found in the directory as valid.
public void SetReportsToValidated(List<int> validatedReports)
{
SqlConnection myCon = null;
try
{
myCon = new SqlConnection(_conn);
myCon.Open();
foreach (int id in validatedReports)
{
SqlDataAdapter myAdap = new SqlDataAdapter("update_DocumentValidated", myCon);
myAdap.SelectCommand.CommandType = CommandType.StoredProcedure;
SqlParameter pId = new SqlParameter("@Id", SqlDbType.Int);
pId.Value = id;
myAdap.SelectCommand.Parameters.Add(pId);
myAdap.SelectCommand.ExecuteNonQuery();
}
}
catch (SystemException ex)
{
_log.Error(ex);
throw;
}
finally
{
if (myCon != null)
{
myCon.Close();
}
}
}
The performance of Updates is ok, but I want more. It takes more than 1 hour to update 1000000 of the documents to valid. Is there any good way to speed up the updates? I am thinking of using some kind of batch (like table valued parameters).
Each update takes some 5-10ms when profiled on SQLServer.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
读取报告并将它们一起附加到数据表中(因为它们具有相同的维度),然后使用 SqlBulkCopy 对象用于上传整个内容。可能会更适合你。鉴于列数和行数较少,我认为您不会遇到内存问题。
Read the reports in and append them together in a DataTable (since they have the same dimensions) then use the SqlBulkCopy object for to upload the entire thing. Will probably work better for you. I don't think you will have memory issues given the small number of columns and rows.
目前,您正在分别为每条记录调用数据库。您可以使用 SqlDataAdapter 进行批量更新(简而言之):
1) 定义一个 SqlDataAdapter
2) 将适配器上的 .UpdateCommand 设置为您的更新存储过程
3) 调用适配器上的 .Update 方法,向其传递一个包含要更新的文档 ID 的 DataTable。这会将数据表中更新的行批量存储到数据库中,以批量方式为每条记录调用存储过程。您可以通过 .BatchSize 属性控制批量大小。
4)所以你要做的就是删除手动的逐行循环,这对于批量更新来说效率低下。
查看示例:
http://support.microsoft.com/kb/308055
http://www.c-sharpcorner.com/UploadFile/61b832/4430/
或者,您可以:
1)使用SqlBulkCopy将所有ID批量插入到数据库中的新表中(效率很高)
2) 加载到该临时表后,运行单个 SQL 语句从该临时表更新主表以验证文档。
查看示例:
http://www.adathedev.co .uk/2010/02/sqlbulkcopy-bulk-load-to-sql-server.html
http://www.adathedev.co .uk/2011/01/sqlbulkcopy-to-sql-server-in-parallel.html
At the moment you are calling the db for each record individually. You can use the SqlDataAdapter to do bulk updates by (in a very brief nutshell):
1) define one SqlDataAdapter
2) set the .UpdateCommand on the adapter to your update sproc
3) call the .Update method on the adapter, passing it a DataTable containing the ids of documents to be updated. This will batch up the updated rows from the DataTable in to the DB, calling the sproc for each record in a batched manner. You can control the Batch Size via the .BatchSize property.
4) So what you're doing is removing the manual, row by row looping which is inefficient for batched updates.
See examples:
http://support.microsoft.com/kb/308055
http://www.c-sharpcorner.com/UploadFile/61b832/4430/
Alternatively, you could:
1) Use SqlBulkCopy to bulk insert all the IDs into a new table in the database (highly efficient)
2) Once loaded in to that staging table, run a single SQL statement to update your main table from that staging table to validate the documents.
See examples:
http://www.adathedev.co.uk/2010/02/sqlbulkcopy-bulk-load-to-sql-server.html
http://www.adathedev.co.uk/2011/01/sqlbulkcopy-to-sql-server-in-parallel.html
不必每次在循环中创建适配器和参数,只需创建一次并为参数分配不同的值:
这可能不会带来非常显着的改进,但与原始代码相比更好。此外,当您手动执行 SqlCommand 对象时,您根本不需要适配器。直接使用SqlCommand就可以了。
Instead of creating the adapter and parameter every time in the loop just create them once and assign different value to the parameter:
This might not result in a very dramatic improvement but is better compared to the original code. Also, as you are manually executing the SqlCommand object you do not need the adapter at all. Just use the SqlCommand directly.