How to reduce transaction log growth for batched nvarchar(max) updates
Our app needs to add large amounts of text to a SQL Server 2005 database (up to 1 GB for a single record). For performance reasons, this is done in chunks, by making a stored procedure call for each chunk (say, usp_AddChunk). usp_AddChunk does not have any explicit transactions.
What I'm seeing is that reducing the chunk size from 100MB to 10MB results in massively larger transaction logs. I've been told this is because each time usp_AddChunk is called, an "implicit" (my term) transaction will log all of the existing text. So, for a 150MB record:
100 MB chunk size: write 100 MB (0 bytes of existing text logged) + write 50 MB (100 MB logged) = 100 MB logged
will be smaller than
10 MB chunk size: write 10 MB (0 bytes logged) + write 10 MB (10 MB logged) + write 10 MB (20 MB logged) ... + write 10 MB (140 MB logged) = 1050 MB logged
I thought that by opening a transaction in my C# code (before I add the first chunk, and commit after the last chunk), this "implicit" transaction would not happen, and I could avoid the huge log files. But my tests show the transaction log growing 5x bigger using the ADO.NET transaction.
I won't post the code, but here are a few details:
1. I call SqlConnection.BeginTransaction()
2. I use a different SqlCommand for each chunk
3. I assign the SqlTransaction from (1) to each SqlCommand
4. I usually close the connection after each SqlCommand execution, but I've also tried not closing the connection, with the same results
What's the flaw in this scheme? Let me know if you need more info. Thanks!
Note: using a simple or bulk-logged recovery model is not an option
2 Answers
If by 'chunks' you mean something like:
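an append-style concatenation such as this (the table, column, and key names here are illustrative, not the OP's actual schema):

UPDATE MyTable
SET blob = blob + @chunk   -- concatenation rewrites the entire column value
WHERE KeyCol = @key;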
Then you are right that the operation is fully logged. You should follow the BLOB usage guidelines and use the .WRITE clause for chunked updates:
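For example, using the same illustrative names (passing NULL as the offset appends @chunk at the end of the existing value):

UPDATE MyTable
SET blob.WRITE(@chunk, NULL, NULL)   -- NULL offset = append at the end
WHERE KeyCol = @key;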
This will minimally log the update (if possible, see Operations That Can Be Minimally Logged):
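(That topic notes that the UPDATE statement is fully logged, but partial updates to large value data types using the .WRITE clause are minimally logged.)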
Not only is this minimally logged, but because the update is an explicit write at the end of the BLOB, the engine knows that you only updated a portion of the BLOB and will log only that portion. When you update with
SET blob=blob+@chunk
the engine sees that the entire BLOB has received a new value. It won't detect that you really only changed the BLOB by appending new data, so it will log the entire BLOB (several times over, as you already found out).

BTW, you should use chunks whose size is a multiple of 8040 bytes:
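The BLOB usage guidelines recommend inserting or updating data in chunk sizes that are multiples of 8040 bytes.

Putting it together, a minimal sketch of what a .WRITE-based usp_AddChunk could look like (again, the table, column, and parameter names are assumptions):

CREATE PROCEDURE usp_AddChunk
    @key int,
    @chunk nvarchar(max)
AS
BEGIN
    -- A NULL offset appends @chunk at the end of the existing value,
    -- the partial-update form that can be minimally logged.
    -- Note: .WRITE cannot be applied to a NULL value, so the row must
    -- be initialized (e.g. to an empty string) before the first chunk.
    UPDATE MyTable
    SET blob.WRITE(@chunk, NULL, NULL)
    WHERE KeyCol = @key;
END;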
What you may have to do is surround each "chunk" or group of chunks with its own transaction and commit after each group. Surrounding the entire thing with your own ADO transaction is essentially doing the same thing as the implicit transaction does, so that won't help. You have to commit in smaller chunks to keep the log smaller.