MSSQL 大表游标无法分配空间
我们正在尝试设置一个游标来运行由同一个巨大表(超过 150 M 记录)的两个“实例”之间的联接生成的记录。
出现以下异常信息:
无法为数据库“tempdb”中的对象“dbo.SORT 临时运行存储:165282123350016”分配空间,因为“PRIMARY”文件组已满。 通过删除不需要的文件、删除文件组中的对象、向文件组添加其他文件或为文件组中的现有文件设置自动增长来创建磁盘空间。
你们有人知道这是什么原因吗? 或者如何让下面的查询更加高效?
我发现它发生在 DECLARE CURSOR
和第一个 FETCH NEXT
之间,但我还不知道它是否在...
DECLARE CURSOR
之间code> 和OPEN
或
OPEN
和第一个FETCH NEXT
之间。
更多详细信息:sql 语句如下所示:
DECLARE cData CURSOR LOCAL FORWARD_ONLY READ_ONLY FOR SELECT ... FROM HugeTable HT1 JOIN HugeTable HT2 ON .. JOIN Table3 ON .. JOIN Table4 ON .. JOIN Table5 ON .. WHERE ... ORDER BY HT1..., HT1... INSERT INTO SysLog (Description) VALUES ('A') OPEN cData BEGIN TRANSACTION ProcessData -- Currently trying new logging here: -- INSERT INTO SysLog (Description) VALUES ('B') FETCH NEXT FROM cData INTO ... INSERT INTO SysLog (Description) VALUES ('C') ... etc.
我收到的最后一条日志消息是“A”,一小时后它失败并显示上述消息,从未到达“C”。 我现在正在尝试在“B”点进行日志记录。
根据要求,我发布了确切的 sql 表达式:
DECLARE cSource CURSOR LOCAL FORWARD_ONLY READ_ONLY FOR SELECT MD.sFieldName, MD.sFieldValue, TR.sTargetDataType, MD2.sFieldValue AS sUniqueID, TR.sTargetTableName, TR.sTargetFieldName, I.iRefCustomerID, I.iInterfaceID, IL.iRefInterfaceSessionID FROM MasterData MD JOIN MasterData MD2 ON MD.iRowIndex = MD2.iRowIndex AND MD.iBatchNumber = MD2.iBatchNumber AND MD.sTableName = MD2.sTableName AND MD2.sFieldName = 'sUniqueID' JOIN SourceTargetRelation TR ON MD.sFieldName = TR.sSourceFieldName AND MD.sTableName = TR.sSourceTableName JOIN InterfaceLog IL ON IL.iInterfaceLogID = MD.iBatchNumber JOIN Interface I ON I.iInterfaceID = IL.iRefInterfaceID AND TR.iRefSystemID = I.iRefSystemID WHERE MD.iBatchNumber = @iBatchNumber ORDER BY MD.sTableName, MD.iRowIndex
在 Quassnoi 更新答案后,我还在表上发布了原始索引:
我在此表上有一个非聚集索引,其中包含列 iBatchNumber
、sFieldName
、sTableName
、iRowIndex
。 该索引将 sFieldValue
作为包含列。
正如 Quassnoi 建议的那样(我想我现在明白为什么了),我已更改索引以按以下顺序排列列:iBatchNumber
、sTableName
、iRowIndex
, sFieldName
。 我使用 sFieldValue 作为包含列。 执行计划不再包含任何SORT
,并且执行计划中的步骤数不到原来的一半,我希望这也更快......
We are trying to set up a cursor to run through records generated from a join between two 'instances' of the same huge table (more than 150 M records).
The following exception message comes out:
Could not allocate space for object 'dbo.SORT temporary run storage: 165282123350016' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
Do any of you know the reason for this? Or how to make the query below below more efficient?
I have found that it occurs somewhere between DECLARE CURSOR
and the first FETCH NEXT
, but I do not know yet if it is between...
DECLARE CURSOR
andOPEN
or between
OPEN
and the firstFETCH NEXT
.
More details: The sql statement looks like:
DECLARE cData CURSOR LOCAL FORWARD_ONLY READ_ONLY FOR SELECT ... FROM HugeTable HT1 JOIN HugeTable HT2 ON .. JOIN Table3 ON .. JOIN Table4 ON .. JOIN Table5 ON .. WHERE ... ORDER BY HT1..., HT1... INSERT INTO SysLog (Description) VALUES ('A') OPEN cData BEGIN TRANSACTION ProcessData -- Currently trying new logging here: -- INSERT INTO SysLog (Description) VALUES ('B') FETCH NEXT FROM cData INTO ... INSERT INTO SysLog (Description) VALUES ('C') ... etc.
where the last log message I get is 'A' and then one hour later it fails with the message described above, never reaching 'C'. I am now trying with logging at point 'B'.
On request I post the exact sql expression:
DECLARE cSource CURSOR LOCAL FORWARD_ONLY READ_ONLY FOR SELECT MD.sFieldName, MD.sFieldValue, TR.sTargetDataType, MD2.sFieldValue AS sUniqueID, TR.sTargetTableName, TR.sTargetFieldName, I.iRefCustomerID, I.iInterfaceID, IL.iRefInterfaceSessionID FROM MasterData MD JOIN MasterData MD2 ON MD.iRowIndex = MD2.iRowIndex AND MD.iBatchNumber = MD2.iBatchNumber AND MD.sTableName = MD2.sTableName AND MD2.sFieldName = 'sUniqueID' JOIN SourceTargetRelation TR ON MD.sFieldName = TR.sSourceFieldName AND MD.sTableName = TR.sSourceTableName JOIN InterfaceLog IL ON IL.iInterfaceLogID = MD.iBatchNumber JOIN Interface I ON I.iInterfaceID = IL.iRefInterfaceID AND TR.iRefSystemID = I.iRefSystemID WHERE MD.iBatchNumber = @iBatchNumber ORDER BY MD.sTableName, MD.iRowIndex
After the updated answer from Quassnoi, I also post the original index on the table:
I have a nonclustered index on this table with the columns iBatchNumber
, sFieldName
, sTableName
, iRowIndex
. And that index has sFieldValue
as an included column.
As Quassnoi suggested (and I think I understand why now) I have changed the index to have the columns in this order: iBatchNumber
, sTableName
, iRowIndex
, sFieldName
. And I use sFieldValue
as an included column. The execution plan does not contain any SORT
anymore, and the number of steps in the execution plan is less than half of original, which I hope is also faster...
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
您的查询使用
ORDER BY
。这就需要排序,排序需要临时空间。 你已经离开这个空间了。
为了避免这种情况,请在巨大的表上创建一个复合索引:
(col_filter_1, col_filter_2, col_order_1, col_order_2)
,其中col_filter_n
是您筛选的列,col_order_n
是您排序所依据的列。这样的索引既可以用于过滤,也可以用于对过滤结果进行排序。
如果您发布实际的查询(即您过滤和排序的表达式),我可能可以更准确地告诉您如何创建这样的索引。
更新:
从您的查询中,我可以看到您需要
(iBatchNumber, sTableName, iRowIndex, sFieldName)
上的索引(按顺序)。如果您让
MD2
在连接中领先,也可能会有所帮助:查看执行计划并确保没有使用
SORT
操作。Your query uses
ORDER BY
.This needs sorting and sorting needs temporary space. You are out of this space.
To avoid this, create a composite index on your huge table:
(col_filter_1, col_filter_2, col_order_1, col_order_2)
, wherecol_filter_n
are the columns you filter on, andcol_order_n
are the columns you order by.Such an index can be used both for filtering and ordering the filtered results.
If you post your actual query (that is expressions you filter on and order by), I'll probably can tell you more exactly how to create such an index.
Update:
From your query, I can see that you need an index on
(iBatchNumber, sTableName, iRowIndex, sFieldName)
(in that order).It may also help if you make
MD2
leading in the join:See the execution plan and make sure that no
SORT
operation is used.你为什么使用光标? 尤其是在一张大桌子上? 你在做什么不能基于集合完成的事情? Cursprs 对性能极其不利,如果存在其他替代方案,则不应使用。 如果您根据选择找到的记录插入到另一个表中,则无需游标可以做得更好。
Why are you using a cursor? Especially on a large table? What are you doing that can't be done set-based? Cursprs are extremely bad for performance and should not be used if another alternative exists. If you are inserteding to another table based on records found by your select that can be done much better without a cursor.