MSSQL 大表游标无法分配空间

发布于 2024-07-16 10:03:44 字数 2599 浏览 7 评论 0原文

我们正在尝试设置一个游标来运行由同一个巨大表(超过 150 M 记录)的两个“实例”之间的联接生成的记录。

出现以下异常信息:

无法为数据库“tempdb”中的对象“dbo.SORT 临时运行存储:165282123350016”分配空间,因为“PRIMARY”文件组已满。 通过删除不需要的文件、删除文件组中的对象、向文件组添加其他文件或为文件组中的现有文件设置自动增长来创建磁盘空间。

你们有人知道这是什么原因吗? 或者如何让下面的查询更加高效?

我发现它发生在 DECLARE CURSOR 和第一个 FETCH NEXT 之间,但我还不知道它是否在...

  • DECLARE CURSOR 之间code> 和 OPEN

  • OPEN 和第一个 FETCH NEXT 之间。

更多详细信息:sql 语句如下所示:

DECLARE cData CURSOR LOCAL FORWARD_ONLY READ_ONLY
FOR
  SELECT ...
  FROM HugeTable HT1 JOIN HugeTable HT2 ON .. 
  JOIN Table3 ON .. JOIN Table4 ON .. JOIN Table5 ON ..
  WHERE ...
  ORDER BY HT1..., HT1...

INSERT INTO SysLog (Description) VALUES ('A')

OPEN cData
BEGIN TRANSACTION ProcessData
  -- Currently trying new logging here:
  -- INSERT INTO SysLog (Description) VALUES ('B') 
  FETCH NEXT FROM cData INTO ...
  INSERT INTO SysLog (Description) VALUES ('C')
  ... etc.

我收到的最后一条日志消息是“A”,一小时后它失败并显示上述消息,从未到达“C”。 我现在正在尝试在“B”点进行日志记录。


根据要求,我发布了确切的 sql 表达式:

DECLARE cSource CURSOR LOCAL FORWARD_ONLY READ_ONLY
FOR
    SELECT MD.sFieldName, 
        MD.sFieldValue, 
        TR.sTargetDataType,
        MD2.sFieldValue AS sUniqueID,
        TR.sTargetTableName,
        TR.sTargetFieldName,
        I.iRefCustomerID, 
        I.iInterfaceID, 
        IL.iRefInterfaceSessionID
    FROM MasterData MD
    JOIN MasterData MD2
        ON MD.iRowIndex = MD2.iRowIndex
        AND MD.iBatchNumber = MD2.iBatchNumber
        AND MD.sTableName = MD2.sTableName 
        AND MD2.sFieldName = 'sUniqueID'
    JOIN SourceTargetRelation TR
        ON MD.sFieldName = TR.sSourceFieldName
        AND MD.sTableName = TR.sSourceTableName
    JOIN InterfaceLog IL
        ON IL.iInterfaceLogID = MD.iBatchNumber
    JOIN Interface I
        ON I.iInterfaceID = IL.iRefInterfaceID
        AND TR.iRefSystemID = I.iRefSystemID
    WHERE
        MD.iBatchNumber = @iBatchNumber
    ORDER BY MD.sTableName, MD.iRowIndex

在 Quassnoi 更新答案后,我还在表上发布了原始索引:

我在此表上有一个非聚集索引,其中包含列 iBatchNumbersFieldNamesTableNameiRowIndex。 该索引将 sFieldValue 作为包含列。


正如 Quassnoi 建议的那样(我想我现在明白为什么了),我已更改索引以按以下顺序排列列:iBatchNumbersTableNameiRowIndex, sFieldName。 我使用 sFieldValue 作为包含列。 执行计划不再包含任何SORT,并且执行计划中的步骤数不到原来的一半,我希望这也更快......

We are trying to set up a cursor to run through records generated from a join between two 'instances' of the same huge table (more than 150 M records).

The following exception message comes out:

Could not allocate space for object 'dbo.SORT temporary run storage: 165282123350016' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

Do any of you know the reason for this? Or how to make the query below below more efficient?

I have found that it occurs somewhere between DECLARE CURSOR and the first FETCH NEXT, but I do not know yet if it is between...

  • DECLARE CURSOR and OPEN

or between

  • OPEN and the first FETCH NEXT.

More details: The sql statement looks like:

DECLARE cData CURSOR LOCAL FORWARD_ONLY READ_ONLY
FOR
  SELECT ...
  FROM HugeTable HT1 JOIN HugeTable HT2 ON .. 
  JOIN Table3 ON .. JOIN Table4 ON .. JOIN Table5 ON ..
  WHERE ...
  ORDER BY HT1..., HT1...

INSERT INTO SysLog (Description) VALUES ('A')

OPEN cData
BEGIN TRANSACTION ProcessData
  -- Currently trying new logging here:
  -- INSERT INTO SysLog (Description) VALUES ('B') 
  FETCH NEXT FROM cData INTO ...
  INSERT INTO SysLog (Description) VALUES ('C')
  ... etc.

where the last log message I get is 'A' and then one hour later it fails with the message described above, never reaching 'C'. I am now trying with logging at point 'B'.


On request I post the exact sql expression:

DECLARE cSource CURSOR LOCAL FORWARD_ONLY READ_ONLY
FOR
    SELECT MD.sFieldName, 
        MD.sFieldValue, 
        TR.sTargetDataType,
        MD2.sFieldValue AS sUniqueID,
        TR.sTargetTableName,
        TR.sTargetFieldName,
        I.iRefCustomerID, 
        I.iInterfaceID, 
        IL.iRefInterfaceSessionID
    FROM MasterData MD
    JOIN MasterData MD2
        ON MD.iRowIndex = MD2.iRowIndex
        AND MD.iBatchNumber = MD2.iBatchNumber
        AND MD.sTableName = MD2.sTableName 
        AND MD2.sFieldName = 'sUniqueID'
    JOIN SourceTargetRelation TR
        ON MD.sFieldName = TR.sSourceFieldName
        AND MD.sTableName = TR.sSourceTableName
    JOIN InterfaceLog IL
        ON IL.iInterfaceLogID = MD.iBatchNumber
    JOIN Interface I
        ON I.iInterfaceID = IL.iRefInterfaceID
        AND TR.iRefSystemID = I.iRefSystemID
    WHERE
        MD.iBatchNumber = @iBatchNumber
    ORDER BY MD.sTableName, MD.iRowIndex

After the updated answer from Quassnoi, I also post the original index on the table:

I have a nonclustered index on this table with the columns iBatchNumber, sFieldName, sTableName, iRowIndex. And that index has sFieldValue as an included column.


As Quassnoi suggested (and I think I understand why now) I have changed the index to have the columns in this order: iBatchNumber, sTableName, iRowIndex, sFieldName. And I use sFieldValue as an included column. The execution plan does not contain any SORT anymore, and the number of steps in the execution plan is less than half of original, which I hope is also faster...

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

泪是无色的血 2024-07-23 10:03:44

有谁知道这是什么原因吗? 或者如何让下面的查询更加高效?

您的查询使用 ORDER BY

这就需要排序,排序需要临时空间。 你已经离开这个空间了。

为了避免这种情况,请在巨大的表上创建一个复合索引:(col_filter_1, col_filter_2, col_order_1, col_order_2),其中 col_filter_n 是您筛选的列,col_order_n 是您排序所依据的列。

这样的索引既可以用于过滤,也可以用于对过滤结果进行排序。

如果您发布实际的查询(即您过滤和排序的表达式),我可能可以更准确地告诉您如何创建这样的索引。

更新:

从您的查询中,我可以看到您需要 (iBatchNumber, sTableName, iRowIndex, sFieldName) 上的索引(按顺序)。

如果您让 MD2 在连接中领先,也可能会有所帮助:

WHERE
    MD2.iBatchNumber = @iBatchNumber
ORDER BY
    MD2.sTableName, MD2.iRowIndex

查看执行计划并确保没有使用 SORT 操作。

Do any of you know the reason for this? Or how to make the query below below more efficient?

Your query uses ORDER BY.

This needs sorting and sorting needs temporary space. You are out of this space.

To avoid this, create a composite index on your huge table: (col_filter_1, col_filter_2, col_order_1, col_order_2), where col_filter_n are the columns you filter on, and col_order_n are the columns you order by.

Such an index can be used both for filtering and ordering the filtered results.

If you post your actual query (that is expressions you filter on and order by), I'll probably can tell you more exactly how to create such an index.

Update:

From your query, I can see that you need an index on (iBatchNumber, sTableName, iRowIndex, sFieldName) (in that order).

It may also help if you make MD2 leading in the join:

WHERE
    MD2.iBatchNumber = @iBatchNumber
ORDER BY
    MD2.sTableName, MD2.iRowIndex

See the execution plan and make sure that no SORT operation is used.

酒中人 2024-07-23 10:03:44

你为什么使用光标? 尤其是在一张大桌子上? 你在做什么不能基于集合完成的事情? Cursprs 对性能极其不利,如果存在其他替代方案,则不应使用。 如果您根据选择找到的记录插入到另一个表中,则无需游标可以做得更好。

Why are you using a cursor? Especially on a large table? What are you doing that can't be done set-based? Cursprs are extremely bad for performance and should not be used if another alternative exists. If you are inserteding to another table based on records found by your select that can be done much better without a cursor.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文