具有 FAST_FORWARD 游标循环的存储过程启动快,结束慢

发布于 2024-09-27 03:06:55 字数 1886 浏览 6 评论 0原文

我有一个存储过程,它使用 FAST_FORWARD 游标按时间顺序循环一组约 300k 的记录,并根据大量运行变量和标志的状态将它们分配给声明集,部分实现为表变量。我已经考虑了很多关于如何做这个基于集合的事情,但我就是做不到。所以我坚持使用光标方法,并且需要优化这段代码。

我注意到,前 10% 的进度加载和处理速度非常快(2000 行/秒),接近 20% 的进度已减慢至约 300 行/秒,最后减慢至约 60 行/秒。

IMO 这可能是由于 4 个原因造成的:

  • 游标速度减慢,我认为使用 FAST_FORWARD 游标不太可能出现这种情况
  • 处理速度减慢。对于我的“组”,我使用表变量来插入、更新和删除。在任何给定时刻都有最大值。这些变量大约有 10 行。
  • 插入目标表的速度变慢。我不明白为什么会这样,我没有在它们上定义触发器,它们只是普通的表。
  • 邪恶的魔法

要么是这个,要么是我的百分比计数器坏了:

SET @curprogress = @curprogress + 1
IF (@curprogress - ((@totprogress / 100) * (FLOOR(@curprogress * 100 / @totprogress)))) BETWEEN 0 AND 1 BEGIN
    SET @msg = CAST(FLOOR(@curprogress * 100 / @totprogress) AS VARCHAR)
    RAISERROR('%s%s', 0, 1, @msg, '%...') WITH NOWAIT;
END

有人知道要寻找什么以及如何继续加速这个查询吗?

我的代码的符号摘录:

WHILE....
-- Fetch new record to be assigned to one of the open declaration sets
FETCH NEXT INTO @row_field1, @row_field2....
IF (@flag2 = 1) AND ((@flag1 = 0) OR (@row_field1 <> @prevrow_field1)) 
BEGIN
    -- Logging info: we are closing a child declaration set
    INSERT INTO @logtable SELECT '--> LOG MESSAGE'
    INSERT INTO @logtable
    SELECT format_message(@row_field1, @calc_field2, field3...)
    FROM @runningtable_sub S LEFT JOIN @runningtable_main M ON S.MainID = M.ID

    -- Update enddate of parent
    UPDATE M SET M.enddate = DATEADD(day,365,S.enddate)
    FROM @runningtable_sub S
    LEFT JOIN @runningtable_main M
    ON S.MainID = M.ID

    -- close and save child
    INSERT INTO outputtable_main
    SELECT @field1, COALESCE(Z.Field1,'NULL'), S.startdate, S.enddate,
        M.Startdate, M.Enddate
    FROM @runningtable_sub S 
    LEFT JOIN @runningtable_main M ON S.MainID = M.ID

    -- delete child from running table
    DELETE FROM @runningtable_sub WHERE S.enddate < @curdate
END

I have a stored procedure that uses a FAST_FORWARD cursor to chronologically loop over a set of ~300k records and assigns them to declaration sets based on the state of a lot of running variables and flags, partly implemented as table variables. I have given a lot of thought on how to do this set-based and I just can't do it. So I am stuck with the cursor approach, and need to optimize this code.

What I have noticed is that the first 10% of progress is loaded and processed very quickly (2000 rows/sec), near 20% progress it has slowed to about 300 rows/sec and in the end it has slowed down to about 60 rows/sec.

IMO this can be due to 4 reasons:

  • The cursor slows down, which I would think unlikely with a FAST_FORWARD cursor
  • The processing slows down. For my "groups" I am using table variables in which I am inserting, updating and deleting. At any given moment there are max. about 10 rows in those variables.
  • The inserting into target tables slows down. I don't see why this would be, I have no triggers defined on them and they are just ordinary tables.
  • Evil magic

Either that or my percentage counter is broken:

SET @curprogress = @curprogress + 1
IF (@curprogress - ((@totprogress / 100) * (FLOOR(@curprogress * 100 / @totprogress)))) BETWEEN 0 AND 1 BEGIN
    SET @msg = CAST(FLOOR(@curprogress * 100 / @totprogress) AS VARCHAR)
    RAISERROR('%s%s', 0, 1, @msg, '%...') WITH NOWAIT;
END

Has anybody any clue what to look for and how to go on speeding up this query?

Symbolic exerpt of my code:

WHILE....
-- Fetch new record to be assigned to one of the open declaration sets
FETCH NEXT INTO @row_field1, @row_field2....
IF (@flag2 = 1) AND ((@flag1 = 0) OR (@row_field1 <> @prevrow_field1)) 
BEGIN
    -- Logging info: we are closing a child declaration set
    INSERT INTO @logtable SELECT '--> LOG MESSAGE'
    INSERT INTO @logtable
    SELECT format_message(@row_field1, @calc_field2, field3...)
    FROM @runningtable_sub S LEFT JOIN @runningtable_main M ON S.MainID = M.ID

    -- Update enddate of parent
    UPDATE M SET M.enddate = DATEADD(day,365,S.enddate)
    FROM @runningtable_sub S
    LEFT JOIN @runningtable_main M
    ON S.MainID = M.ID

    -- close and save child
    INSERT INTO outputtable_main
    SELECT @field1, COALESCE(Z.Field1,'NULL'), S.startdate, S.enddate,
        M.Startdate, M.Enddate
    FROM @runningtable_sub S 
    LEFT JOIN @runningtable_main M ON S.MainID = M.ID

    -- delete child from running table
    DELETE FROM @runningtable_sub WHERE S.enddate < @curdate
END

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

流年已逝 2024-10-04 03:06:55

我可以想到这种速度减慢的很多原因,但在不知道数据基数的情况下很难缩小范围。

随机观察:

  1. 你的 format_message() 函数肯定是一只狗;所有 UDF 都是。但是每次要插入多少行?
  2. @runningtable_main 永远不会被清除。
  3. 更新是昂贵的,
  4. 删除是昂贵的。如果您使用临时表并重新设计您的实现,您可以截断而不是删除

要自己弄清楚其中一些内容,请添加检测:

DECLARE @now DATETIME, @duration INT, @rowcount INT

WHILE....
-- Fetch new record to be assigned to one of the open declaration sets
FETCH NEXT INTO @row_field1, @row_field2....
IF (@flag2 = 1) AND ((@flag1 = 0) OR (@row_field1 <> @prevrow_field1)) 
BEGIN
    PRINT '---------------'

    -- Logging info: we are closing a child declaration set
    INSERT INTO @logtable SELECT '--> LOG MESSAGE'
    SET @now = GETDATE()
    INSERT INTO @logtable
    SELECT format_message(@row_field1, @calc_field2, field3...)
    FROM @runningtable_sub S LEFT JOIN @runningtable_main M ON S.MainID = M.ID
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) inserted into @logtable, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT

    -- Update enddate of parent
    SET @now = GETDATE()
    UPDATE M SET M.enddate = DATEADD(day,365,S.enddate)
    FROM @runningtable_sub S
    LEFT JOIN @runningtable_main M
    ON S.MainID = M.ID
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) updated in @runningtable_main, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT

    -- close and save child
    SET @now = GETDATE()
    INSERT INTO outputtable_main
    SELECT @field1, COALESCE(Z.Field1,'NULL'), S.startdate, S.enddate,
        M.Startdate, M.Enddate
    FROM @runningtable_sub S 
    LEFT JOIN @runningtable_main M ON S.MainID = M.ID
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) inserted into outputtable_main, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT

    -- delete child from running table
    SET @now = GETDATE()
    DELETE FROM @runningtable_sub WHERE S.enddate < @curdate
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) deleted from @runningtable_sub, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT
END

如果您包含整个代码(包括游标声明),而不仅仅是符号摘录,这里有人也许可以重新设计它,以提高效率和/或避免光标。

I could think of lots of reasons this is slowing down, but it is hard to narrow it down without knowing the cardinality of your data.

Random observations:

  1. Your format_message() function is surely a dog; all UDFs are. But how many rows are you inserting per pass?
  2. @runningtable_main never gets cleared out.
  3. updates are expensive
  4. deletes are expensive. If you use temp tables and rework your implementation, you can truncate instead of delete

To figure some of it out for yourself, add instrumentation:

DECLARE @now DATETIME, @duration INT, @rowcount INT

WHILE....
-- Fetch new record to be assigned to one of the open declaration sets
FETCH NEXT INTO @row_field1, @row_field2....
IF (@flag2 = 1) AND ((@flag1 = 0) OR (@row_field1 <> @prevrow_field1)) 
BEGIN
    PRINT '---------------'

    -- Logging info: we are closing a child declaration set
    INSERT INTO @logtable SELECT '--> LOG MESSAGE'
    SET @now = GETDATE()
    INSERT INTO @logtable
    SELECT format_message(@row_field1, @calc_field2, field3...)
    FROM @runningtable_sub S LEFT JOIN @runningtable_main M ON S.MainID = M.ID
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) inserted into @logtable, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT

    -- Update enddate of parent
    SET @now = GETDATE()
    UPDATE M SET M.enddate = DATEADD(day,365,S.enddate)
    FROM @runningtable_sub S
    LEFT JOIN @runningtable_main M
    ON S.MainID = M.ID
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) updated in @runningtable_main, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT

    -- close and save child
    SET @now = GETDATE()
    INSERT INTO outputtable_main
    SELECT @field1, COALESCE(Z.Field1,'NULL'), S.startdate, S.enddate,
        M.Startdate, M.Enddate
    FROM @runningtable_sub S 
    LEFT JOIN @runningtable_main M ON S.MainID = M.ID
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) inserted into outputtable_main, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT

    -- delete child from running table
    SET @now = GETDATE()
    DELETE FROM @runningtable_sub WHERE S.enddate < @curdate
    SELECT @rowcount = @@ROWCOUNT, @duration = DATEDIFF(ms,@now,GETDATE)
    RAISERROR('%i row(s) deleted from @runningtable_sub, %i milliseconds',-1,-1,@rowcount,@duration) WITH NOWAIT
END

If you included your entire code (including the cursor declaration), rather than just a symbolic excerpt, someone here could probably rework it for you to be much more efficient and/or avoid cursors all together.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文