SQL脚本需要很长时间才能提取数据
这是一个需要很长时间的脚本
USE [r_prod]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Drew Borden
-- Create date: 4/16/2009
-- Description: Procedure to populated subdivision extract table
-- =============================================
IF EXISTS(SELECT * FROM sys.procedures WHERE name='sp_extract_subdivision')
BEGIN
DROP PROCEDURE sp_extract_subdivision
END
GO
CREATE PROCEDURE sp_extract_subdivision
@subdivsion_cd char(2)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
declare @strap varchar(25)
-- Clear existing record
delete from dbo.subdivision_extract
-- Select list of straps to loop through
declare strapList cursor for
select strap from dbo.parcel where county_cd = @subdivsion_cd
--Loop through straps and populate extract table values
BEGIN TRY
OPEN strapList
FETCH NEXT FROM strapList INTO @strap
WHILE @@FETCH_STATUS = 0
BEGIN
IF @strap IS NOT NULL
BEGIN
insert into dbo.subdivision_extract (acct_num) values (RTRIM(@strap))
exec sp_extract_parcel @strap
exec sp_extract_detail @strap
exec sp_extract_lnd_c @strap
exec sp_extract_parcel_flg @strap
exec sp_extract_owner @strap
exec sp_extract_mail @strap
exec sp_extract_legal_ln @strap
exec sp_extract_site @strap
exec sp_extract_condo_unit @strap
exec sp_extract_personal_x @strap
exec sp_extract_personal_x_dist @strap
exec sp_extract_phase_in @strap
exec sp_extract_p_tax_dist @strap
exec sp_extract_parcel_rel @strap
exec sp_extract_entzone @strap
exec sp_extract_dates @strap
exec sp_extract_sales @strap
exec sp_extract_sale_dtl @strap
exec sp_extract_pchar @strap
exec sp_extract_protest @strap
END
FETCH NEXT FROM strapList INTO @strap
END
CLOSE strapList
DEALLOCATE strapList
END TRY
BEGIN CATCH
SELECT ERROR_NUMBER() as ErrorNumber,
ERROR_MESSAGE() as ErrorMessage,
ERROR_PROCEDURE() as ExecutingProcedure,
ERROR_LINE() as LineNumber
CLOSE strapList
DEALLOCATE strapList
END CATCH
END
GO
有什么办法可以加快速度吗?
This is the script that is taking a very long time
USE [r_prod]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Drew Borden
-- Create date: 4/16/2009
-- Description: Procedure to populated subdivision extract table
-- =============================================
IF EXISTS(SELECT * FROM sys.procedures WHERE name='sp_extract_subdivision')
BEGIN
DROP PROCEDURE sp_extract_subdivision
END
GO
CREATE PROCEDURE sp_extract_subdivision
@subdivsion_cd char(2)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
declare @strap varchar(25)
-- Clear existing record
delete from dbo.subdivision_extract
-- Select list of straps to loop through
declare strapList cursor for
select strap from dbo.parcel where county_cd = @subdivsion_cd
--Loop through straps and populate extract table values
BEGIN TRY
OPEN strapList
FETCH NEXT FROM strapList INTO @strap
WHILE @@FETCH_STATUS = 0
BEGIN
IF @strap IS NOT NULL
BEGIN
insert into dbo.subdivision_extract (acct_num) values (RTRIM(@strap))
exec sp_extract_parcel @strap
exec sp_extract_detail @strap
exec sp_extract_lnd_c @strap
exec sp_extract_parcel_flg @strap
exec sp_extract_owner @strap
exec sp_extract_mail @strap
exec sp_extract_legal_ln @strap
exec sp_extract_site @strap
exec sp_extract_condo_unit @strap
exec sp_extract_personal_x @strap
exec sp_extract_personal_x_dist @strap
exec sp_extract_phase_in @strap
exec sp_extract_p_tax_dist @strap
exec sp_extract_parcel_rel @strap
exec sp_extract_entzone @strap
exec sp_extract_dates @strap
exec sp_extract_sales @strap
exec sp_extract_sale_dtl @strap
exec sp_extract_pchar @strap
exec sp_extract_protest @strap
END
FETCH NEXT FROM strapList INTO @strap
END
CLOSE strapList
DEALLOCATE strapList
END TRY
BEGIN CATCH
SELECT ERROR_NUMBER() as ErrorNumber,
ERROR_MESSAGE() as ErrorMessage,
ERROR_PROCEDURE() as ExecutingProcedure,
ERROR_LINE() as LineNumber
CLOSE strapList
DEALLOCATE strapList
END CATCH
END
GO
Any way to speed this up?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
加快速度的最佳方法是编写您在每一行中调用的存储过程的版本,以便它们针对整个集合运行,并完全放弃光标。否则,将游标指定为 FORWARD_ONLY 可能会带来一点好处,但我看不出还有什么可以做的。
The best way to speed this up involves writing versions of the stored procedures that you're calling with every row so that they run against the whole set, and ditching your cursor altogether. Otherwise, you might get a small benefit from specifying the cursor as FORWARD_ONLY but I don't see much else that can be done.
这里真正的问题是您通过游标顺序调用 20 个存储过程。
我一开始就讨厌光标,并在之前的项目中提出了解决方案。
您是否能够对所有数据依次运行 20 个存储过程,而不是从游标获取变量?
我建议使用一个临时表,其中包含数据的主键和一个状态整数,该整数显示哪些数据已处理以及处理到哪一步。然后可以调用每个存储过程来处理所有行。
如果您确实想很好地利用它,请让每个存储过程进程一次处理 5% 的行,然后在循环之前使用 WAITFOR 允许短暂暂停,直到每个阶段处理完所有记录。如果每个进程的时间合理,它将确保锁仍然可以分配给其他进程,这样更重要的进程就不会因为无法获取锁而超时。
从 dbo.subdivision_extract 中删除
需要多长时间?如果需要一段时间并且不需要日志(并且表上没有触发器),请尝试将其更改为 TRUNCATE TABLE dbo.subdivision_extractTLDR:重新开发存储过程以处理所有数据,那么您只需调用 20 个存储过程,每个存储过程一次。
The real problem here is the fact that you're calling 20 stored procedures sequentially via a cursor.
I hate cursors for a start, and have come up with solution for this on previous projects.
Instead of getting a variable from the cursor, are you able to run the 20 stored procedures sequentially for all of the data?
I suggest having a temporary table with the primary key of the data and a status integer which shows which have been processed and to which point. Each stored procedure can then be called in order to process all of the rows.
If you really want to do a nice job with it, have each stored proc process say 5% of the rows at a time, and then allow a small pause using WAITFOR before looping until all of the records have been processed by each stage. If the process time for each is reasonable, it will make sure that locks can still be allocated to other processes so more important processes do not time out because they can't acquire a lock.
How long does the
delete from dbo.subdivision_extract
take? If it takes a while and the log is not required (and you have no triggers on the table), try changing it toTRUNCATE TABLE dbo.subdivision_extract
TLDR: Redevelop the stored procs to process all of the data, then you'll only need to call 20 stored procs once each.
您正在为此存储中的每个循环调用多个存储过程。我不知道其他人做了什么,但似乎他们也在查询/修改一些数据。您应该考虑将存储连接到单个记录中,并在多条记录的块中执行查询,而不是对每个带进行循环。
You are calling several stored procedures for each loop in this stored. I don't know what the others do, but it seems that they are also querying/modifying few data. You should consider joining the storeds in a single one and perform the querys in blocks of several records instead of loop for every strap.
如果您要将数据提取到文本文件,则您自己应该以基于集合的方式执行此操作,或者至少使用 SSIS。为每一行运行多个存储过程的游标绝对是您可以用于此类事情的最糟糕的方法。我敢打赌您可以在 SSIS 包中完成此操作,并且只需几分钟而不是 9 个小时。
If you are extracting data to a text file, you owe it to yourself to do it in a set-based manner or at least use SSIS. A cursor running mulitple stored procs for each row is the absolute worst method you can use for this sort of thing. I'd bet you can do this in an SSIS package and take minutes instead of 9 hours.
是的,实际上非常容易修复:测量缓慢的部分,然后优化缓慢的部分。
您发布的只是一个 T-SQL 脚本,该脚本在性能方面尽可能不透明。一个 DELETE、一个 SELECT、一个带有 INSERT 的游标迭代和一堆 EXEC。问题可能出现在其中的任何位置,因此最好的解决方案是测量并查看问题可能出在哪里。
获取脚本并在开始处、DELETE 之后、第一次 FETCH 之后、然后在每次 EXEC 之后添加一个
PRINT GETDATE();
并执行一次迭代(删除循环内的 FETCH)。查看 PRINT 输出,您可以从中推断出执行每个步骤所需的时间。他们中有人脱颖而出吗?附加事件探查器和监视器 SP:StmtCompleted,并使用筛选器期间。再次运行提取循环的一次迭代。哪些陈述的持续时间最长?
在 SSMS 中运行单次迭代脚本,但选中工具栏中的“包括实际执行计划”按钮。在结果执行显示中,哪些语句相对于批处理而言成本较高?
您必须缩小问题范围,因为该脚本无法诊断,该脚本中没有完成任何实际工作,它只是调用其他程序来完成工作。一旦您确定了此脚本调用的过程中实际缓慢的语句,那么您应该再次发布,给出存在问题的确切语句以及数据的确切架构 (包括所有指标定义),并针对具体问题提出解决方案。
虽然从理论上讲,面向集合的处理可能比游标具有更好的性能,但实际上,可能不可能编写一条语句来提取与 20 个存储过程执行调用相同的信息,除非这些过程是极其琐碎的单个 SELECT。
Yes, actualy is extreamly easy to fix: measure what is slow, then optimize the slow part.
All you posted is a T-SQL script that is as opaque as it can get in regard to performance. A DELETE, a SELECT, a cursor iteration with an INSERT and a bunch of EXECs. The problem can be anywhere in these, so the best solution is to measure and see where the problem might be.
Take the script and add a
PRINT GETDATE();
at start, after the DELETE, after the first FETCH, then after each EXEC and execute one single iteration (remove the FETCH inside the loop). Look at the PRINT output, you can deduce the time it takes to execute each step from it. Does any of them stand out?Attach Profiler and monitor for the event SP:StmtCompleted with a filter on Duration. Run again one single iteration of the extraction loop. Which statements stand out highest in Duration?
Run the script for a single iteration in SSMS, but check the Include Actual Execution Plan button in the toolbar. In the resulted execution show, which statements stand out as high cost relative to the batch?
You must narrow down your problem, the script as it is it's impossible to diagnose, there isn't any actual work done in this script, it just calls other procedures to do the work. Once you identified the actual slow statements inside the procedures invoked by this script, then you should post again, give the exact statment that has the problem and the exact schema of your data (including all index definitions), and ask solutions for specific problems.
While in theory a set oriented processing could have better performance than a cursor, in practice it would be probably impossible to write a single statement that extracts the same information as 20 stored procedure execution invocations, unless those procedure are extremly trivial single SELECTs.