需要一些帮助来优化存储过程
我有一个存储过程,它正在构建动态 sql 查询,然后通过 exec(@sql) 运行它。
存储过程正在连接大约 12 个表。事实上,它的运行速度相对较快。但后来我需要添加一个额外的字段。为此,我创建了一个标量函数,如下所示:
SELECT @weight = @weight +COUNT(*) FROM dbo.UserPDMedication WHERE UserID = @userid
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND HoehnYarhID IS NOT null
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND DateOfBirth IS NOT NULL
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND GenderID IS NOT NULL
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND DateDiagnosed IS NOT null
它基本上只是一个根据用户填写的问题数量返回 int 的函数。因此,对于存储过程中的每个用户,都会调用此函数。存储过程如下所示:(
SELECT DISTINCT u.UserID, u.Healthy, u.DateOfBirth, u.City, st.StateCode AS State, u.GenderID, g.Gender, u.Latitude, u.Longitude, u.PDConditionID, u.Zip, u.Distance,
(SELECT TOP 1 EmailID FROM Messages m WHERE TrialID = ' + @trialID + ' AND ToUserID = u.userid AND LocationID = ' + @locationID + ') AS MessageID, dbo.UserWeightedValue(u.UserID) as wt
FROM [User] u
INNER JOIN aspnet_UsersInRoles uir ON u.AspnetUserID = uir.UserId
INNER JOIN aspnet_Roles r ON uir.RoleId = r.RoleId
FULL JOIN UserHealthCondition uhc ON u.UserID = uhc.UserID
FULL JOIN UserMotorSymptom ums ON u.UserID = ums.UserID
FULL JOIN UserNonMotorSymptom unms ON u.UserID = unms.UserID
FULL JOIN UserPDMedication updm ON u.UserID = updm.UserID
FULL JOIN UserPDTreatment updt ON u.UserID = updt.UserID
FULL JOIN UserSupplement us ON u.UserID = us.UserID
FULL JOIN UserPDGeneticMarker updgm ON u.UserID = updgm.UserID
FULL JOIN UserFamilyMember ufm ON u.UserID = ufm.UserID
FULL JOIN State st ON u.StateID = st.ID
FULL JOIN Gender g ON u.GenderID = g.ID
WHERE u.UserID IS NOT NULL
我删除了一些块以尝试保持简短)。该 get 在存储过程中作为动态字符串执行。关于如何优化它以加快速度有什么建议吗?
谢谢
编辑:我使用这里的建议组合来完成这项工作。尽管我将多个 select 语句合并为 2 个语句,但我保持了函数原样。然后,我采用原始存储过程并将 select 更改为 select into ##temp。然后我针对该临时表运行我的函数。执行时间下降至 3-4 秒。我想我必须承认这个问题,因为正是他的明确指出让我走上了正确的道路。但谢谢大家。
I have a stored procedure which is building a dynamic sql query and then running it via exec(@sql).
The stored proc is joining about 12 tables. As it was, it was running relatively quickly. But then i needed to added in an additional field. To do this, i created a scalar function, which looks like this:
SELECT @weight = @weight +COUNT(*) FROM dbo.UserPDMedication WHERE UserID = @userid
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND HoehnYarhID IS NOT null
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND DateOfBirth IS NOT NULL
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND GenderID IS NOT NULL
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND DateDiagnosed IS NOT null
It's basically just a function that will return an int based on how many questions a user has filled out. So for each user in the stored proc, this function gets called. The stored proc looks like this:
SELECT DISTINCT u.UserID, u.Healthy, u.DateOfBirth, u.City, st.StateCode AS State, u.GenderID, g.Gender, u.Latitude, u.Longitude, u.PDConditionID, u.Zip, u.Distance,
(SELECT TOP 1 EmailID FROM Messages m WHERE TrialID = ' + @trialID + ' AND ToUserID = u.userid AND LocationID = ' + @locationID + ') AS MessageID, dbo.UserWeightedValue(u.UserID) as wt
FROM [User] u
INNER JOIN aspnet_UsersInRoles uir ON u.AspnetUserID = uir.UserId
INNER JOIN aspnet_Roles r ON uir.RoleId = r.RoleId
FULL JOIN UserHealthCondition uhc ON u.UserID = uhc.UserID
FULL JOIN UserMotorSymptom ums ON u.UserID = ums.UserID
FULL JOIN UserNonMotorSymptom unms ON u.UserID = unms.UserID
FULL JOIN UserPDMedication updm ON u.UserID = updm.UserID
FULL JOIN UserPDTreatment updt ON u.UserID = updt.UserID
FULL JOIN UserSupplement us ON u.UserID = us.UserID
FULL JOIN UserPDGeneticMarker updgm ON u.UserID = updgm.UserID
FULL JOIN UserFamilyMember ufm ON u.UserID = ufm.UserID
FULL JOIN State st ON u.StateID = st.ID
FULL JOIN Gender g ON u.GenderID = g.ID
WHERE u.UserID IS NOT NULL
(i removed some chunks to try and keep this short). This get's executed as a dynamic string in the stored proc. Any tips on how i can optimize this to speed things up?
Thanks
EDIT: i got this working using a combination of suggestions here. I kept my function as is although i combined the multiple select statements into 2 statements.I then took the original stored proc and changed the select to a select into ##temp. And then i ran my function against that temp table. Execution time dropped down to 3-4 seconds. I think I will have to give credit to grant for this question since it was his pointing out distinct that put me on the right trail. But thank you to everyone.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
DISTINCT 绝对会像聚合一样导致性能下降。你真的需要它吗?通常,当您看到 DISTINCT 时,这表明数据或结构问题被消除重复项的能力所掩盖,而结构本应自行消除重复项。
之后,我希望将其移动为 JOIN,而不是 SELECT 列表中的相关查询。这并不是一定会成功,但优化器通常能够更好地将其纳入计划中。
根据您所呈现内容的复杂性,我还会查看执行计划。首先要检查,您是否进行了全面优化或是否超时。如果超时,那么您正在处理的是最佳猜测,而不是经过完全计算的“足够好”计划。如果是这样,您需要考虑简化此查询。如果你有一个足够好的计划,看看其中的瓶颈在哪里。
The DISTINCT is absolutely going to cause a performance hit as it does aggregations. Do you really need it? Frequently when you see DISTINCT it's an indication of a data or structural issue that is getting papered over by the ability to eliminate duplicates that the structure should elminate on it's own.
After that, instead of a correlated query in the SELECT list, I'd look to move that as a JOIN. It's not a sure fire win, but frequently the optimizer is better able to work that into the plan.
Based on the complexity of what you're presenting, I'd also look at the execution plan. First thing to check, do you have a full optimization or did it timeout. If it timed out, then you're dealing with a best guess, not a fully calculated "good enough" plan. If that's so, you need to look at simpllifying this query. If you have a good enough plan, see where the bottlenecks are within it.
如果
UserID
是User
表的主键,那么就不需要对用户填写的问题进行一次SELECT
,您可以将其包装在一个SELECT
中:If
UserID
is the primary key of the tableUser
, then there is no need to do oneSELECT
for question filled by the user, you can wrap it in just oneSELECT
:将标量值函数转换为内联表值函数。
Convert the scalar valued function into an inline table valued function.