使用 LIKE 和 EXISTS 子句时优化数据库架构/索引以获得更快的查询结果
在 SQL 2005 服务器数据库上实现树结构时,当使用 LIKE 子句 与 EXISTS 子句 结合使用时,查询响应时间过长(下面的查询超过 5 秒)强>。
慢速查询涉及两个表 - [SitePath_T] 和 [UserSiteRight_T] :
CREATE TABLE [dbo].[UserSiteRight_T](
[UserID_i] [int] NOT NULL
, [SiteID_i] [int] NOT NULL
, CONSTRAINT [PKC_UserSiteRight_UserIDSiteID] PRIMARY KEY CLUSTERED ( [UserID_i] ASC, [SiteID_i] ASC )
, CONSTRAINT [FK_UserSiteRight_UserID] FOREIGN KEY( [UserID_i] ) REFERENCES [dbo].[User_T] ( [ID_i] )
, CONSTRAINT [FK_UserSiteRight_SiteID] FOREIGN KEY( [SiteID_i] ) REFERENCES [dbo].[Site_T] ( [ID_i] )
)
UserID_i = 2484< 的行数 ( rights ) [UserSiteRight_T] 表中的 /strong> 非常小:545
(UserID_i = 2484 是随机选择的)
此外,数据库相对较小 - [SitePath_T] 表中只有 23000 行
CREATE TABLE [dbo].[SitePath_T] (
[SiteID_i] INT NOT NULL,
[Path_v] VARCHAR(255) NOT NULL,
CONSTRAINT [PK_SitePath_PathSiteID] PRIMARY KEY CLUSTERED ( [Path_v] ASC, [SiteID_i] ASC ),
CONSTRAINT [AK_SitePath_Path] UNIQUE NONCLUSTERED ( [Path_v] ASC ),
CONSTRAINT [FK_SitePath_SiteID] FOREIGN KEY( [SiteID_i] ) REFERENCES [Site_T] ( [ID_i] )
:)
我试图仅获取具有可由特定UserID(由[UserSiteRight_T]表给出)访问的子网站的SiteID,如下
SELECT sp.SiteID_i
FROM SitePath_t sp
WHERE EXISTS ( SELECT *
FROM [dbo].[SitePath_T] usp
, [dbo].[UserSiteRight_T] uusr
WHERE uusr.SiteID_i = usp.SiteID_i
AND uusr.UserID_i = 2484
AND usp.Path_v LIKE sp.Path_v+'%' )
所示 :可以找到结果的一部分,其中只需要/返回列 sp.SiteID_i - 我还添加了相关的相应 Path_v、UserSiteRight_T.SiteID_i WHERE UserID = 2484 以及符合 LIKE 条件的相应 SitePath_T SiteID_i 和 Path_v :
sp.SiteID_i sp.Path_v [UserSiteRight_T].SiteID_i usp.SiteID_i usp.Path_v
1 '1.' NULL 10054 '1.10054.'
10054 '1.10054.' 10054 10054 '1.10054.'
10275 '1.10275.' 10275 10275 '1.10275.'
1533 '1.1533.' NULL 2697 '1.1533.2689.2693.2697.'
2689 '1.1533.2689.' NULL 2697 '1.1533.2689.2693.2697.'
2693 '1.1533.2689.2693.' NULL 2697 '1.1533.2689.2693.2697.'
2697 '1.1533.2689.2693.2697.' 2697 2697 '1.1533.2689.2693.2697.'
1580 '1.1580.' NULL 1581 '1.1580.1581.'
1581 '1.1580.1581.' 1581 1581 '1.1580.1581.'
1585 '1.1580.1581.1585.' 1585 1585 '1.1580.1581.1585.'
222 '1.222.' 222 222 '1.222.'
223 '1.222.223.' 223 223 '1.222.223.'
224 '1.222.223.224.' 224 224 '1.222.223.224.'
3103 '1.3103.' NULL 3537 '1.3103.3529.3533.3537.'
3529 '1.3103.3529.' NULL 3537 '1.3103.3529.3533.3537.'
3533 '1.3103.3529.3533.' NULL 3537 '1.3103.3529.3533.3537.'
3537 '1.3103.3529.3533.3537.' 3537 3537 '1.3103.3529.3533.3537.'
上述查询的执行计划:
|--Nested Loops(Left Semi Join, WHERE:([MyTestDB].[dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1007]))
|--Compute Scalar(DEFINE:([Expr1007]=[MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1008]=LikeRangeStart([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1009]=LikeRangeEnd([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1010]=LikeRangeInfo([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
| |--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [sp]))
|--Table Spool
|--Hash Match(Inner Join, HASH:([uusr].[SiteID_i])=([usp].[SiteID_i]))
|--Clustered Index Seek(OBJECT:([MyTestDB].[dbo].[UserSiteRight_T].[PKC_UserSiteRight_UserIDSiteID] AS [uusr]), SEEK:([uusr].[UserID_i]=(2484)) ORDERED FORWARD)
|--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [usp]))
以及重写的查询:
SELECT DISTINCT
sp.SiteID_i
FROM [dbo].[SitePath_t] sp
, [dbo].[SitePath_T] usp
, [dbo].[UserSiteRight_T] uusr
WHERE ( uusr.SiteID_i = usp.SiteID_i
AND uusr.UserID_i = 2484
AND usp.Path_v LIKE sp.Path_v+'%' )
ORDER BY SiteID_i ASC
执行计划:
|--Hash Match(Aggregate, HASH:([sp].[SiteID_i]))
|--Nested Loops(Inner Join, WHERE:([MyTestDB].[dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1006]))
|--Hash Match(Inner Join, HASH:([uusr].[SiteID_i])=([usp].[SiteID_i]))
| |--Clustered Index Seek(OBJECT:([MyTestDB].[dbo].[UserSiteRight_T].[PKC_UserSiteRight_UserIDSiteID] AS [uusr]), SEEK:([uusr].[UserID_i]=(2484)) ORDERED FORWARD)
| |--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [usp]))
|--Table Spool
|--Compute Scalar(DEFINE:([Expr1006]=[MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1007]=LikeRangeStart([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1008]=LikeRangeEnd([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1009]=LikeRangeInfo([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
|--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [sp]))
所有索引均已就位 - 数据库引擎优化顾问不建议新的架构修改 - 但两个查询都在 5 秒以上返回正确的结果 - 并且,因为它是 Ajax 请求的响应 - 在更新导航树时感觉(并且)非常慢 有
任何建议来优化/修改数据库模式/索引/查询以获得更快的响应吗?
谢谢
While implementing a tree structure over a SQL 2005 server database, the query response is taking too long ( below queries are talking more than 5 sec ) when using LIKE clause combined with the EXISTS clause.
The slow queries involve two tables - [SitePath_T] and [UserSiteRight_T] :
CREATE TABLE [dbo].[UserSiteRight_T](
[UserID_i] [int] NOT NULL
, [SiteID_i] [int] NOT NULL
, CONSTRAINT [PKC_UserSiteRight_UserIDSiteID] PRIMARY KEY CLUSTERED ( [UserID_i] ASC, [SiteID_i] ASC )
, CONSTRAINT [FK_UserSiteRight_UserID] FOREIGN KEY( [UserID_i] ) REFERENCES [dbo].[User_T] ( [ID_i] )
, CONSTRAINT [FK_UserSiteRight_SiteID] FOREIGN KEY( [SiteID_i] ) REFERENCES [dbo].[Site_T] ( [ID_i] )
)
Number of rows ( rights ) for UserID_i = 2484 in [UserSiteRight_T] table is quite small : 545
( UserID_i = 2484 was randomly chosen )
Also, the database is relatively small - only 23000 rows in the [SitePath_T] table :
CREATE TABLE [dbo].[SitePath_T] (
[SiteID_i] INT NOT NULL,
[Path_v] VARCHAR(255) NOT NULL,
CONSTRAINT [PK_SitePath_PathSiteID] PRIMARY KEY CLUSTERED ( [Path_v] ASC, [SiteID_i] ASC ),
CONSTRAINT [AK_SitePath_Path] UNIQUE NONCLUSTERED ( [Path_v] ASC ),
CONSTRAINT [FK_SitePath_SiteID] FOREIGN KEY( [SiteID_i] ) REFERENCES [Site_T] ( [ID_i] )
)
I am trying to get only the SiteIDs which have subsites accessible by a certain UserID ( given by the [UserSiteRight_T] table ) as :
SELECT sp.SiteID_i
FROM SitePath_t sp
WHERE EXISTS ( SELECT *
FROM [dbo].[SitePath_T] usp
, [dbo].[UserSiteRight_T] uusr
WHERE uusr.SiteID_i = usp.SiteID_i
AND uusr.UserID_i = 2484
AND usp.Path_v LIKE sp.Path_v+'%' )
Below you can find a part of the result where only column sp.SiteID_i is needed/returned - also I added the related corresponding Path_v, UserSiteRight_T.SiteID_i WHERE UserID = 2484 and the corresponding SitePath_T SiteID_i and Path_v matching the LIKE condition :
sp.SiteID_i sp.Path_v [UserSiteRight_T].SiteID_i usp.SiteID_i usp.Path_v
1 '1.' NULL 10054 '1.10054.'
10054 '1.10054.' 10054 10054 '1.10054.'
10275 '1.10275.' 10275 10275 '1.10275.'
1533 '1.1533.' NULL 2697 '1.1533.2689.2693.2697.'
2689 '1.1533.2689.' NULL 2697 '1.1533.2689.2693.2697.'
2693 '1.1533.2689.2693.' NULL 2697 '1.1533.2689.2693.2697.'
2697 '1.1533.2689.2693.2697.' 2697 2697 '1.1533.2689.2693.2697.'
1580 '1.1580.' NULL 1581 '1.1580.1581.'
1581 '1.1580.1581.' 1581 1581 '1.1580.1581.'
1585 '1.1580.1581.1585.' 1585 1585 '1.1580.1581.1585.'
222 '1.222.' 222 222 '1.222.'
223 '1.222.223.' 223 223 '1.222.223.'
224 '1.222.223.224.' 224 224 '1.222.223.224.'
3103 '1.3103.' NULL 3537 '1.3103.3529.3533.3537.'
3529 '1.3103.3529.' NULL 3537 '1.3103.3529.3533.3537.'
3533 '1.3103.3529.3533.' NULL 3537 '1.3103.3529.3533.3537.'
3537 '1.3103.3529.3533.3537.' 3537 3537 '1.3103.3529.3533.3537.'
Execution plan for the above query :
|--Nested Loops(Left Semi Join, WHERE:([MyTestDB].[dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1007]))
|--Compute Scalar(DEFINE:([Expr1007]=[MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1008]=LikeRangeStart([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1009]=LikeRangeEnd([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1010]=LikeRangeInfo([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
| |--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [sp]))
|--Table Spool
|--Hash Match(Inner Join, HASH:([uusr].[SiteID_i])=([usp].[SiteID_i]))
|--Clustered Index Seek(OBJECT:([MyTestDB].[dbo].[UserSiteRight_T].[PKC_UserSiteRight_UserIDSiteID] AS [uusr]), SEEK:([uusr].[UserID_i]=(2484)) ORDERED FORWARD)
|--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [usp]))
And the rewritten query :
SELECT DISTINCT
sp.SiteID_i
FROM [dbo].[SitePath_t] sp
, [dbo].[SitePath_T] usp
, [dbo].[UserSiteRight_T] uusr
WHERE ( uusr.SiteID_i = usp.SiteID_i
AND uusr.UserID_i = 2484
AND usp.Path_v LIKE sp.Path_v+'%' )
ORDER BY SiteID_i ASC
Execution plan :
|--Hash Match(Aggregate, HASH:([sp].[SiteID_i]))
|--Nested Loops(Inner Join, WHERE:([MyTestDB].[dbo].[SitePath_T].[Path_v] as [usp].[Path_v] like [Expr1006]))
|--Hash Match(Inner Join, HASH:([uusr].[SiteID_i])=([usp].[SiteID_i]))
| |--Clustered Index Seek(OBJECT:([MyTestDB].[dbo].[UserSiteRight_T].[PKC_UserSiteRight_UserIDSiteID] AS [uusr]), SEEK:([uusr].[UserID_i]=(2484)) ORDERED FORWARD)
| |--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [usp]))
|--Table Spool
|--Compute Scalar(DEFINE:([Expr1006]=[MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%', [Expr1007]=LikeRangeStart([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1008]=LikeRangeEnd([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%'), [Expr1009]=LikeRangeInfo([MyTestDB].[dbo].[SitePath_T].[Path_v] as [sp].[Path_v]+'%')))
|--Index Scan(OBJECT:([MyTestDB].[dbo].[SitePath_T].[AK_SitePath_Path] AS [sp]))
All the indexes are in place - Database Engine Tuning Advisor is not suggesting new schema modification - but both queries are returning the correct result in more that 5 seconds - and, as it's a response of an Ajax reques - feels ( and is ) very slow when updating the navigation tree
Any suggestions to optimize / modify database schema / indexes / queries in order to get a faster response ?
Thank you
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
基于:(
基于您正在进行半连接的事实,这很好)。
它首先(正确地)关注 uusr 表,以查找该用户的记录。它已经对此进行了 CIX Seek,这很好。从那里,它根据 SiteID_i 字段在 usp 中查找相应的记录。
因此,接下来考虑它想要通过 SiteID_i 查找站点的事实,以及您希望这是哪种类型的联接。
合并连接怎么样?那就太好了,但需要双方对数据进行排序。如果索引的顺序正确,那就没问题了……
之后,您希望根据路径查找内容。那么怎么样:
然后在 SitePath_T 上使用另一个索引来查找您想要的 SiteID:
最后一个索引可能使用了嵌套循环,但这希望不会太糟糕。将影响您的系统的是前两个索引,这应该让您在 EXISTS 子句中看到两个表之间的合并联接。
Based on:
(which is just fine based on the fact that you're doing a Semi Join).
It's focussing (rightly) on the uusr table first, to find the records for that user. It's already doing a CIX Seek on that, which is good. From there, it's finding the corresponding records in usp according to the SiteID_i fields.
So next consider the fact that it wants to find the Sites by SiteID_i, and what kind of join you want this to be.
How about a Merge Join? That would be nice, but requires the data to be sorted on both sides. That's fine if the indexes are in the right order...
...and after that, you want to be finding stuff based on the Path. So how about:
And then another index on SitePath_T that finds the SiteIDs you want:
There may be a Nested Loop used on this final one, but that's hopefully not too bad. The thing that's going to impact your system will be the first two indexes, which should let you see a Merge Join between the two tables in your EXISTS clause.
我会尝试在您的
UserSiteRight_T
表中的外键上添加索引 - 它们尚未建立索引,并且这些字段上的索引应该会加快查找速度:在您的 SitePath_T 表上也是如此:
尝试将这些放在适当的位置,然后再次运行查询,并比较运行时间和执行计划 - 您看到任何改进吗?
这是一个常见的误解,但 SQL Server 不会自动在外键列上放置索引(如
SitePath_T
上的SiteID_i
),即使普遍的共识是,外键很有用,并且有可能加速引用完整性的实施以及对这些外键的 JOIN。I would try to add an index on the foreign keys in your
UserSiteRight_T
table - they're not yet indexed, and an index on those fields should speed up the lookups:and on your SitePath_T table as well:
Try to put these in place, then run your queries again, and compare the run times and the execution plans - do you see any improvement??
It's a common misconception, but SQL Server does not automatically put an index on a foreign key column (like
SiteID_i
onSitePath_T
), even though the general consensus is that a foreign key is useful and potentially speeds up both enforcement of referential integrity, as well as JOINs over those foreign keys.在 SitePath_T 上自我加入寻找父母会害死你。也许您应该为 ParentSiteID_i 添加一列并使用正常的递归 CTE?
然后它变成:
在 SitePath_T (ParentSiteID_i) 上添加一个索引,性能应该很快。
The self join on SitePath_T to find parents is killing you. Perhaps you should add a column for ParentSiteID_i and use a normal recursive CTE?
Then it becomes:
Throw an index on SitePath_T (ParentSiteID_i) and performance should be snappy.