SQL Server HierarchyID depth-first performance issue

Posted 2024-08-31 00:40:16

I am trying to implement hierarchyID in a table (dbo.[Message]) containing roughly 50,000 rows (it will grow substantially in the future). However, it takes 30-40 seconds to retrieve about 25 results.

The root node is a filler that provides uniqueness, so every subsequent row sits beneath that dummy row.

I need to be able to traverse the table depth-first, so I have made the hierarchyID column (dbo.[Message].MessageID) the clustered primary key. I have also added a computed smallint column (dbo.[Message].Hierarchy) that stores the level of the node.

Usage: a .NET application passes a hierarchyID value into the database, and I want to retrieve all (if any) children and parents of that node (excluding the root, as it is filler).

A simplified version of the query I am using:

DECLARE @MessageID hierarchyid;   /* passed in from application */

SELECT m.MessageID, m.MessageComment
FROM dbo.[Message] AS m
-- GetAncestor(GetLevel() - 1) is the top-level node directly under the filler root
WHERE m.MessageID.IsDescendantOf(@MessageID.GetAncestor(@MessageID.GetLevel() - 1)) = 1
ORDER BY m.MessageID;

From what I understand, the optimizer should pick up the index automatically, without a hint.

From searching forums I have seen people utilizing index hints when dealing with breadth-first indexes, but have not observed this application in depth-first situations. Would that be a relevant approach for my scenario?
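
For reference, an index hint on this query could be written as in the sketch below. The index name PK_Message is an assumption (the actual name of the clustered primary key is not given above), and the sample node value is made up. SET STATISTICS IO and SET STATISTICS TIME make it easy to compare logical reads and elapsed time with and without the hint.

```sql
-- Sketch only: PK_Message is a hypothetical name for the clustered primary key on MessageID.
DECLARE @MessageID hierarchyid = hierarchyid::Parse('/1/3/2/');  -- sample value, normally passed in

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT m.MessageID, m.MessageComment
FROM dbo.[Message] AS m WITH (INDEX(PK_Message))   -- table hint naming the index to use
WHERE m.MessageID.IsDescendantOf(@MessageID.GetAncestor(@MessageID.GetLevel() - 1)) = 1
ORDER BY m.MessageID;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

A hint of this kind only selects the index; whether it is then used for a seek or a scan still depends on the predicate.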

I have spent the past few days trying to find a solution to this issue, but to no avail.
I would greatly appreciate any assistance. This is my first post, so I apologize in advance if this is considered a 'noobish' question. I have read the MS documentation and searched countless forums, but have not come across a succinct description of this specific issue.

Comments (2)

柒七 2024-09-07 00:40:16

It's not entirely clear whether you're trying to optimize for depth-first or breadth-first search; the question suggests depth-first, but the comments at the end are about breadth-first.

You have all the indexes you need for depth-first (just index the hierarchyid column). For breadth-first, it's not enough to create the computed level column; you have to index it too:

ALTER TABLE Message
ADD [Level] AS MessageID.GetLevel();

CREATE INDEX IX_Message_BreadthFirst
ON Message ([Level], MessageID)
INCLUDE (...);

(Note that for non-clustered indexes you'll most likely need the INCLUDE - otherwise, SQL Server may resort to doing a clustered index scan instead.)
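
As an illustration of the kind of query a breadth-first index is meant for, the sketch below fetches only the direct children of one node. The variable @Parent and its sample value are assumptions; [Level] is the computed column defined above.

```sql
DECLARE @Parent hierarchyid = hierarchyid::Parse('/1/');  -- hypothetical parent node

SELECT m.MessageID, m.MessageComment
FROM Message AS m
WHERE m.[Level] = @Parent.GetLevel() + 1           -- exactly one level below the parent
  AND m.MessageID.IsDescendantOf(@Parent) = 1      -- and inside the parent's subtree
ORDER BY m.MessageID;
```

Since this query selects MessageComment, that column would be a natural candidate for the INCLUDE list.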

Now, if you're trying to find all ancestors of a node, you want to take a slightly different tack. You can make these searches lightning-fast, because - and here's what's cool about hierarchyid - each node already "contains" all of its ancestors.
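
To make that concrete, a hierarchyid value can be inspected directly, with no table involved. The path '/1/3/2/' below is an arbitrary sample value.

```sql
DECLARE @h hierarchyid = hierarchyid::Parse('/1/3/2/');  -- arbitrary sample node

SELECT @h.ToString()                AS NodeText,         -- /1/3/2/
       @h.GetLevel()                AS NodeLevel,        -- 3
       @h.GetAncestor(1).ToString() AS ParentText,       -- /1/3/
       @h.GetAncestor(2).ToString() AS GrandparentText,  -- /1/
       @h.GetAncestor(3).ToString() AS RootText;         -- /
```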

I use a CLR function to make this as fast as possible, but you can do it with a recursive CTE:

CREATE FUNCTION dbo.GetAncestors
(
    @h hierarchyid
)
RETURNS TABLE
AS RETURN
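WITH Hierarchy_CTE AS
(
    -- Anchor: start from the node that was passed in
    SELECT @h AS id

    UNION ALL

    -- Recursive step: climb one level at a time until the root is reached
    SELECT h.id.GetAncestor(1)
    FROM Hierarchy_CTE h
    WHERE h.id <> hierarchyid::GetRoot()
)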
SELECT id FROM Hierarchy_CTE
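
A quick way to check the function is to call it with a sample node. The path below is an assumption, and the ORDER BY is there only to make the output order predictable.

```sql
DECLARE @h hierarchyid = hierarchyid::Parse('/1/3/2/');  -- arbitrary sample node

SELECT a.id.ToString() AS AncestorText
FROM dbo.GetAncestors(@h) AS a
ORDER BY a.id.GetLevel() DESC;
-- Rows: /1/3/2/, /1/3/, /1/, /
```

Note that the result includes both the starting node and the root. Adding WHERE a.id <> hierarchyid::GetRoot() to the calling query would drop a filler root.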

Now, to get all of the ancestors and descendants, use it like this:

DECLARE @MessageID hierarchyID   /* passed in from application */

SELECT m.MessageID, m.MessageComment 
FROM Message as m
WHERE m.MessageId.IsDescendantOf(@MessageID) = 1
OR m.MessageId IN (SELECT id FROM dbo.GetAncestors(@MessageID.GetAncestor(1)))
ORDER BY m.MessageID

Try it out - this should solve your performance problems.
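
If the OR between the two predicates keeps the optimizer from seeking, one variant to try is splitting the query into two branches. This is a sketch rather than a measured improvement. It relies on the two branches being disjoint (the node and its descendants versus its strict ancestors), which is why UNION ALL is safe here. The sample value is hypothetical.

```sql
DECLARE @MessageID hierarchyid = hierarchyid::Parse('/1/3/2/');  -- sample value, normally passed in

SELECT r.MessageID, r.MessageComment
FROM (
    -- Branch 1: the node itself and everything beneath it
    SELECT m.MessageID, m.MessageComment
    FROM Message AS m
    WHERE m.MessageID.IsDescendantOf(@MessageID) = 1

    UNION ALL

    -- Branch 2: strict ancestors, looked up by key
    SELECT m.MessageID, m.MessageComment
    FROM dbo.GetAncestors(@MessageID.GetAncestor(1)) AS a
    JOIN Message AS m ON m.MessageID = a.id
) AS r
ORDER BY r.MessageID;
```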

愁以何悠 2024-09-07 00:40:16

Found a workaround here:
http://connect.microsoft.com/SQLServer/feedback/details/532406/performance-issue-with-hierarchyid-fun-isdescendantof-in-where-clause#

As a reminder, I start with a hierarchyID passed in from the application, and my goal is to retrieve all relatives of that value (both ancestors and descendants).

In my specific example, I had to add the following declarations before the SELECT statement:

-- Top-level ancestor of the node (the level-1 node directly under the root)
DECLARE @topNode hierarchyid = @messageID.GetAncestor(@messageID.GetLevel() - 1);
DECLARE @topNodeParent hierarchyid = @topNode.GetAncestor(1);
-- Keys under the same parent that sort before and after @topNode
DECLARE @leftNode hierarchyid = @topNodeParent.GetDescendant(NULL, @topNode);
DECLARE @rightNode hierarchyid = @topNodeParent.GetDescendant(@topNode, NULL);

The WHERE clause has been changed to:

messageid.IsDescendantOf(@topNode)=1 AND (messageid > @leftNode ) AND (messageid < @rightNode )
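
Put together, the workaround might look like the sketch below. The sample value for @messageID is hypothetical, and the SELECT list and ORDER BY are borrowed from the query in the question. It assumes @messageID is not the root itself, since GetLevel() - 1 would then be negative.

```sql
DECLARE @messageID hierarchyid = hierarchyid::Parse('/1/3/2/');  -- sample value, normally passed in

-- Top-level ancestor (the level-1 node directly under the filler root)
DECLARE @topNode hierarchyid = @messageID.GetAncestor(@messageID.GetLevel() - 1);
DECLARE @topNodeParent hierarchyid = @topNode.GetAncestor(1);
-- Keys under the same parent that sort before and after @topNode
DECLARE @leftNode  hierarchyid = @topNodeParent.GetDescendant(NULL, @topNode);
DECLARE @rightNode hierarchyid = @topNodeParent.GetDescendant(@topNode, NULL);

SELECT m.MessageID, m.MessageComment
FROM dbo.[Message] AS m
WHERE m.MessageID.IsDescendantOf(@topNode) = 1
  AND m.MessageID > @leftNode     -- explicit range bounds, intended to allow an index seek
  AND m.MessageID < @rightNode
ORDER BY m.MessageID;
```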

The improvement in query performance is very significant:

For each value passed in, seek time now averages 20 ms (previously 120 to 420).

When querying 25 values, it previously took 25-35 seconds to return all related nodes (some values had many relatives, others none). It now takes only 2 seconds.

Thank you very much to all who have contributed to this issue on this site and on others.
