索引如何在 SQL 用户定义类型 (UDT) 上工作?
这已经困扰我一段时间了,我希望 SQL Server 专家之一能够对此进行一些说明。
问题是:
当您对包含 UDT(CLR 类型)的 SQL Server 列建立索引时,SQL Server 如何确定对给定查询执行什么索引操作?
具体来说,我正在考虑 hierarchyid
(又名SqlHierarchyID
)类型。 Microsoft 建议您使用它的方式 - 以及我使用它的方式 - 是:
在
hierarchyid
列本身上创建索引(我们称之为ID
) 。这可以实现深度优先搜索,因此当您编写WHERE ID.IsDescendantOf(@ParentID) = 1
时,它可以执行索引查找。创建一个持久计算的
Level
列,并在(Level, ID)
上创建索引。这可以实现广度优先搜索,因此当您编写WHERE ID.GetAncestor(1) = @ParentID
时,它可以为此表达式执行索引查找(在第二个索引上)。
但我不明白的是这怎么可能?它似乎违反了正常的查询计划规则 - 对GetAncestor
和IsDescendantOf
的调用似乎不可控制,因此这应该导致完整索引扫描,但事实并非如此。显然,我并不是在抱怨,而是我试图了解是否可以在我自己的 UDT 上复制此功能。
hierarchyid 是否只是 SQL Server 具有特殊感知能力的“神奇”类型,如果它发现查询元素和索引的某种组合,就会自动更改执行计划?或者,SqlHierarchyID
CLR 类型是否只是定义 SQL Server 引擎可以理解的特殊属性/方法(类似于 IsDeterministic
用于持久计算列的方式)?
我似乎找不到任何有关此的信息。我所能找到的只是一段文字,指出 IsByteOrdered 属性通过保证每个实例有一个唯一的表示来使索引和检查约束之类的事情成为可能;虽然这有点有趣,但它并没有解释 SQL Server 如何使用某些实例方法执行seek。
那么问题又来了 - 索引操作如何适用于像 hierarchyid 这样的类型,以及是否有可能在新的 UDT 中获得相同的行为?
This has been bugging me for a while and I'm hoping that one of the SQL Server experts can shed some light on it.
The question is:
When you index a SQL Server column containing a UDT (CLR type), how does SQL Server determine what index operation to perform for a given query?
Specifically I am thinking of the hierarchyid
(AKA SqlHierarchyID
) type. The way Microsoft recommends that you use it - and the way I do use it - is:
Create an index on the
hierarchyid
column itself (let's call itID
). This enables a depth-first search, so that when you writeWHERE ID.IsDescendantOf(@ParentID) = 1
, it can perform an index seek.Create a persisted computed
Level
column and create an index on(Level, ID)
. This enables a breadth-first search, so that when you writeWHERE ID.GetAncestor(1) = @ParentID
, it can perform an index seek (on the second index) for this expression.
But what I don't understand is how is this possible? It seems to violate the normal query plan rules - the calls to GetAncestor
and IsDescendantOf
don't appear to be sargable, so this should result in a full index scan, but it doesn't. Not that I am complaining, obviously, but I am trying to understand if it's possible to replicate this functionality on my own UDTs.
Is hierarchyid
simply a "magical" type that SQL Server has a special awareness of, and automatically alters the execution plan if it finds a certain combination of query elements and indexes? Or does the SqlHierarchyID
CLR type simply define special attributes/methods (similar to the way IsDeterministic
works for persisted computed columns) that are understood by the SQL Server engine?
I can't seem to find any information about this. All I've been able to locate is a paragraph stating that the IsByteOrdered
property makes things like indexes and check constraints possible by guaranteeing one unique representation per instance; while this is somewhat interesting, it doesn't explain how SQL Server is able to perform a seek with certain instance methods.
So the question again - how do the index operations work for types like hierarchyid
, and is it possible to get the same behaviour in a new UDT?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
查询优化器团队正在尝试处理不改变事物顺序的场景。例如,
cast(someDateTime as date)
仍然是可控制的。我希望随着时间的推移,他们会修复一些旧的,例如带有常量的 dateadd/datediff 。所以...处理 Ancestor 实际上就像在字符串开头使用 LIKE 运算符一样。它不会改变顺序,而且你仍然可以逃脱惩罚。
The query optimizer team is trying to handle scenarios that don't change the order of things. For example,
cast(someDateTime as date)
is still sargable. I'm hoping that as time continues, they fix up a bunch of old ones, such as dateadd/datediff with a constant.So... handling Ancestor is effectively like using the LIKE operator with the start of a string. It doesn't change the order, and you can still get away with stuff.
您是对的 - HierarchyId 和 Geometry/Geography 都是查询优化器能够识别并重写计划以生成优化查询的“神奇”类型 - 它并不像识别可优化操作符那么简单。无法模拟其他 UDT 的等效行为。
对于 HierarchyId,类型的二进制序列化是特殊的,以便以二进制有序方式表示层次结构。它类似于 SQL Xml 类型使用的机制,并在研究论文中描述 ORDPATH :插入友好的 XML 节点标签。因此,虽然用于转换使用 IsDescendant 和 GetAncestor 的查询的 QO 规则很特殊,但实际的底层索引是二进制 Hierarchyid 数据上的常规关系索引,如果您愿意编写原始查询来执行范围搜索,则可以实现相同的行为而不是调用简单的方法。
You are correct - HierarchyId and Geometry/Geography are both "magical" types that the Query Optimizer is able to recognize and rewrite the plans for in order to produce optimized queries - it's not as simple as just recognizing sargable operators. There is no way to simulate equivalent behavior with other UDTs.
For HierarchyId, the binary serialization of the type is special in order to represent the hierarchical structure in a binary ordered fashion. It is similar to the mechanism used by the SQL Xml type and described in a research paper ORDPATHs: Insert-Friendly XML Node Labels. So while the QO rules to translate queries that use IsDescendant and GetAncestor are special, the actual underlying index is a regular relational index on the binary hierarchyid data and you could achieve the same behavior if you were willing to write your original queries to do range seeks instead of calling the simple method.