一列上有多个索引
使用 Oracle,有一个名为 User 的表。
列:Id、FirstName、LastName
索引:1. PK
(Id)、2. UPPER
(FirstName)、3. LOWER
(FirstName) , 4. Index
(FirstName)
如您所见,索引 2、3、4 是同一列上的索引 - FirstName。
我知道这会产生开销,但我的问题是选择数据库将如何反应/优化?
例如:
从用户 u 的位置选择 ID u.FirstName LIKE 'MIKE%'
Oracle 会找到正确的索引吗?
问题是,通过 Hibernate 这会大大减慢查询速度(因此它使用准备好的语句)。
谢谢。
更新:只是为了澄清索引 2 和 3 是功能索引。
Using Oracle, there is a table called User.
Columns: Id, FirstName, LastName
Indexes: 1. PK
(Id), 2. UPPER
(FirstName), 3. LOWER
(FirstName), 4. Index
(FirstName)
As you can see index 2, 3, 4 are indexes on the same column - FirstName.
I know this creates overhead, but my question is on selecting how will the database react/optimize?
For instance:
SELECT Id FROM User u WHERE
u.FirstName LIKE 'MIKE%'
Will Oracle hit the right index or will it not?
The problem is that via Hibernate this slows down the query VERY much (so it uses prepared statements).
Thanks.
UPDATE: Just to clarify indexes 2 and 3 are functional indexes.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
除了 Mat 的观点(索引 2 或 3 应该是多余的,因为您应该选择一种方法来进行不区分大小写的搜索)以及 Richard 的观点(这将取决于索引的选择性)之外,请注意,当您正在使用 LIKE 子句。
假设您正在使用绑定变量(听起来您是基于预准备语句的使用),优化器必须猜测实际绑定值的选择性。像“S%”这样的短字符将非常没有选择性,导致优化器通常更喜欢表扫描。另一方面,像“Smithfield-Manning%”这样的较长字符串可能具有很强的选择性,并且可能会使用索引 4。Oracle 如何处理这种可变性将取决于版本。
在 Oracle 10 中,Oracle 引入了绑定变量查看。这意味着 Oracle 在重新引导后(或查询计划从共享池中老化后)第一次解析查询时,Oracle 会查看绑定值并根据该值决定使用哪个计划。假设您的大多数查询都将从索引扫描中受益,因为用户通常搜索相对选择性的值,如果重新启动后的第一个查询具有选择性条件,那就太好了。但是,如果您运气不好,有人在重新启动后立即执行了
WHERE firstname LIKE 'S%'
,那么您将一直使用表扫描查询计划,直到该查询计划从共享池中删除为止。然而,从 Oracle 11 开始,优化器能够进行自适应游标共享。这意味着优化器将尝试找出
WHERE firstname LIKE 'S%'
应该执行表扫描,而WHERE firstname LIKE 'Smithfield-Manning%'
应该执行表扫描索引扫描,并且会在共享池中维护同一条语句的多个查询计划。这解决了我们在早期版本中使用绑定变量查看所遇到的大部分问题。但即使在这里,优化器选择性估计的准确性对于中等长度的字符串通常也会出现问题。通常会知道单字符字符串的选择性非常弱,而 20 个字符的字符串选择性很高,但即使使用 256 个桶直方图,也不会提供有关诸如
之类的选择性如何的大量信息“Smit%”之类的名字实际上在哪里。它可能根据列直方图大致了解“Sm%”的选择性,但它相当盲目地猜测接下来两个字符的选择性。因此,大多数查询都能高效工作,但优化器确信
WHERE firstname LIKE 'Cave%'
的选择性不足以使用索引,这种情况并不罕见。假设这是一个常见查询,您可能需要考虑使用 Oracle 的计划稳定性功能来强制 Oracle 使用特定计划,而不管绑定变量的值如何。这可能意味着输入单个字符的用户必须等待比原本等待的时间更长的时间,因为索引扫描的效率远低于表扫描的效率。但对于其他正在搜索简短但相当独特的姓氏的用户来说,这可能是值得的。您可以执行诸如向查询添加 ROWNUM 限制器或向前端添加逻辑等操作,要求搜索框中的字符数最少,以避免出现表扫描效率更高的情况。
In addition to Mat's point that either index 2 or 3 should be redundant because you should choose one approach to doing case-insensitive searches and to Richard's point that it will depend on the selectivity of the index, be aware that there are additional concerns when you are using the LIKE clause.
Assuming you are using bind variables (which it sounds like you are based on your use of prepared statements), the optimizer has to guess at how selective the actual bind value is going to be. Something short like 'S%' is going to be very non-selective, causing the optimizer to generally prefer a table scan. A longer string like 'Smithfield-Manning%', on the other hand, is likely to be very selective and would likely use index 4. How Oracle handles this variability will depend on the version.
In Oracle 10, Oracle introduced bind variable peeking. This meant that the first time Oracle parsed a query after a reboot (or after the query plan being aged out of the shared pool), Oracle looked at the bind value and decided what plan to use based on that value. Assuming that most of your queries would benefit from the index scan because users are generally searching on relatively selective values, this was great if the first query after a reboot had a selective condition. But if you got unlucky and someone did a
WHERE firstname LIKE 'S%'
immediately after a reboot, you'd be stuck with the table scan query plan until the query plan was removed from the shared pool.Starting in Oracle 11, however, the optimizer has the ability to do adaptive cursor sharing. That means that the optimizer will try to figure out that
WHERE firstname LIKE 'S%'
should do a table scan andWHERE firstname LIKE 'Smithfield-Manning%'
should do an index scan and will maintain multiple query plans for the same statement in the shared pool. That solves most of the problems that we had with bind variable peeking in earlier versions.But even here, the accuracy of the optimizer's selectivity estimates are generally going to be problematic for medium-length strings. It's generally going to know that a single-character string is very weakly selective and that a 20 character string is highly selective but even with a 256 bucket histogram, it's not going to have a whole lot of information about how selective something like
WHERE firstname LIKE 'Smit%'
really is. It may know roughly how selective 'Sm%' is based on the column histogram but it's guessing rather blindly at how selective the next two characters are. So it's not uncommon to end up in a situation where most of the queries work efficiently but the optimizer is convinced thatWHERE firstname LIKE 'Cave%'
isn't selective enough to use an index.Assuming that this is a common query, you may want to consider using Oracle's plan stability features to force Oracle to use a particular plan regardless of the value of a bind variable. This may mean that users that enter a single character have to wait even longer than they would otherwise have waited because the index scan is substantially less efficient than doing a table scan. But that may be worth it for other users that are searching for short but reasonably distinctive last names. And you may do things like add a ROWNUM limiter to the query or add logic to the front end that requires a minimum number of characters in the search box to avoid situations where a table scan would be more efficient.
在同一个字段上同时具有基于函数的上索引和下索引有点奇怪。而且我认为优化器不会在您的查询中使用任何一个。
您应该选择其中之一(也可能删除最后一个),并且只查询大写(或小写)大小写,例如:
编辑:也看看这篇文章,有一些有趣的信息如何使用Oracle 10+ 中包含 NULL 的列上基于函数的索引?
It's a bit strange to have both the upper and lower function-based indexes on the same field. And I don't think the optimizer will use either in your query as it its.
You should pick one or the other (and probably drop the last one too), and only ever query on the upper (or lower)-case with something like:
Edit: look at this post too, has some interesting info How to use a function-based index on a column that contains NULLs in Oracle 10+?
它可能不会命中您的任何索引,因为您在 SELECT 子句中返回 ID,而索引未涵盖该 ID。
如果索引的选择性很强,并且 Oracle 认为仍然值得使用它来查找“MIKE%”,然后对数据执行查找以获取 ID 列,那么它可能会使用
4。索引(名字)
。仅当搜索的列使用索引中定义的确切函数时,才会使用 2 和 3。It may not hit any of your indexes, because you are returning ID in the SELECT clause, which is not covered by the indexes.
If the index is very selective, and Oracle decides it is still worthwhile using it to find 'MIKE%' then perform a lookup on the data to get the ID column, then it may use
4. Index(FirstName)
. 2 and 3 will only be used if the column searched uses the exact function defined in the index.