如何获得 COUNT(col) ... GROUP BY 来使用索引？

发布于 2024-08-30 22:18:39 字数 549 浏览 2 评论 0原文

我有一个表（col1，col2，...），其索引为（col1，col2，...）。该表中有数百万行，我想运行一个查询：

 SELECT col1, COUNT(col2) WHERE col1 NOT IN (<couple of exclusions>) GROUP BY col1

不幸的是，这会导致对该表进行全表扫描，这需要一分钟以上的时间。有什么方法可以让oracle使用列上的索引来更快地返回结果吗？

编辑：

更具体地说，我正在运行以下查询：

SELECT owner, COUNT(object_name) FROM all_objects GROUP BY owner

SYS.OBJ$ (SYS.I_OBJ2) 上有一个索引，其中索引 owner# 和 name 列；我相信我应该能够在查询中使用这个索引，而不是对 SYS.OBJ$ 进行全表扫描

原文

I've got a table (col1, col2, ...) with an index on (col1, col2, ...). The table has got millions of rows in it, and I want to run a query:

 SELECT col1, COUNT(col2) WHERE col1 NOT IN (<couple of exclusions>) GROUP BY col1

Unfortunately, this is resulting in a full table scan of the table, which takes upwards of a minute. Is there any way of getting oracle to use the index on the columns to return the results much faster?

EDIT:

more specifically, I'm running the following query:

SELECT owner, COUNT(object_name) FROM all_objects GROUP BY owner

and there is an index on SYS.OBJ$ (SYS.I_OBJ2) which indexes the owner# and name columns; I believe I should be able to use this index in the query, rather than a full table scan of SYS.OBJ$

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

尽揽少女心 2024-09-06 22:18:39

我有机会尝试一下这个，我之前关于 NOT IN 的评论在这种情况下是转移注意力的。关键在于 NULL 的存在，或者更确切地说，索引列是否强制执行 NOT NULL 约束。

这将取决于您使用的数据库的版本，因为优化器随着每个版本的发布而变得更加智能。我使用的是 11gR1，优化器在所有情况下都使用索引，除了一种情况：当两列都为空并且我没有包含 NOT IN 子句时：

SQL> desc big_table
 Name                                  Null?    Type
 -----------------------------------  ------    -------------------
 ID                                             NUMBER
 COL1                                           NUMBER
 COL2                                           VARCHAR2(30 CHAR)
 COL3                                           DATE
 COL4                                           NUMBER

没有 NOT IN 子句...

SQL> explain plan for
  2      select col4, count(col1) from big_table
  3      group by col4
  4  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1753714399

----------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 31964 |   280K|       |  7574   (2)| 00:01:31 |
|   1 |  HASH GROUP BY     |           | 31964 |   280K|    45M|  7574   (2)| 00:01:31 |
|   2 |   TABLE ACCESS FULL| BIG_TABLE |  2340K|    20M|       |  4284   (1)| 00:00:52 |
----------------------------------------------------------------------------------------

9 rows selected.


SQL>

当我取消了 NOT IN 子句后，优化器选择使用索引。诡异的。

SQL> explain plan for
  2      select col4, count(col1) from big_table
  3      where col1 not in (12, 19)
  4      group by col4
  5  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 343952376

----------------------------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        | 31964 |   280K|       |  5057   (3)| 00:01:01 |
|   1 |  HASH GROUP BY        |        | 31964 |   280K|    45M|  5057   (3)| 00:01:01 |
|*  2 |   INDEX FAST FULL SCAN| BIG_I2 |  2340K|    20M|       |  1767   (2)| 00:00:22 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------

   2 - filter("COL1"<>12 AND "COL1"<>19)

14 rows selected.

SQL>

重复一下，在所有其他情况下，只要索引列之一被声明为非空，索引就会用于满足查询。对于早期版本的 Oracle 来说，情况可能并非如此，但它可能指明了前进的方向。

I have had the chance to play around with this, and my previous comments regarding the NOT IN are a red herring in this case. The key thing is the presence of NULLs, or rather whether the indexed columns have NOT NULL constraints enforced.

This is going to depend on the version of the database you're using, because the optimizer gets smarter with each release. I'm using 11gR1 and the optimizer used the index in all cases except one: when both columns were null and I didn't include the NOT IN clause:

SQL> desc big_table
 Name                                  Null?    Type
 -----------------------------------  ------    -------------------
 ID                                             NUMBER
 COL1                                           NUMBER
 COL2                                           VARCHAR2(30 CHAR)
 COL3                                           DATE
 COL4                                           NUMBER

Without the NOT IN clause...

SQL> explain plan for
  2      select col4, count(col1) from big_table
  3      group by col4
  4  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1753714399

----------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 31964 |   280K|       |  7574   (2)| 00:01:31 |
|   1 |  HASH GROUP BY     |           | 31964 |   280K|    45M|  7574   (2)| 00:01:31 |
|   2 |   TABLE ACCESS FULL| BIG_TABLE |  2340K|    20M|       |  4284   (1)| 00:00:52 |
----------------------------------------------------------------------------------------

9 rows selected.


SQL>

When I dobbed the NOT IN clause back in, the optimizer opted to use the index. Weird.

SQL> explain plan for
  2      select col4, count(col1) from big_table
  3      where col1 not in (12, 19)
  4      group by col4
  5  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 343952376

----------------------------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        | 31964 |   280K|       |  5057   (3)| 00:01:01 |
|   1 |  HASH GROUP BY        |        | 31964 |   280K|    45M|  5057   (3)| 00:01:01 |
|*  2 |   INDEX FAST FULL SCAN| BIG_I2 |  2340K|    20M|       |  1767   (2)| 00:00:22 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------

   2 - filter("COL1"<>12 AND "COL1"<>19)

14 rows selected.

SQL>

Just to repeat, in all other cases, as long as one of the indexed columns was declared not nill, the index was used to satisfy the query. This may not be true on earlier versions of Oracle, but it probably points the way forward.

回复收藏 0 原文