如何获得 COUNT(col) ... GROUP BY 来使用索引?

发布于 2024-08-30 22:18:39 字数 549 浏览 2 评论 0原文

我有一个表(col1,col2,...),其索引为(col1,col2,...)。该表中有数百万行,我想运行一个查询:

 SELECT col1, COUNT(col2) WHERE col1 NOT IN (<couple of exclusions>) GROUP BY col1

不幸的是,这会导致对该表进行全表扫描,这需要一分钟以上的时间。有什么方法可以让oracle使用列上的索引来更快地返回结果吗?

编辑:

更具体地说,我正在运行以下查询:

SELECT owner, COUNT(object_name) FROM all_objects GROUP BY owner

SYS.OBJ$ (SYS.I_OBJ2) 上有一个索引,其中索引 owner#name 列;我相信我应该能够在查询中使用这个索引,而不是对 SYS.OBJ$ 进行全表扫描

I've got a table (col1, col2, ...) with an index on (col1, col2, ...). The table has got millions of rows in it, and I want to run a query:

 SELECT col1, COUNT(col2) WHERE col1 NOT IN (<couple of exclusions>) GROUP BY col1

Unfortunately, this is resulting in a full table scan of the table, which takes upwards of a minute. Is there any way of getting oracle to use the index on the columns to return the results much faster?

EDIT:

more specifically, I'm running the following query:

SELECT owner, COUNT(object_name) FROM all_objects GROUP BY owner

and there is an index on SYS.OBJ$ (SYS.I_OBJ2) which indexes the owner# and name columns; I believe I should be able to use this index in the query, rather than a full table scan of SYS.OBJ$

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

尽揽少女心 2024-09-06 22:18:39

我有机会尝试一下这个,我之前关于 NOT IN 的评论在这种情况下是转移注意力的。关键在于 NULL 的存在,或者更确切地说,索引列是否强制执行 NOT NULL 约束。

这将取决于您使用的数据库的版本,因为优化器随着每个版本的发布而变得更加智能。我使用的是 11gR1,优化器在所有情况下都使用索引,除了一种情况:当两列都为空并且我没有包含 NOT IN 子句时:

SQL> desc big_table
 Name                                  Null?    Type
 -----------------------------------  ------    -------------------
 ID                                             NUMBER
 COL1                                           NUMBER
 COL2                                           VARCHAR2(30 CHAR)
 COL3                                           DATE
 COL4                                           NUMBER

没有 NOT IN 子句...

SQL> explain plan for
  2      select col4, count(col1) from big_table
  3      group by col4
  4  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1753714399

----------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 31964 |   280K|       |  7574   (2)| 00:01:31 |
|   1 |  HASH GROUP BY     |           | 31964 |   280K|    45M|  7574   (2)| 00:01:31 |
|   2 |   TABLE ACCESS FULL| BIG_TABLE |  2340K|    20M|       |  4284   (1)| 00:00:52 |
----------------------------------------------------------------------------------------

9 rows selected.


SQL>

当我取消了 NOT IN 子句后,优化器选择使用索引。诡异的。

SQL> explain plan for
  2      select col4, count(col1) from big_table
  3      where col1 not in (12, 19)
  4      group by col4
  5  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 343952376

----------------------------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        | 31964 |   280K|       |  5057   (3)| 00:01:01 |
|   1 |  HASH GROUP BY        |        | 31964 |   280K|    45M|  5057   (3)| 00:01:01 |
|*  2 |   INDEX FAST FULL SCAN| BIG_I2 |  2340K|    20M|       |  1767   (2)| 00:00:22 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------

   2 - filter("COL1"<>12 AND "COL1"<>19)

14 rows selected.

SQL>

重复一下,在所有其他情况下,只要索引列之一被声明为非空,索引就会用于满足查询。对于早期版本的 Oracle 来说,情况可能并非如此,但它可能指明了前进的方向。

I have had the chance to play around with this, and my previous comments regarding the NOT IN are a red herring in this case. The key thing is the presence of NULLs, or rather whether the indexed columns have NOT NULL constraints enforced.

This is going to depend on the version of the database you're using, because the optimizer gets smarter with each release. I'm using 11gR1 and the optimizer used the index in all cases except one: when both columns were null and I didn't include the NOT IN clause:

SQL> desc big_table
 Name                                  Null?    Type
 -----------------------------------  ------    -------------------
 ID                                             NUMBER
 COL1                                           NUMBER
 COL2                                           VARCHAR2(30 CHAR)
 COL3                                           DATE
 COL4                                           NUMBER

Without the NOT IN clause...

SQL> explain plan for
  2      select col4, count(col1) from big_table
  3      group by col4
  4  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1753714399

----------------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 31964 |   280K|       |  7574   (2)| 00:01:31 |
|   1 |  HASH GROUP BY     |           | 31964 |   280K|    45M|  7574   (2)| 00:01:31 |
|   2 |   TABLE ACCESS FULL| BIG_TABLE |  2340K|    20M|       |  4284   (1)| 00:00:52 |
----------------------------------------------------------------------------------------

9 rows selected.


SQL>

When I dobbed the NOT IN clause back in, the optimizer opted to use the index. Weird.

SQL> explain plan for
  2      select col4, count(col1) from big_table
  3      where col1 not in (12, 19)
  4      group by col4
  5  /

Explained.

SQL> select * from table(dbms_xplan.display)
  2  /

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 343952376

----------------------------------------------------------------------------------------
| Id  | Operation             | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |        | 31964 |   280K|       |  5057   (3)| 00:01:01 |
|   1 |  HASH GROUP BY        |        | 31964 |   280K|    45M|  5057   (3)| 00:01:01 |
|*  2 |   INDEX FAST FULL SCAN| BIG_I2 |  2340K|    20M|       |  1767   (2)| 00:00:22 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------

   2 - filter("COL1"<>12 AND "COL1"<>19)

14 rows selected.

SQL>

Just to repeat, in all other cases, as long as one of the indexed columns was declared not nill, the index was used to satisfy the query. This may not be true on earlier versions of Oracle, but it probably points the way forward.

零度℉ 2024-09-06 22:18:39

您可以使用提示 http://download.oracle .com/docs/cd/B10501_01/server.920/a96533/hintsref.htm
但请记住,使用索引可能并不总能带来更快的执行速度。

you could use a hint http://download.oracle.com/docs/cd/B10501_01/server.920/a96533/hintsref.htm ,
but remember that using an index might not always result in faster execution.

白日梦 2024-09-06 22:18:39

(以防万一,您确定它正在执行表扫描而不是索引扫描吗?)

尝试使用 COUNT(*) 而不是 COUNT(col2) (假设这是当然适合你的问题)。另外,也许可以尝试仅使用 col1 建立索引。

(Just in case, are you sure it's doing a table scan and not an index scan?)

Try using COUNT(*) instead of COUNT(col2) (assuming this is appropriate for you problem, of course). Also, maybe try an index with just col1.

北凤男飞 2024-09-06 22:18:39

您正在查询 Oracle 的固定表,因为您没有说明这是哪个数据库版本,所以我假设是最近的一个。固定表是否经过分析并更新了统计数据?您是否通过使用 /*+ 规则 */ 提示尝试使用规则库优化器进行查询。我经常看到,当使用规则库优化器时,针对 Oracle 自己的固定表的查询性能更好。

You are querying against oracle's fixed tables, since you've not stated which db vesion this is, I'll assume a recent one. Have the fixed tables been analyzed and have updated statistics? Have you tried your query using the rule base optimizer by the use of the /*+ rule */ hint. Often I've seen that queries against oracle's own fixed tables perform better when the rule base optimizer is used.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文