SQL 查询进行全表扫描而不是基于索引的扫描

发布于 2024-10-21 18:28:16 字数 1423 浏览 4 评论 0原文

我有两个表:

create table big( id number, name varchar2(100));
insert into big(id, name) select rownum, object_name from all_objects;

create table small as select id from big where rownum < 10;
create index big_index on big(id);

在这些表上,如果我执行以下查询:

select * 
  from big_table 
 where id like '45%' 
    or id in ( select id from small_table);

它总是进行全表扫描。

Execution Plan
----------------------------------------------------------
Plan hash value: 2290496975
----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |  3737 | 97162 |    85   (3)| 00:00:02 |
|*  1 |  FILTER            |       |       |       |            |          |
|   2 |   TABLE ACCESS FULL| BIG   | 74718 |  1897K|    85   (3)| 00:00:02 |
|*  3 |   TABLE ACCESS FULL| SMALL |     1 |     4 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ID"=45 OR  EXISTS (SELECT /*+ */ 0 FROM "SMALL" "SMALL"

          WHERE "ID"=:B1))

3 - filter("ID"=:B1)

有什么方法可以重写查询,使其始终进行索引扫描。

I have two tables:

create table big( id number, name varchar2(100));
insert into big(id, name) select rownum, object_name from all_objects;

create table small as select id from big where rownum < 10;
create index big_index on big(id);

On these tables if I execute the following query:

select * 
  from big_table 
 where id like '45%' 
    or id in ( select id from small_table);

it always goes for a Full Table Scan.

Execution Plan
----------------------------------------------------------
Plan hash value: 2290496975
----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |  3737 | 97162 |    85   (3)| 00:00:02 |
|*  1 |  FILTER            |       |       |       |            |          |
|   2 |   TABLE ACCESS FULL| BIG   | 74718 |  1897K|    85   (3)| 00:00:02 |
|*  3 |   TABLE ACCESS FULL| SMALL |     1 |     4 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ID"=45 OR  EXISTS (SELECT /*+ */ 0 FROM "SMALL" "SMALL"

          WHERE "ID"=:B1))

3 - filter("ID"=:B1)

Are there any ways in which we can rewrite the Query So that it always goes for index Scan.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

窝囊感情。 2024-10-28 18:28:16

不,不,不。

您不希望它使用索引。幸运的是,甲骨文比这更聪明。

ID 是数字。虽然它的 ID 值可能为 45,450,451,452,4501,45004,4500003 等,但在索引中这些值将分散在任何地方。如果您采用 ID BETWEEN 450 AND 459 等条件,那么可能值得使用索引。

要使用索引,必须从上到下扫描它(将每个 ID 转换为字符以进行 LIKE 比较)。然后,对于任何匹配,都必须获取 NAME 列。

它决定扫描表更容易、更快捷(表有 75,000 行,无论如何也不算大),而不是在索引和表之间来回移动。

No, no and no.

You do NOT want it to use an index. Luckily Oracle is smarter than that.

ID is numeric. While it might have ID values of 45,450,451,452,4501,45004,4500003 etc, in the indexes these values will be scattered anywhere and everywhere. If you went with a condition such as ID BETWEEN 450 AND 459, then it may be worth using the index.

To use the index it would have to scan it all the way from top to bottom (converting each ID to a character to do the LIKE comparison). Then, for any match, it has to go off to get the NAME column.

It has decided that it is easier to and quicker to scan the table (which, with 75,000 rows isn't that big anyway) rather than mucking about going back and forth between the index and the table.

时光磨忆 2024-10-28 18:28:16

其他人是对的,你不应该使用这样的数字列。

然而,在这种情况下,实际上是 OR 构造导致了(性能)问题。我不知道版本 11 中是否有所不同,但到版本 10gr2 为止,它会导致过滤操作,基本上是带有相关子查询的嵌套循环。在您的情况下,使用数字列作为 varchar 也会导致全表扫描。

您可以像这样重写您的查询:

select *
  from big
 where id like '45%'
union all
select *
  from big
  join small using(id)
 where id not like '45%';

通过您的测试用例,我最终得到的行数为 174000 行(大行)和 9 行(小行)。
运行查询需要 7 秒,有 1211399 次一致获取。
运行我的查询 0.7 秒并使用 542 次一致获取。

我的查询的解释计划是:

 --------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Cost (%CPU)|
---------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |  8604 |   154   (6)|
|   1 |  UNION-ALL                    |        |       |            |
|*  2 |   TABLE ACCESS FULL           | BIG    |  8603 |   151   (4)|
|   3 |   NESTED LOOPS                |        |     1 |     3   (0)|
|*  4 |    TABLE ACCESS FULL          | SMALL  |     1 |     3   (0)|
|   5 |    TABLE ACCESS BY INDEX ROWID| BIG    |     1 |     0   (0)|
|*  6 |     INDEX UNIQUE SCAN         | BIG_PK |     1 |     0   (0)|
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_CHAR("ID") LIKE '45%')
   4 - filter(TO_CHAR("SMALL"."ID") NOT LIKE '45%')
   6 - access("BIG"."ID"="SMALL"."ID")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        542  consistent gets
          0  physical reads
          0  redo size
      33476  bytes sent via SQL*Net to client
        753  bytes received via SQL*Net from client
         76  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1120  rows processed

The others are right, you shouldn't use a numeric column like that.

However, it is actually, the OR <subquery> construct that is causing a (performance) problem in this case. I don't know if it is different in version 11, but up to version 10gr2, it causes a a filter operation with what is basically a nested loop with a correlated subquery. In your case, the use of a numeric column as a varchar also results in a full table scan.

You can rewrite your query like this:

select *
  from big
 where id like '45%'
union all
select *
  from big
  join small using(id)
 where id not like '45%';

With your test case, I end up with a row count of 174000 rows in big and 9 small.
Running your query takes 7 seconds with 1211399 consistent gets.
Running my query 0,7 seconds and uses 542 consistent gets.

The explain plans for my query is:

 --------------------------------------------------------------------
| Id  | Operation                     | Name   | Rows  | Cost (%CPU)|
---------------------------------------------------------------------
|   0 | SELECT STATEMENT              |        |  8604 |   154   (6)|
|   1 |  UNION-ALL                    |        |       |            |
|*  2 |   TABLE ACCESS FULL           | BIG    |  8603 |   151   (4)|
|   3 |   NESTED LOOPS                |        |     1 |     3   (0)|
|*  4 |    TABLE ACCESS FULL          | SMALL  |     1 |     3   (0)|
|   5 |    TABLE ACCESS BY INDEX ROWID| BIG    |     1 |     0   (0)|
|*  6 |     INDEX UNIQUE SCAN         | BIG_PK |     1 |     0   (0)|
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_CHAR("ID") LIKE '45%')
   4 - filter(TO_CHAR("SMALL"."ID") NOT LIKE '45%')
   6 - access("BIG"."ID"="SMALL"."ID")


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        542  consistent gets
          0  physical reads
          0  redo size
      33476  bytes sent via SQL*Net to client
        753  bytes received via SQL*Net from client
         76  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1120  rows processed
魂归处 2024-10-28 18:28:16

像这样的事情可能会起作用:

select * 
  from big_table big
 where id like '45%' 
    or exists ( select id from small_table where id = big.id);

Something like this might work:

select * 
  from big_table big
 where id like '45%' 
    or exists ( select id from small_table where id = big.id);
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文