关于“ORDER BY”的性能调整和“喜欢”条款

发布于 2024-09-24 21:21:15 字数 7500 浏览 3 评论 0原文

我有2个表,其中有很多记录(假设TableA和TableB都有大约3,000,000条记录)。vr2_input是用户输入的varchar输入参数,我想获取最多200个最大的“dateField”的TableA记录,其stringField像' vr2_input' 。这两个表的连接如下:

select * from(   
   select * from 
        TableA join TableB on TableA.id = TableB.id
        where  TableA.stringField like 'vr2_input' || '%'
        order by  TableA.dateField desc   
) where rownum < 201

查询很慢,我仔细研究了一下,发现这是因为“like”和“order by”涉及全表扫描。但是,我找不到解决方案问题 。我该如何调优这种类型的 SQL?我已经在 TableA.stringField 和 TableA.dateField 上创建了索引,但是如何在 select 语句中使用索引功能?数据库是oracle 10g。非常感谢!!

更新:我使用 iddqd 的建议,只选择我想要的字段并运行解释计划。完成查询大约需要4分钟。 IX_TableA_stringField 是 TableA.srv_ref 字段的索引名称。我在没有提示的情况下再次运行解释计划,解释计划仍然得到相同的结果。

EXPLAIN PLAN FOR
    select * from(
         select   
                 /*+ INDEX(TableB IX_TableA_stringField)*/ 
                  TableA.id,
                    TableA.stringField,
                    TableA.dateField,
                    TableA.someField2,
                   TableA.someField3,
            TableB.someField1,
            TableB.someField2,
            TableB.someField3,
                    from TableA 
                    join TableB  on  TableA.id=TableB.id
                    WHERE TableA.stringField  like '21'||'%'  
                 order by TableA.dateField  desc
        ) where rownum < 201

PLAN_TABLE_OUTPUT                                                                                                                                                                                                                                                                                            
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
Plan hash value: 871807846                                                                                                                                                                                                                                                                    

--------------------------------------------------------------------------------------------------------                                                                                                                                                                           
| Id  | Operation                       | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |                                                                                                                                                                      
--------------------------------------------------------------------------------------------------------                                                                                                                                                                           
|   0 | SELECT STATEMENT                |                      |   200 | 24000 |  3293   (1)| 00:00:18 |                                                                                                                                                                   
|*  1 |  COUNT STOPKEY                  |                      |       |       |            |          |                                                                                                                                                                                   
|   2 |   VIEW                          |                      |  1397 |   163K|  3293   (1)| 00:00:18 |                                                                                                                                                                             
|*  3 |    SORT ORDER BY STOPKEY        |                      |  1397 | 90805 |  3293   (1)| 00:00:18 |                                                                                                                                                              
|   4 |     NESTED LOOPS                |                      |  1397 | 90805 |  3292   (1)| 00:00:18 |                                                                                                                                                                      
|   5 |      TABLE ACCESS BY INDEX ROWID| TableA       |  1397 | 41910 |   492   (1)| 00:00:03 |                                                                                                                                                 
|*  6 |       INDEX RANGE SCAN          | IX_TableA_stringField |  1397 |       |     6   (0)| 00:00:01 |                                                                                                                                                         
|   7 |      TABLE ACCESS BY INDEX ROWID| TableB      |     1 |    35 |     2   (0)| 00:00:01 |                                                                                                                                                      
|*  8 |       INDEX UNIQUE SCAN         | PK_TableB   |     1 |       |     1   (0)| 00:00:01 |                                                                                                                                                            
--------------------------------------------------------------------------------------------------------                                                                                                                                                                           

Predicate Information (identified by operation id):                                                                                                                                                                                                                                      
---------------------------------------------------                                                                                                                                                                                                                                             

   1 - filter(ROWNUM<201)                                                                                                                                                                                                                                                                      
   3 - filter(ROWNUM<201)                                                                                                                                                                                                                                                                      
   6 - access("TableA"."stringField" LIKE '21%')                                                                                                                                                                                                                                                 
       filter("TableA"."stringField" LIKE '21%')                                                                                                                                                                                                                                                     
   8 - access(TableA"."id"="TableB"."id")       

I have 2 tables which have many records (say both TableA and TableB has about 3,000,000 records).vr2_input is a varchar input parameters enter by the users and I want to get the most 200 largest "dateField" 's TableA records whose stringField like 'vr2_input' .The 2 tables are joined as the following:

select * from(   
   select * from 
        TableA join TableB on TableA.id = TableB.id
        where  TableA.stringField like 'vr2_input' || '%'
        order by  TableA.dateField desc   
) where rownum < 201

The query is slow , I goggled that and found out that it is because "like" and "order by" involves the full table scan .However , I cannot found a solution to solve the problem . How can I tune this type of SQL? I have already create an index on TableA.stringField and TableA.dateField but how can I use the index feature in the select statement? The database is oracle 10g. Thanks so much!!

Update : I use iddqd 's suggestion and only select the fields that I want and run the explain plan . It cost about 4 mins to finish the query . IX_TableA_stringField is the index name of the TableA.srv_ref field .I run again the explain plan without the hint , the explain plan still get the same result.

EXPLAIN PLAN FOR
    select * from(
         select   
                 /*+ INDEX(TableB IX_TableA_stringField)*/ 
                  TableA.id,
                    TableA.stringField,
                    TableA.dateField,
                    TableA.someField2,
                   TableA.someField3,
            TableB.someField1,
            TableB.someField2,
            TableB.someField3,
                    from TableA 
                    join TableB  on  TableA.id=TableB.id
                    WHERE TableA.stringField  like '21'||'%'  
                 order by TableA.dateField  desc
        ) where rownum < 201

PLAN_TABLE_OUTPUT                                                                                                                                                                                                                                                                                            
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ 
Plan hash value: 871807846                                                                                                                                                                                                                                                                    

--------------------------------------------------------------------------------------------------------                                                                                                                                                                           
| Id  | Operation                       | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |                                                                                                                                                                      
--------------------------------------------------------------------------------------------------------                                                                                                                                                                           
|   0 | SELECT STATEMENT                |                      |   200 | 24000 |  3293   (1)| 00:00:18 |                                                                                                                                                                   
|*  1 |  COUNT STOPKEY                  |                      |       |       |            |          |                                                                                                                                                                                   
|   2 |   VIEW                          |                      |  1397 |   163K|  3293   (1)| 00:00:18 |                                                                                                                                                                             
|*  3 |    SORT ORDER BY STOPKEY        |                      |  1397 | 90805 |  3293   (1)| 00:00:18 |                                                                                                                                                              
|   4 |     NESTED LOOPS                |                      |  1397 | 90805 |  3292   (1)| 00:00:18 |                                                                                                                                                                      
|   5 |      TABLE ACCESS BY INDEX ROWID| TableA       |  1397 | 41910 |   492   (1)| 00:00:03 |                                                                                                                                                 
|*  6 |       INDEX RANGE SCAN          | IX_TableA_stringField |  1397 |       |     6   (0)| 00:00:01 |                                                                                                                                                         
|   7 |      TABLE ACCESS BY INDEX ROWID| TableB      |     1 |    35 |     2   (0)| 00:00:01 |                                                                                                                                                      
|*  8 |       INDEX UNIQUE SCAN         | PK_TableB   |     1 |       |     1   (0)| 00:00:01 |                                                                                                                                                            
--------------------------------------------------------------------------------------------------------                                                                                                                                                                           

Predicate Information (identified by operation id):                                                                                                                                                                                                                                      
---------------------------------------------------                                                                                                                                                                                                                                             

   1 - filter(ROWNUM<201)                                                                                                                                                                                                                                                                      
   3 - filter(ROWNUM<201)                                                                                                                                                                                                                                                                      
   6 - access("TableA"."stringField" LIKE '21%')                                                                                                                                                                                                                                                 
       filter("TableA"."stringField" LIKE '21%')                                                                                                                                                                                                                                                     
   8 - access(TableA"."id"="TableB"."id")       

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(8

眼睛会笑 2024-10-01 21:21:15

您说运行查询大约需要 4 分钟。 EXPLAIN PLAN 输出显示估计时间为 18 秒。因此,在这种情况下,优化器可能与它的一些估计相差甚远。 (它仍然可能选择最好的计划,但也可能不是。)

在这种情况下,第一步是获取实际的执行计划和统计数据。使用提示 /*+ Gather_plan_statistics */ 运行查询,然后立即执行 select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'))

这将显示已运行的实际执行计划,并且对于每个步骤,它将显示估计行数、实际行数和实际花费的时间。在此处发布输出,也许我们可以对您的问题说一些更有意义的内容。

如果没有这些信息,我的建议是尝试以下查询重写。我相信它是等效的,因为 ID 似乎是 TableB 的主键。

select TableA.id,
       TableA.stringField,
       TableA.dateField,
       TableA.someField2,
       TableA.someField3,
       TableB.someField1,
       TableB.someField2,
       TableB.someField3,
  from (select * from(
         select   
                  TableA.id,
                    TableA.stringField,
                    TableA.dateField,
                    TableA.someField2,
                    TableA.someField3,
                    from TableA 
                    WHERE TableA.stringField  like '21'||'%'  
                 order by TableA.dateField  desc
          )
          where rownum < 201
       ) TableA
       join TableB  on  TableA.id=TableB.id

You say it's taking about 4 minutes to run the query. The EXPLAIN PLAN output shows an estimate of 18 seconds. So the optimizer is probably far off on some of its estimates in this case. (It could still be choosing the best possible plan, but maybe not.)

The first step in a case like this is to get the actual execution plan and statistics. Run your query with the hint /*+ gather_plan_statistics */, then immediately afterwards execute select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')).

This will show the actual execution plan that was run, and for each step it will show the estimated rows, actual rows, and actual time taken. Post the output here and maybe we can say something more meaningful about your issue.

Without that information, my suggestion is to try out the following rewrite of the query. I believe it is equivalent since it appears that ID is the primary key of TableB.

select TableA.id,
       TableA.stringField,
       TableA.dateField,
       TableA.someField2,
       TableA.someField3,
       TableB.someField1,
       TableB.someField2,
       TableB.someField3,
  from (select * from(
         select   
                  TableA.id,
                    TableA.stringField,
                    TableA.dateField,
                    TableA.someField2,
                    TableA.someField3,
                    from TableA 
                    WHERE TableA.stringField  like '21'||'%'  
                 order by TableA.dateField  desc
          )
          where rownum < 201
       ) TableA
       join TableB  on  TableA.id=TableB.id
夜清冷一曲。 2024-10-01 21:21:15

是否需要选择所有列 (*)?如果您选择所有列,优化器将更有可能进行全面扫描。如果您需要输出中的所有列,您最好在内联视图中选择 id,然后返回选择其他列,这可以通过索引查找来完成。尝试对这两种情况运行解释计划以查看优化器正在做什么。

Do you need to select all columns (*)? The optimizer will be more likely to full scan if you select all columns. If you need all columns in output you may be better to select the id in your inline view and then join back to select other columns, which could be done with an index lookup. Try running an explain plan for both cases to see what the optimizer is doing.

埋情葬爱 2024-10-01 21:21:15

在 stringField 和 dateField 列上创建索引。 SQL 引擎自动使用它们。

Create indexes on the stringField and dateField columns. The SQL engine uses them automatically.

烟凡古楼 2024-10-01 21:21:15
select id from(   
   select /*+ INDEX(TableB stringField_indx)*/ TableB.id from 
        TableA join TableB on TableA.id = TableB.id
        where  TableA.stringField like 'vr2_input' || '%'
        order by  TableA.dateField desc   
) where rownum < 201

next:

SELECT * FROM TableB WHERE id iN( id from first query)

请发送该表的统计数据和 DDL。

select id from(   
   select /*+ INDEX(TableB stringField_indx)*/ TableB.id from 
        TableA join TableB on TableA.id = TableB.id
        where  TableA.stringField like 'vr2_input' || '%'
        order by  TableA.dateField desc   
) where rownum < 201

next:

SELECT * FROM TableB WHERE id iN( id from first query)

Please send stats and DDL of this tables.

何以笙箫默 2024-10-01 21:21:15

如果您有足够的内存,您可以提示查询使用散列连接。请附上解释计划好吗

If you have enough memory you can hint the query to use hash join. Could you please attach the explain plan

心的憧憬 2024-10-01 21:21:15

如果表 A 是较小的表,则表 A 有多少条记录,您可以在该表上进行选择,然后循环检索表 B 记录的结果,因为选择和排序都在表 A 上。

一个很好的实验是删除连接并测试速度,如果允许,您可以将 rownum < 201 作为主查询上的 AND 子句。目前查询很可能将所有行返回到外部查询,然后它被修剪?

How many records does Table A has if it's the smaller table could you do the select on that table and then loop though the results retrieving the Table B records, as both the select and the sort are on TableA.

A good experiment would be to remove the join and test the speed on that also if allowed can you put the rownum < 201 as an AND clause on the main query. It's probable at the moment that the query is returning all rows to the outer query and then it's getting trimmed?

偷得浮生 2024-10-01 21:21:15

要优化 like 谓词,您可以创建上下文索引并使用 contains 子句。

请看: http://docs.oracle.com/cd/B28359_01 /text.111/b28303/ind.htm

谢谢

To optimize the like predicate, you can create a contextual index and use contains clause.

Look: http://docs.oracle.com/cd/B28359_01/text.111/b28303/ind.htm

Thanks

め可乐爱微笑 2024-10-01 21:21:15

您可以在 tableA 上创建一个函数索引。这将根据 TableA.stringField 的条件返回 1 或 0,如 'vr2_input' || '%' 是否满足。该索引将使查询运行得更快。该函数的逻辑将是

if (substr(TableA.stringField, 1, 9) = 'vr2_input'
THEN 
    return 1;
else 
    return 0;

使用实际的列名称而不是“*”可能会有所帮助。至少应该删除常见的列名称。

You can create one function index on tableA. That will return 1 or 0 based on the condition TableA.stringField like 'vr2_input' || '%' is satisfied or not. That index will make query run faster. The logic of the function will be

if (substr(TableA.stringField, 1, 9) = 'vr2_input'
THEN 
    return 1;
else 
    return 0;

Using actual column names instead of "*" may help. At least common column names should be removed.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文