为什么这个 oracle select 语句需要几分钟才能完成?
这些表的字段少于 20 个,属性大约有 900 万行,而列表有 300 万行,但这应该不是问题。这就是数据库的用途...
listing_ids 的类型为 Number。到目前为止,我们最好的猜测是,由于属性表中还有 600 万行,其 Listingid 实际上并不指向列表,因此 Oracle 花费了大量时间来查找不存在的列表。这还有道理吗?
Select count(*)
from listings.rfs_listings listings
join listings.rfs_properties properties
on listings.listing_id= properties.listing_id
where listings.display = 1
and properties.city= 'New York'
and rownum <= 10;
我对查询运行了一个解释计划并得到了以下信息:
PLAN_TABLE_OUTPUT
Plan hash value: 772088252
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 4110 | 283 (0)| 00:00:04 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 10 | 4110 | 283 (0)| 00:00:04 |
|* 4 | TABLE ACCESS FULL | RFS_LISTINGS | 140 | 2940 | 3 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | RFS_PROPERTIES_LD730_UNIQUE | 1 | | 1 (0)| 00:00:01 |
|* 6 | TABLE ACCESS BY INDEX ROWID| RFS_PROPERTIES | 1 | 390 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=10)
4 - filter("LISTINGS"."DISPLAY"=1)
5 - access("LISTINGS"."LISTING_ID"="PROPERTIES"."LISTING_ID")
6 - filter(NLSSORT("PROPERTIES"."CITY",'nls_sort=''BINARY_CI''')=HEXTORAW('6E657720796F726
B00') )
编辑:表模式:
rfs.rfs_listings:
Name Null? Type
------------------------------- -------- ------------------
DETAIL_ID NOT NULL NUMBER
LISTING_ID NOT NULL NUMBER
DETAIL_CHECKSUM NOT NULL VARCHAR2(32 CHAR)
PRICE NUMBER
IN_CONTRACT NOT NULL NUMBER
CREATED NOT NULL DATE
PROPERTY_WEB_ID NOT NULL VARCHAR2(100)
SOURCE_ID NOT NULL NUMBER
LISTING_CREATED NOT NULL DATE
ARCHIVE_NAME VARCHAR2(100)
DUPLICATES_GROUP_ID NUMBER
FILENAME VARCHAR2(80)
DISPLAY NUMBER
rfs.rfs_properties:
Name Null? Type
------------------------------- -------- -----------------
PROPERTY_ID NOT NULL NUMBER
LISTING_ID NOT NULL NUMBER
BLDG_PROPKEY NUMBER
UNIT_PROPKEY NUMBER
ADDRESSKEY NUMBER
HOUSE_NUMBER VARCHAR2(32)
STREET_ADDRESS VARCHAR2(200)
UNIT_NUMBER VARCHAR2(32)
UNIT_NUMBER_PARSED VARCHAR2(16 CHAR)
PARSED_ADDRESS VARCHAR2(255)
DISPLAY_ADDRESS VARCHAR2(150)
CROSS_STREET VARCHAR2(200)
NEIGHBORHOOD VARCHAR2(150)
NEIGHBORHOOD_CODE NUMBER
NEIGHBORHOOD_REG_CODE NUMBER
SCHOOL_DISTRICT VARCHAR2(100)
BOROUGH_CITY VARCHAR2(100 CHAR)
METRO_AREA VARCHAR2(100 CHAR)
ZIP_CODE NUMBER
COUNTY NUMBER
STATE VARCHAR2(4 CHAR)
ROOMS NUMBER
BEDROOMS NUMBER
BATHROOMS NUMBER
SQFT NUMBER
LOT_SIZE NUMBER
STUDIO NUMBER
LOFT NUMBER
MAINT_CC NUMBER
RE_TAX NUMBER
PROPERTY_TYPE_ID NUMBER
PROPERTY_TYPE VARCHAR2(255)
BLDG_NAME VARCHAR2(255)
BLDG_TYPE VARCHAR2(32 CHAR)
BLDG_NEW_DEVEL NUMBER
BLDG_FEATURES VARCHAR2(256)
MANUALLY_BLDG_FEAT VARCHAR2(255)
APT_FEATURES VARCHAR2(256)
OUTDOOR_SPACE VARCHAR2(32 CHAR)
YEAR_BUILT NUMBER
LISTING_RANK NOT NULL NUMBER
LOCATION_CHECKED NUMBER
PROPKEY_SOURCE VARCHAR2(15)
WEB_BUG_URL VARCHAR2(255)
EMAIL_LEAD_GENERATION VARCHAR2(100)
LISTING_URL VARCHAR2(255)
BROKER_NAME VARCHAR2(100)
BROKER_URL VARCHAR2(256)
LISTING_TEXT CLOB
IS_UPDATED NUMBER
CENTROID_X NUMBER(20,10)
CENTROID_Y NUMBER(20,10)
CENTROID MDSYS.SDO_GEOMETRY
COUNTY_GEO_ID NUMBER
CX NUMBER
CY NUMBER
更新 oracle 统计信息后的一些更新的统计信息:
统计信息
31 recursive calls
2 db block gets
63053 consistent gets
15474 physical reads
0 redo size
2890 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed
更新统计信息后的新执行计划:
Execution Plan
----------------------------------------------------------
Plan hash value: 3213592672
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 4110 | 236 (0)| 00:00:03 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 10 | 4110 | 236 (0)| 00:00:03 |
| 4 | TABLE ACCESS BY INDEX ROWID| RFS_LISTINGS | 224 | 4704 | 11 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | RFS_LISTINGS_DISPLAY | | | 3 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | RFS_PROPERTIES_LD730_UNIQUE | 1 | | 1 (0)| 00:00:01 |
|* 7 | TABLE ACCESS BY INDEX ROWID | RFS_PROPERTIES | 1 | 390 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=10)
5 - access("LISTINGS"."DISPLAY"=1)
6 - access("LISTINGS"."LISTING_ID"="PROPERTIES"."LISTING_ID")
7 - filter(NLSSORT("PROPERTIES"."BOROUGH_CITY",'nls_sort=''BINARY_CI''')=HEXTORAW('6E657720796F726B
00') )
These tables have less than 20 fields, properties has about 9 million rows while listings has 3 million rows, but that should not be a problem. This is what databases are for...
the listing_ids are of type Number. Our best guess so far is that because the properties table has 6 million more rows with listingids that dont actually point to a listing, Oracle spends a lot of time looking for listings that dont exist. Does that even make sense?
Select count(*)
from listings.rfs_listings listings
join listings.rfs_properties properties
on listings.listing_id= properties.listing_id
where listings.display = 1
and properties.city= 'New York'
and rownum <= 10;
I ran an explain plan on the query and got the following info:
PLAN_TABLE_OUTPUT
Plan hash value: 772088252
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 4110 | 283 (0)| 00:00:04 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 10 | 4110 | 283 (0)| 00:00:04 |
|* 4 | TABLE ACCESS FULL | RFS_LISTINGS | 140 | 2940 | 3 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | RFS_PROPERTIES_LD730_UNIQUE | 1 | | 1 (0)| 00:00:01 |
|* 6 | TABLE ACCESS BY INDEX ROWID| RFS_PROPERTIES | 1 | 390 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=10)
4 - filter("LISTINGS"."DISPLAY"=1)
5 - access("LISTINGS"."LISTING_ID"="PROPERTIES"."LISTING_ID")
6 - filter(NLSSORT("PROPERTIES"."CITY",'nls_sort=''BINARY_CI''')=HEXTORAW('6E657720796F726
B00') )
EDIT: Table Schemas:
rfs.rfs_listings:
Name Null? Type
------------------------------- -------- ------------------
DETAIL_ID NOT NULL NUMBER
LISTING_ID NOT NULL NUMBER
DETAIL_CHECKSUM NOT NULL VARCHAR2(32 CHAR)
PRICE NUMBER
IN_CONTRACT NOT NULL NUMBER
CREATED NOT NULL DATE
PROPERTY_WEB_ID NOT NULL VARCHAR2(100)
SOURCE_ID NOT NULL NUMBER
LISTING_CREATED NOT NULL DATE
ARCHIVE_NAME VARCHAR2(100)
DUPLICATES_GROUP_ID NUMBER
FILENAME VARCHAR2(80)
DISPLAY NUMBER
rfs.rfs_properties:
Name Null? Type
------------------------------- -------- -----------------
PROPERTY_ID NOT NULL NUMBER
LISTING_ID NOT NULL NUMBER
BLDG_PROPKEY NUMBER
UNIT_PROPKEY NUMBER
ADDRESSKEY NUMBER
HOUSE_NUMBER VARCHAR2(32)
STREET_ADDRESS VARCHAR2(200)
UNIT_NUMBER VARCHAR2(32)
UNIT_NUMBER_PARSED VARCHAR2(16 CHAR)
PARSED_ADDRESS VARCHAR2(255)
DISPLAY_ADDRESS VARCHAR2(150)
CROSS_STREET VARCHAR2(200)
NEIGHBORHOOD VARCHAR2(150)
NEIGHBORHOOD_CODE NUMBER
NEIGHBORHOOD_REG_CODE NUMBER
SCHOOL_DISTRICT VARCHAR2(100)
BOROUGH_CITY VARCHAR2(100 CHAR)
METRO_AREA VARCHAR2(100 CHAR)
ZIP_CODE NUMBER
COUNTY NUMBER
STATE VARCHAR2(4 CHAR)
ROOMS NUMBER
BEDROOMS NUMBER
BATHROOMS NUMBER
SQFT NUMBER
LOT_SIZE NUMBER
STUDIO NUMBER
LOFT NUMBER
MAINT_CC NUMBER
RE_TAX NUMBER
PROPERTY_TYPE_ID NUMBER
PROPERTY_TYPE VARCHAR2(255)
BLDG_NAME VARCHAR2(255)
BLDG_TYPE VARCHAR2(32 CHAR)
BLDG_NEW_DEVEL NUMBER
BLDG_FEATURES VARCHAR2(256)
MANUALLY_BLDG_FEAT VARCHAR2(255)
APT_FEATURES VARCHAR2(256)
OUTDOOR_SPACE VARCHAR2(32 CHAR)
YEAR_BUILT NUMBER
LISTING_RANK NOT NULL NUMBER
LOCATION_CHECKED NUMBER
PROPKEY_SOURCE VARCHAR2(15)
WEB_BUG_URL VARCHAR2(255)
EMAIL_LEAD_GENERATION VARCHAR2(100)
LISTING_URL VARCHAR2(255)
BROKER_NAME VARCHAR2(100)
BROKER_URL VARCHAR2(256)
LISTING_TEXT CLOB
IS_UPDATED NUMBER
CENTROID_X NUMBER(20,10)
CENTROID_Y NUMBER(20,10)
CENTROID MDSYS.SDO_GEOMETRY
COUNTY_GEO_ID NUMBER
CX NUMBER
CY NUMBER
Some updated Statistics after updating oracle stats:
Statistics
31 recursive calls
2 db block gets
63053 consistent gets
15474 physical reads
0 redo size
2890 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
10 rows processed
New Execution Plan After Updating Statistics:
Execution Plan
----------------------------------------------------------
Plan hash value: 3213592672
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 4110 | 236 (0)| 00:00:03 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | NESTED LOOPS | | | | | |
| 3 | NESTED LOOPS | | 10 | 4110 | 236 (0)| 00:00:03 |
| 4 | TABLE ACCESS BY INDEX ROWID| RFS_LISTINGS | 224 | 4704 | 11 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | RFS_LISTINGS_DISPLAY | | | 3 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | RFS_PROPERTIES_LD730_UNIQUE | 1 | | 1 (0)| 00:00:01 |
|* 7 | TABLE ACCESS BY INDEX ROWID | RFS_PROPERTIES | 1 | 390 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(ROWNUM<=10)
5 - access("LISTINGS"."DISPLAY"=1)
6 - access("LISTINGS"."LISTING_ID"="PROPERTIES"."LISTING_ID")
7 - filter(NLSSORT("PROPERTIES"."BOROUGH_CITY",'nls_sort=''BINARY_CI''')=HEXTORAW('6E657720796F726B
00') )
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
您的查询计划表明 Oracle 认为 RFS_LISTINGS 中有 140 行,其中 DISPLAY=1,而不是数百万行。为了获得更好的优化,您需要收集一些更好的统计数据。
为了澄清,我的意思是你应该运行 DBMS_STATS< /a> 包,以便 Oracle(不仅仅是您)知道它正在处理多少数据,例如:
首先与您的 DBA 交谈。
Your query plan suggests that Oracle believes there are 140 rows in RFS_LISTINGS where DISPLAY=1, not millions. To get better optimisation you need to gather some better stats.
To clarify, I mean you should run the DBMS_STATS package so that Oracle (not just you) knows how much data it is dealing with e.g.:
Talk to your DBA first.
您将需要在 WHERE 列上建立索引...
you will want indexing on the WHERE columns...
可能是因为在选择所有数据后才评估
ROWNUM
。另外,正如兰迪所说,您将需要索引,但根据计划,您似乎拥有它们。Could be because
ROWNUM
is evaluated after all data is selected. Plus, as Randy said, you're gonna need indexes, but by the plan, it looks like you have them.我认为您需要的是每个表上的 LISTING_ID 的索引。这应该可以防止发生全表扫描。
I think what you need is an index for LISTING_ID on each table. This should prevent the full table scan from occurring.