如何让“不存在”变得存在快点
我有一个问题。我的数据库是Oracle 9。这是我的SQL:
SELECT COUNT(distinct A.person_id)
INTO p_record_count
FROM A,
B,
F,
T
WHERE B.position_id = F.position_id **--Normal bussiness association**
AND T.field2 = A.org_id **--Normal bussiness association**
AND F.Organization_Id = A.org_id **--Normal bussiness association**
AND A.person_id = F.person_id **--Normal bussiness association**
AND F.primary_flag = 'Y' **--Normal bussiness association**
and sysdate between F.effective_start_date and
F.effective_end_date **--Normal bussiness association**
AND NOT EXISTS --very slow
(SELECT log.person_id
FROM cper.ehr_access_log log
WHERE log.person_id = A.person_id
AND log.access_page = 'login.do'
AND trunc(log.access_time) BETWEEN
to_date(p_startDate, 'yyyy-mm-dd') AND
to_date(p_endDate, 'yyyy-mm-dd'))
AND A.enable_flag = 'Y'; **--Normal bussiness association**
要求:获取未登录的人员。
逻辑:获取所有我能找到的人,然后减去登录的人。
我只有一张包含登录记录的表。
表cper.ehr_access_log有超过10M记录。这是一个日志表。此 SQL 大约需要 30 秒。
5s以内可以吗?谢谢你的帮助。
谢谢你~!我会努力的。周末愉快~ : )
I have a problem. My DB is Oracle 9. Here is my SQL:
SELECT COUNT(distinct A.person_id)
INTO p_record_count
FROM A,
B,
F,
T
WHERE B.position_id = F.position_id **--Normal bussiness association**
AND T.field2 = A.org_id **--Normal bussiness association**
AND F.Organization_Id = A.org_id **--Normal bussiness association**
AND A.person_id = F.person_id **--Normal bussiness association**
AND F.primary_flag = 'Y' **--Normal bussiness association**
and sysdate between F.effective_start_date and
F.effective_end_date **--Normal bussiness association**
AND NOT EXISTS --very slow
(SELECT log.person_id
FROM cper.ehr_access_log log
WHERE log.person_id = A.person_id
AND log.access_page = 'login.do'
AND trunc(log.access_time) BETWEEN
to_date(p_startDate, 'yyyy-mm-dd') AND
to_date(p_endDate, 'yyyy-mm-dd'))
AND A.enable_flag = 'Y'; **--Normal bussiness association**
Requirement: Get people who didn't login.
The logic: Get all people I can get, then minus the people who login.
I have only a table with the login records.
Table cper.ehr_access_log has more than 10M records. It is a log table. This SQL takes about 30seconds.
Within 5s is possible? Thanks for your help.
Thank you~!I will try.Have a good weekend~ : )
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
我想说,如果不添加一些索引,这是不可能的。但您可能想尝试以下建议
使用查询计划来识别热点。这可能是
ehr_access_log
所需的表扫描,您可以使用该参数来说服其他人您需要索引。尝试按照其他人的建议用外连接替换子查询
您在
B
和T
上有连接> 但不是它们的谓词。您在查询中真的需要它们吗?您也许可以通过避免
trunc
调用来节省一些时间(不确定)。作为替代方案,将p_startDate
修改为一天的开始,将p_endDate
修改为一天的结束I would say it is not possible without adding some indexes. But you might want to try the following suggestions
Use a query plan to identify the hotspot. That will likely be tablescan required for
ehr_access_log
and you can probably use that argument to convince others that you need the index.Try replacing subquery with outer join as others have suggested
You have joins on
B
andT
but not predicates on them. Do you really need them in the query?You might be able to save some time by avoiding
trunc
calls (not sure). As an alternative, modifyp_startDate
beginning of day andp_endDate
to end of day查询的“不存在”部分称为相关子查询。如果重写为 OUTER 连接,通常可以获得更好的性能。
图案就
变成了
The 'not exists' part of your query is known as a correlated subquery. You can generally get better performance if you re-write as an OUTER join.
The pattern is
becomes
为了获得更好的性能,您必须INDEX表
cper.ehr_access_log
由于它是一个LOG表,因此每月计划根据此表进行INDEX在
person_id
上。这将减少一些查询时间。For a better performance, you must INDEX the table
cper.ehr_access_log
Since it is a LOG table, make a monthly schedule to INDEX this table based on
person_id
. This will reduce some amount of time in quering.我认为您可以通过仅选择 DISTINCT log.person_id 记录来优化查询,这会减少日志中的记录量。
您还可以尝试从第一部分中排除:
I think you can optimize you query by selecting only DISTINCT log.person_id records, it will decrease amount of records from your log.
You can also try to make an EXCLUDE from the first part: