如何让“不存在”变得存在快点

发布于 2024-12-11 07:27:13 字数 1395 浏览 0 评论 0原文

我有一个问题。我的数据库是Oracle 9。这是我的SQL:

SELECT COUNT(distinct A.person_id)
        INTO p_record_count
        FROM A,
             B,
             F,
             T
       WHERE B.position_id = F.position_id **--Normal bussiness association**
         AND T.field2 = A.org_id           **--Normal bussiness association**
         AND F.Organization_Id = A.org_id  **--Normal bussiness association**
         AND A.person_id = F.person_id     **--Normal bussiness association**
         AND F.primary_flag = 'Y'          **--Normal bussiness association**
         and sysdate between F.effective_start_date and
             F.effective_end_date          **--Normal bussiness association**
         AND NOT EXISTS                    --very slow
                (SELECT log.person_id
                FROM cper.ehr_access_log log
               WHERE log.person_id = A.person_id
                 AND log.access_page = 'login.do'
                 AND trunc(log.access_time) BETWEEN
                     to_date(p_startDate, 'yyyy-mm-dd') AND
                     to_date(p_endDate, 'yyyy-mm-dd'))
         AND A.enable_flag = 'Y';            **--Normal bussiness association**

要求:获取未登录的人员。

逻辑:获取所有我能找到的人,然后减去登录的人。

我只有一张包含登录记录的表。

表cper.ehr_access_log有超过10M记录。这是一个日志表。此 SQL 大约需要 30 秒。

5s以内可以吗?谢谢你的帮助。

谢谢你~!我会努力的。周末愉快~ : )

I have a problem. My DB is Oracle 9. Here is my SQL:

SELECT COUNT(distinct A.person_id)
        INTO p_record_count
        FROM A,
             B,
             F,
             T
       WHERE B.position_id = F.position_id **--Normal bussiness association**
         AND T.field2 = A.org_id           **--Normal bussiness association**
         AND F.Organization_Id = A.org_id  **--Normal bussiness association**
         AND A.person_id = F.person_id     **--Normal bussiness association**
         AND F.primary_flag = 'Y'          **--Normal bussiness association**
         and sysdate between F.effective_start_date and
             F.effective_end_date          **--Normal bussiness association**
         AND NOT EXISTS                    --very slow
                (SELECT log.person_id
                FROM cper.ehr_access_log log
               WHERE log.person_id = A.person_id
                 AND log.access_page = 'login.do'
                 AND trunc(log.access_time) BETWEEN
                     to_date(p_startDate, 'yyyy-mm-dd') AND
                     to_date(p_endDate, 'yyyy-mm-dd'))
         AND A.enable_flag = 'Y';            **--Normal bussiness association**

Requirement: Get people who didn't login.

The logic: Get all people I can get, then minus the people who login.

I have only a table with the login records.

Table cper.ehr_access_log has more than 10M records. It is a log table. This SQL takes about 30seconds.

Within 5s is possible? Thanks for your help.

Thank you~!I will try.Have a good weekend~ : )

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

裸钻 2024-12-18 07:27:13

表cper.ehr_access_log有超过10M的记录。它是一个日志
表。这个sql大约需要30秒。 5秒之内可能吗?

我想说,如果不添加一些索引,这是不可能的。但您可能想尝试以下建议

  • 使用查询计划来识别热点。这可能是 ehr_access_log 所需的表扫描,您可以使用该参数来说服其他人您需要索引。

  • 尝试按照其他人的建议用外连接替换子查询

  • 您在 BT 上有连接> 但不是它们的谓词。您在查询中真的需要它们吗?

  • 您也许可以通过避免 trunc 调用来节省一些时间(不确定)。作为替代方案,将 p_startDate 修改为一天的开始,将 p_endDate 修改为一天的结束

Table cper.ehr_access_log has more than 10M records.It is a log
table.This sql takes about 30seconds. Within 5s is possible?

I would say it is not possible without adding some indexes. But you might want to try the following suggestions

  • Use a query plan to identify the hotspot. That will likely be tablescan required for ehr_access_log and you can probably use that argument to convince others that you need the index.

  • Try replacing subquery with outer join as others have suggested

  • You have joins on B and T but not predicates on them. Do you really need them in the query?

  • You might be able to save some time by avoiding trunc calls (not sure). As an alternative, modify p_startDate beginning of day and p_endDate to end of day

简单气质女生网名 2024-12-18 07:27:13

查询的“不存在”部分称为相关子查询。如果重写为 OUTER 连接,通常可以获得更好的性能。

图案就

select a from b where not exists (select 1 from c where b.a = c.a)

变成了

select a from b left outer join c on b.a = c.a where c.a is null

The 'not exists' part of your query is known as a correlated subquery. You can generally get better performance if you re-write as an OUTER join.

The pattern is

select a from b where not exists (select 1 from c where b.a = c.a)

becomes

select a from b left outer join c on b.a = c.a where c.a is null
迷路的信 2024-12-18 07:27:13

为了获得更好的性能,您必须INDEXcper.ehr_access_log

由于它是一个LOG表,因此每月计划根据此表进行INDEX在 person_id 上。这将减少一些查询时间。

For a better performance, you must INDEX the table cper.ehr_access_log

Since it is a LOG table, make a monthly schedule to INDEX this table based on person_id. This will reduce some amount of time in quering.

在梵高的星空下 2024-12-18 07:27:13

我认为您可以通过仅选择 DISTINCT log.person_id 记录来优化查询,这会减少日志中的记录量。

您还可以尝试从第一部分中排除:

SELECT COUNT(distinct A.person_id)
    INTO p_record_count
    FROM A,
         B,
         F,
         T
   LEFT OUTER JOIN cper.ehr_access_log log
     ON log.person_id = A.person_id
        AND log.access_page = 'login.do'
        AND trunc(log.access_time) BETWEEN
        to_date(p_startDate, 'yyyy-mm-dd') AND
        to_date(p_endDate, 'yyyy-mm-dd')
   WHERE B.position_id = F.position_id **--Normal bussiness association**
     AND T.field2 = A.org_id           **--Normal bussiness association**
     AND F.Organization_Id = A.org_id  **--Normal bussiness association**
     AND A.person_id = F.person_id     **--Normal bussiness association**
     AND F.primary_flag = 'Y'          **--Normal bussiness association**
     and sysdate between F.effective_start_date and
         F.effective_end_date          **--Normal bussiness association**
     AND log.person_id IS null
     AND A.enable_flag = 'Y';            **--Normal bussiness association**

I think you can optimize you query by selecting only DISTINCT log.person_id records, it will decrease amount of records from your log.

You can also try to make an EXCLUDE from the first part:

SELECT COUNT(distinct A.person_id)
    INTO p_record_count
    FROM A,
         B,
         F,
         T
   LEFT OUTER JOIN cper.ehr_access_log log
     ON log.person_id = A.person_id
        AND log.access_page = 'login.do'
        AND trunc(log.access_time) BETWEEN
        to_date(p_startDate, 'yyyy-mm-dd') AND
        to_date(p_endDate, 'yyyy-mm-dd')
   WHERE B.position_id = F.position_id **--Normal bussiness association**
     AND T.field2 = A.org_id           **--Normal bussiness association**
     AND F.Organization_Id = A.org_id  **--Normal bussiness association**
     AND A.person_id = F.person_id     **--Normal bussiness association**
     AND F.primary_flag = 'Y'          **--Normal bussiness association**
     and sysdate between F.effective_start_date and
         F.effective_end_date          **--Normal bussiness association**
     AND log.person_id IS null
     AND A.enable_flag = 'Y';            **--Normal bussiness association**
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文