Oracle/SQL - 需要帮助优化此联合/组/计数查询

发布于 2024-10-14 19:53:41 字数 1119 浏览 5 评论 0原文

我正在尝试尽可能优化这个查询。在我的测试表中，这也正是我想要的，但在实时表上，这需要很长时间才能运行。

select   THING_, 
            count(case STATUS_ when '_Good_' then 1 end) as GOOD, 
            count(case STATUS_ when '_Bad_' then 1 end) as BAD,
            count(case STATUS_ when '_Bad_' then 1 end) / count(case STATUS_ when '_Good_' then 1 end) * 100 as FAIL_PERCENT
    from    
                (
                select      THING_,
                            STATUS_,
                    from    <good table>
                    where   TIMESTAMP_ > (sysdate - 1) and
                            STATUS_ = '_Good_' and
                            upper(THING_) like '%TEST%'

                UNION ALL           

                select      THING_,
                            STATUS_,
                    from    <bad table>
                    where   TIMESTAMP_ > (sysdate - 1) and
                            STATUS_ = '_Bad_' and
                            THING_THING_ like '%TEST%'
                ) u
    group by THING_

我认为通过查看查询，我想要做什么应该是不言自明的，但如果没有或者需要其他信息，请告诉我，我将发布一些示例表。

谢谢！

原文

I'm trying to optimize this query however possible. In my test tables this does exactly what I want it too, but on the live tables this takes a VERY long time to run.

select   THING_, 
            count(case STATUS_ when '_Good_' then 1 end) as GOOD, 
            count(case STATUS_ when '_Bad_' then 1 end) as BAD,
            count(case STATUS_ when '_Bad_' then 1 end) / count(case STATUS_ when '_Good_' then 1 end) * 100 as FAIL_PERCENT
    from    
                (
                select      THING_,
                            STATUS_,
                    from    <good table>
                    where   TIMESTAMP_ > (sysdate - 1) and
                            STATUS_ = '_Good_' and
                            upper(THING_) like '%TEST%'

                UNION ALL           

                select      THING_,
                            STATUS_,
                    from    <bad table>
                    where   TIMESTAMP_ > (sysdate - 1) and
                            STATUS_ = '_Bad_' and
                            THING_THING_ like '%TEST%'
                ) u
    group by THING_

I think by looking at the query it should be self explanatory what I want to do, but if not or if additional info is needed please let me know and I will post some sample tables.

Thanks!

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

澜川若宁 2024-10-21 19:53:41

在两个表中的 (STATUS_, TIMESTAMP_) 上创建复合索引。

回复收藏 0 原文

未蓝澄海的烟 2024-10-21 19:53:41

(1) 查看执行计划应该始终是诊断 SQL 性能问题的第一步

(2) 编写的查询可能存在的问题是，因为 SYSDATE 是一个函数，直到执行时（即执行后）才计算值计划确定），优化器无法利用时间戳列上的直方图来评估索引的效用。我已经看到这会导致糟糕的优化器决策。如果您可以找到一种方法来首先计算日期，然后将其作为绑定或文字输入到查询中，这可能会有所帮助，尽管这实际上只是一个猜测。

(3) 也许构建查询的更好的总体方法是作为每个表上的聚合查询之间的联接（可能是完全外联接）。

SELECT COALESCE(g.thing_,b.thing_), COALESCE(good_count,0), COALESCE(bad_count,0)
  FROM (SELECT thing_,count(*) good_count from good_table WHERE ... GROUP BY thing_) g
       FULL OUTER JOIN
       (SELECT thing_,count(*) bad_count from bad_table WHERE ... GROUP BY thing_) b
       ON b.thing_ = g.thing_

（不得不说，当你还有一个状态列来指示“好”或“坏”时，你有两个单独的表似乎有点奇怪。但也许我过度解释了。）

(1) Looking at the execution plan should always be your first step in diagnosing SQL performance issues

(2) A possible problem with the query as written is that, because SYSDATE is a function that is not evaluated until execution time (i.e. after the execution plan is determined), the optimizer cannot make use of histograms on the timestamp column to evaluate the utility of an index. I have seen that lead to bad optimizer decisions. If you can work out a way to calculate the date first then feed it into the query as a bind or a literal, that may help, although this is really just a guess.

(3) Maybe a better overall way to structure the query would be as a join (possibly full outer join) between aggregate queries on each of the tables.

SELECT COALESCE(g.thing_,b.thing_), COALESCE(good_count,0), COALESCE(bad_count,0)
  FROM (SELECT thing_,count(*) good_count from good_table WHERE ... GROUP BY thing_) g
       FULL OUTER JOIN
       (SELECT thing_,count(*) bad_count from bad_table WHERE ... GROUP BY thing_) b
       ON b.thing_ = g.thing_

(Have to say, it seems kind of weird that you have two separate tables when you also have a status column to indicate "good" or "bad". But maybe I am overinterpreting.)

回复收藏 0 原文

初与友歌 2024-10-21 19:53:41

您尝试过使用分析函数吗？它可能会减少一些执行时间。这是一个例子：

select distinct col1, col2, col3
(Select col1,
       count(col2) over (partition by col1) col2,
       count(col3) over (partition by col1) col3
from table
)

是这样的。

Have you tried analytical function to use? It might decrease some time execution. Here you are an example:

select distinct col1, col2, col3
(Select col1,
       count(col2) over (partition by col1) col2,
       count(col3) over (partition by col1) col3
from table
)

Its something like that.

回复收藏 0 原文

~没有更多了~