计算此数据的最佳方法

发布于 2024-08-04 06:18:32 字数 818 浏览 1 评论 0原文

简而言之,我有 2 个表:

用户:

------------------------
UserID   |   Name
------------------------
0     a
1     b
2     c

CALLS:
------------------------
ToUser   |   Result
------------------------
0     ANSWERED
1     ENGAGED
1     ANSWERED
0     ANSWERED

等等(我使用数字参考来表示实际结果)

我有超过 200 万条记录,每条记录都详细说明了对特定客户的调用。目前,我正在使用 Case 语句来计算特定结果的每次重复,在我已经完成快速总计计数之后:

COUNT(DISTINCT l_call_log.line_id),
COALESCE (SUM(CASE WHEN l_call_log.line_result = 1 THEN 1 ELSE NULL END), 0) AS [Answered],
COALESCE (SUM(CASE WHEN l_call_log.line_result = 2 THEN 1 ELSE NULL END), 0) AS [Engaged], 
COALESCE (SUM(CASE WHEN l_call_log.line_result = 4 THEN 1 ELSE NULL END), 0) AS [Unanswered]

我是否在初始总计计数后对数据进行 3 次扫描?如果是这样,有没有一种方法可以一次性进行一次扫描并按结果计算调用次数?

谢谢。

In short I have 2 tables:

USERS:

------------------------
UserID   |   Name
------------------------
0     a
1     b
2     c

CALLS:
------------------------
ToUser   |   Result
------------------------
0     ANSWERED
1     ENGAGED
1     ANSWERED
0     ANSWERED

Etc, etc (i use a numerical referance for result in reality)

I have over 2 million records each detailing a call to a specific client. Currently I'm using Case statements to count each recurance of a particular result AFTER I have already done the quick total count:

COUNT(DISTINCT l_call_log.line_id),
COALESCE (SUM(CASE WHEN l_call_log.line_result = 1 THEN 1 ELSE NULL END), 0) AS [Answered],
COALESCE (SUM(CASE WHEN l_call_log.line_result = 2 THEN 1 ELSE NULL END), 0) AS [Engaged], 
COALESCE (SUM(CASE WHEN l_call_log.line_result = 4 THEN 1 ELSE NULL END), 0) AS [Unanswered]

Am I doing 3 scans of the data after my inital total count? if so, is there a way I can do one sweep and count the calls as-per-result in one go?

Thanks.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

尬尬 2024-08-11 06:18:32

这将需要一次全表扫描。

编辑:没有足够的信息来回答;因为我之前错过了重复删除(DISTINCT),所以我们无法判断将使用什么策略......尤其是在不知道数据库引擎的情况下。

在几乎每个主要查询引擎中,每个聚合函数都是针对每行每列执行的,并且它可能使用缓存的结果(例如 COUNT(*))。

line_result 是否已建立索引?如果是这样,您可以利用更好的查询(GROUP BY + COUNT(*))来利用索引统计信息,尽管我不确定这是否值得,具体取决于查询中的其他表。

This would take one full table scan.

EDIT: There's not enough information to answer; because the duplicate removal (DISTINCT) that I missed earlier, we can't tell what strategy that would be used.... especially without knowing the database engine.

In just about every major query engine, each aggregate function is executed per each column per each row, and it may use a cached result (such as COUNT(*) for example).

Is line_result indexed? If so, you could leverage a better query (GROUP BY + COUNT(*) to take advantage of index statistics, though I'm not sure if that's worthwhile depending on your other tables in the query.

撩心不撩汉 2024-08-11 06:18:32

SQL 中有GROUP BY 结构。尝试:

SELECT COUNT(DISTINCT l_call_log.line_id)
  GROUP BY l_call_log.line_result

There is the GROUP BY construction in SQL. Try:

SELECT COUNT(DISTINCT l_call_log.line_id)
  GROUP BY l_call_log.line_result
倾城泪 2024-08-11 06:18:32

我猜这是一个表扫描,因为你没有任何依赖的子查询。对查询运行解释以确保确定。

I would guess it's a table scan, since you don't have any depending subqueries. Run explain on the query to be sure.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文