Using multiple left joins to calculate averages and counts

Posted 2024-08-07 12:41:20

I am trying to figure out how to use multiple left outer joins to calculate average scores and number of cards. I have the following schema and test data. Each deck has 0 or more scores and 0 or more cards. I need to calculate an average score and a card count for each deck. I'm using MySQL for convenience, but I eventually want this to run on SQLite on an Android phone.

mysql> select * from deck;
+----+-------+
| id | name  |
+----+-------+
|  1 | one   | 
|  2 | two   | 
|  3 | three | 
+----+-------+
mysql> select * from score;
+---------+-------+---------------------+--------+
| scoreId | value | date                | deckId |
+---------+-------+---------------------+--------+
|       1 |  6.58 | 2009-10-05 20:54:52 |      1 | 
|       2 |     7 | 2009-10-05 20:54:58 |      1 | 
|       3 |  4.67 | 2009-10-05 20:55:04 |      1 | 
|       4 |     7 | 2009-10-05 20:57:38 |      2 | 
|       5 |     7 | 2009-10-05 20:57:41 |      2 | 
+---------+-------+---------------------+--------+
mysql> select * from card;
+--------+-------+------+--------+
| cardId | front | back | deckId |
+--------+-------+------+--------+
|      1 | fron  | back |      2 | 
|      2 | fron  | back |      1 | 
|      3 | f1    | b2   |      1 | 
+--------+-------+------+--------+

I run the following query...


mysql> select deck.name, sum(score.value)/count(score.value) "Ave", 
    ->   count(card.front) "Count" 
    -> from deck 
    -> left outer join score on deck.id=score.deckId 
    -> left outer join card on deck.id=card.deckId
    -> group by deck.id;

+-------+-----------------+-------+
| name  | Ave             | Count |
+-------+-----------------+-------+
| one   | 6.0833333333333 |     6 | 
| two   |               7 |     2 | 
| three |            NULL |     0 | 
+-------+-----------------+-------+

... and I get the right answer for the average, but the wrong answer for the number of cards. Can someone tell me what I am doing wrong before I pull my hair out?

Thanks!

John

Comments (4)

贪恋 2024-08-14 12:41:20

It's running what you're asking--it's joining card 2 and 3 to scores 1, 2, and 3--creating a count of 6 (2 * 3). In card 1's case, it joins to scores 4 and 5, creating a count of 2 (1 * 2).

If you just want a count of cards, which is what your query is trying to produce, use COUNT(DISTINCT card.cardId).
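
The multiplication described above can be checked with a small sketch. It assumes Python's built-in sqlite3 module and an in-memory copy of the question's tables (the score date column is left out because it plays no part here); the column aliases inflated and cards are made up for illustration.

```python
import sqlite3

# In-memory copy of the question's test data (score.date omitted as an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER);
    CREATE TABLE card  (cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck  VALUES (1,'one'), (2,'two'), (3,'three');
    INSERT INTO score VALUES (1,6.58,1), (2,7,1), (3,4.67,1), (4,7,2), (5,7,2);
    INSERT INTO card  VALUES (1,'fron','back',2), (2,'fron','back',1), (3,'f1','b2',1);
""")

# Same joins as the question; the plain COUNT sees every score/card pairing,
# while COUNT(DISTINCT ...) collapses the repeats back to one per card.
rows = conn.execute("""
    SELECT deck.name,
           COUNT(card.cardId)          AS inflated,
           COUNT(DISTINCT card.cardId) AS cards
    FROM deck
    LEFT OUTER JOIN score ON deck.id = score.deckId
    LEFT OUTER JOIN card  ON deck.id = card.deckId
    GROUP BY deck.id
    ORDER BY deck.id
""").fetchall()

for name, inflated, cards in rows:
    print(name, inflated, cards)
```

For deck one the plain count is 3 scores times 2 cards, which is where the 6 in the question comes from.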

逆蝶 2024-08-14 12:41:20
select deck.name, coalesce(x.ave,0) as ave, count(card.cardId) as count -- count a card column, not count(*): count(*) would also count the all-NULL row of a deck with no cards. count(card.*) states the intent but is PostgreSQL syntax that MySQL and SQLite do not accept
from deck
left join -- flatten the scores to one average row per deck first
(
    select deckId, sum(value)/count(*) as ave -- count(*) counts the score rows in each group
    from score
    group by deckId
) as x on x.deckId = deck.id
left outer join card on card.deckId = deck.id -- then join the flattened result to the cards
group by deck.id, x.ave, deck.name
order by deck.id

[EDIT]

SQL has a built-in average function; just use this:

select deckId, avg(value) as ave
from score 
group by deckId
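
As a rough sketch of how the avg() subquery slots into the full query, the following runs it on SQLite through Python's built-in sqlite3 module. The in-memory tables mirror the question's data (the score date column is omitted), the alias cards is made up for illustration, and the average is left as NULL for decks without scores instead of being coalesced to 0.

```python
import sqlite3

# In-memory copy of the question's test data (score.date omitted as an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER);
    CREATE TABLE card  (cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck  VALUES (1,'one'), (2,'two'), (3,'three');
    INSERT INTO score VALUES (1,6.58,1), (2,7,1), (3,4.67,1), (4,7,2), (5,7,2);
    INSERT INTO card  VALUES (1,'fron','back',2), (2,'fron','back',1), (3,'f1','b2',1);
""")

# The derived table x holds at most one row per deck, so joining card afterwards
# cannot multiply card rows by score rows.
rows = conn.execute("""
    SELECT deck.name, x.ave, COUNT(card.cardId) AS cards
    FROM deck
    LEFT JOIN (SELECT deckId, AVG(value) AS ave
               FROM score
               GROUP BY deckId) AS x ON x.deckId = deck.id
    LEFT OUTER JOIN card ON card.deckId = deck.id
    GROUP BY deck.id, deck.name, x.ave
    ORDER BY deck.id
""").fetchall()

for name, ave, cards in rows:
    print(name, ave, cards)
```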

深海夜未眠 2024-08-14 12:41:20

What's going wrong is that you're creating a Cartesian product between score and card.

Here's how it works: when you join deck to score, you may have multiple rows match. Then each of these multiple rows is joined to all of the matching rows in card. There's no condition preventing that from happening, and the default join behavior when no condition restricts it is to join all rows in one table to all rows in another table.

To see it in action, try this query, without the group by:

select * 
from deck 
left outer join score on deck.id=score.deckId 
left outer join card on deck.id=card.deckId;

You'll see a lot of repeated data in the columns that come from score and card. When you calculate the AVG() over data that has repeats in it, the redundant values magically disappear (as long as the values are repeated uniformly). But when you COUNT() or SUM() them, the totals are way off.
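
To make the effect concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory copy of the question's data (the score date column is omitted as a simplification). It counts the rows of the ungrouped join and then compares AVG() with SUM() over the same joins.

```python
import sqlite3

# In-memory copy of the question's test data (score.date omitted as an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER);
    CREATE TABLE card  (cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck  VALUES (1,'one'), (2,'two'), (3,'three');
    INSERT INTO score VALUES (1,6.58,1), (2,7,1), (3,4.67,1), (4,7,2), (5,7,2);
    INSERT INTO card  VALUES (1,'fron','back',2), (2,'fron','back',1), (3,'f1','b2',1);
""")

# Ungrouped join: every score of a deck is paired with every card of that deck.
joined = conn.execute("""
    SELECT deck.name, score.scoreId, card.cardId
    FROM deck
    LEFT OUTER JOIN score ON deck.id = score.deckId
    LEFT OUTER JOIN card  ON deck.id = card.deckId
""").fetchall()
print(len(joined))  # 9 rows: 3*2 for deck one, 2*1 for deck two, 1 for deck three

# AVG survives the uniform repetition; SUM is doubled for deck one (two cards).
stats = conn.execute("""
    SELECT deck.name, AVG(score.value), SUM(score.value)
    FROM deck
    LEFT OUTER JOIN score ON deck.id = score.deckId
    LEFT OUTER JOIN card  ON deck.id = card.deckId
    GROUP BY deck.id
    ORDER BY deck.id
""").fetchall()
for name, average, total in stats:
    print(name, average, total)
```

Deck one's true score total is 18.25, but the joined rows add up to twice that because each score appears once per card.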

There may be remedies for inadvertent Cartesian products. In your case, you can use COUNT(DISTINCT) to compensate:

select deck.name, avg(score.value) "Ave", count(DISTINCT card.cardId) "Count" 
from deck 
left outer join score on deck.id=score.deckId 
left outer join card on deck.id=card.deckId
group by deck.id;

This solution doesn't solve all cases of inadvertent Cartesian products. The more general-purpose solution is to break it up into two separate queries:

select deck.name, avg(score.value) "Ave"
from deck 
left outer join score on deck.id=score.deckId 
group by deck.id;

select deck.name, count(card.front) "Count" 
from deck 
left outer join card on deck.id=card.deckId
group by deck.id;

Not every task in database programming must be done in a single query. It can even be more efficient (as well as simpler, easier to modify, and less error-prone) to use individual queries when you need multiple statistics.
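
One possible way to combine the two separate queries in application code is sketched below with Python's built-in sqlite3 module. The in-memory tables copy the question's data (score date column omitted), the names averages, counts, and report are hypothetical, and the merge assumes deck names are unique; keying on deck.id would be safer if they are not.

```python
import sqlite3

# In-memory copy of the question's test data (score.date omitted as an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deck  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE score (scoreId INTEGER PRIMARY KEY, value REAL, deckId INTEGER);
    CREATE TABLE card  (cardId INTEGER PRIMARY KEY, front TEXT, back TEXT, deckId INTEGER);
    INSERT INTO deck  VALUES (1,'one'), (2,'two'), (3,'three');
    INSERT INTO score VALUES (1,6.58,1), (2,7,1), (3,4.67,1), (4,7,2), (5,7,2);
    INSERT INTO card  VALUES (1,'fron','back',2), (2,'fron','back',1), (3,'f1','b2',1);
""")

# Query 1: average score per deck (None for decks with no scores).
averages = dict(conn.execute("""
    SELECT deck.name, AVG(score.value)
    FROM deck
    LEFT OUTER JOIN score ON deck.id = score.deckId
    GROUP BY deck.id
"""))

# Query 2: card count per deck (0 for decks with no cards).
counts = dict(conn.execute("""
    SELECT deck.name, COUNT(card.cardId)
    FROM deck
    LEFT OUTER JOIN card ON deck.id = card.deckId
    GROUP BY deck.id
"""))

# Merge the two result sets by deck name (assumed unique in this sketch).
report = {name: (averages[name], counts[name]) for name in averages}
for name, (average, cards) in report.items():
    print(name, average, cards)
```

Because each query joins deck to only one child table, neither result is affected by the other table's row count.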

触ぅ动初心 2024-08-14 12:41:20

Using left joins isn't a good approach, in my opinion. Here's a standard SQL query for the result you want.

select
  name,
  (select avg(value) from score where score.deckId = deck.id) as Ave,
  (select count(*) from card where card.deckId = deck.id) as "Count"
from deck;