Joining another table after GROUP BY and COUNT

Posted on 2024-09-06 12:28:20

I'm trying to make sense of the right way to use JOIN, COUNT(*), and GROUP BY to do a pretty simple query. I've actually gotten it to work (see below) but from what I've read, I'm using an extra GROUP BY that I shouldn't be.

(Note: The problem below isn't my actual problem (which deals with more complicated tables), but I've tried to come up with an analogous problem)

I have two tables:

Table: Person
-------------
key  name     cityKey
1    Alice    1
2    Bob      2
3    Charles  2
4    David    1

Table: City
-------------
key  name
1    Albany
2    Berkeley
3    Chico

I'd like to do a query on the Person table (with some WHERE clause) that returns

  • the number of matching people in each city
  • the key for the city
  • the name of the city.

If I do

SELECT COUNT(Person.key) AS count, City.key AS cityKey, City.name AS cityName
FROM Person 
LEFT JOIN City ON Person.cityKey = City.key 
GROUP BY Person.cityKey, City.name

I get the result that I want

count   cityKey   cityName
2       1         Albany
2       2         Berkeley

However, I've read that throwing in that last part of the GROUP BY clause (City.name) just to make it work is wrong.

So what's the right way to do this? I've been trying to google for an answer, but I feel like there's something fundamental that I'm just not getting.

Comments (4)

西瓜 2024-09-13 12:28:20

I don't think it's "wrong" in this case, because you've got a one-to-one relationship between city name and city key. You could rewrite the query so that a sub-select computes the count of persons per city key and is then joined back to the City table for the name, but it's debatable whether that would be better. It's a matter of style and opinion, I guess.

select PC.ct, City.key, City.name
  from City
  join (select count(Person.key) ct, cityKey key from Person group by cityKey) PC
    on City.key = PC.key

if my SQL isn't too rusty :-)
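
As a rough check of the claim above, here is a minimal sketch that loads the question's sample data into an in-memory SQLite database through Python's sqlite3 module and runs both shapes of the query. The choice of SQLite, the quoting of "key", and the variable names are assumptions made for illustration, not part of the original posts.

```python
import sqlite3

# In-memory database holding the sample tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE City   ("key" INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Person ("key" INTEGER PRIMARY KEY, name TEXT, cityKey INTEGER);
INSERT INTO City   VALUES (1, 'Albany'), (2, 'Berkeley'), (3, 'Chico');
INSERT INTO Person VALUES (1, 'Alice', 1), (2, 'Bob', 2),
                          (3, 'Charles', 2), (4, 'David', 1);
""")

# Shape 1: join first, then group by both city columns.
grouped = conn.execute("""
SELECT COUNT(Person."key"), City."key", City.name
FROM Person
LEFT JOIN City ON Person.cityKey = City."key"
GROUP BY City."key", City.name
ORDER BY City."key"
""").fetchall()

# Shape 2: count per cityKey in a sub-select, then join to City for the name.
subselect = conn.execute("""
SELECT PC.ct, City."key", City.name
FROM City
JOIN (SELECT COUNT("key") AS ct, cityKey
      FROM Person
      GROUP BY cityKey) AS PC
  ON City."key" = PC.cityKey
ORDER BY City."key"
""").fetchall()

print(grouped)    # [(2, 1, 'Albany'), (2, 2, 'Berkeley')]
print(subselect)  # [(2, 1, 'Albany'), (2, 2, 'Berkeley')]
```

On this data both shapes return the same rows, which fits the view that the choice is mostly a matter of style.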

最丧也最甜 2024-09-13 12:28:20

...I've read that throwing in that last part of the GROUP BY clause (City.name) just to make it work is wrong.

You misunderstand; you've got it backwards.
Standard SQL requires you to specify in the GROUP BY all the columns mentioned in the SELECT that are not wrapped in aggregate functions. If you don't want certain columns in the GROUP BY, wrap them in aggregate functions. Depending on the database, you could use the analytic/windowing function OVER...

However, MySQL and SQLite provide the "feature" where you can omit these columns from the GROUP BY, which leads to no end of "why doesn't this port from MySQL to fill_in_the_blank database?!" questions on Stack Overflow and numerous other sites and forums.
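
The two alternatives mentioned above (wrapping a column in an aggregate, or using a window function with OVER) might look roughly like this. It is a sketch only: it assumes the question's sample data, uses Python's sqlite3 module, and needs an SQLite build with window function support (3.25 or later). The variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE City   ("key" INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Person ("key" INTEGER PRIMARY KEY, name TEXT, cityKey INTEGER);
INSERT INTO City   VALUES (1, 'Albany'), (2, 'Berkeley'), (3, 'Chico');
INSERT INTO Person VALUES (1, 'Alice', 1), (2, 'Bob', 2),
                          (3, 'Charles', 2), (4, 'David', 1);
""")

# Alternative 1: keep City.name out of GROUP BY by wrapping it in MAX().
with_aggregate = conn.execute("""
SELECT COUNT(Person."key"), City."key", MAX(City.name)
FROM Person
JOIN City ON Person.cityKey = City."key"
GROUP BY City."key"
ORDER BY City."key"
""").fetchall()

# Alternative 2: no GROUP BY at all. The window function counts rows per
# city on every joined row, and DISTINCT collapses the repeated rows.
with_window = conn.execute("""
SELECT DISTINCT COUNT(*) OVER (PARTITION BY City."key"),
       City."key", City.name
FROM Person
JOIN City ON Person.cityKey = City."key"
ORDER BY City."key"
""").fetchall()

print(with_aggregate)  # [(2, 1, 'Albany'), (2, 2, 'Berkeley')]
print(with_window)     # [(2, 1, 'Albany'), (2, 2, 'Berkeley')]
```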

我三岁 2024-09-13 12:28:20

However, I've read that throwing in that last part of the GROUP BY clause (City.name) just to make it work is wrong.

It's not wrong. You have to understand how the Query Optimizer sees your query. The order in which it is parsed is what requires you to "throw the last part in." The optimizer sees your query in something akin to this order:

  • the required tables are joined
  • the composite dataset is filtered through the WHERE clause
  • the remaining rows are chopped into groups by the GROUP BY clause, and aggregated
  • they are then filtered again, through the HAVING clause
  • finally operated on, by SELECT / ORDER BY, UPDATE or DELETE.

The point here is that it's not that the GROUP BY has to name all the columns in the SELECT, but in fact the opposite: the SELECT cannot include any non-aggregated column that is not already in the GROUP BY.
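
The processing order listed above can be seen in a small experiment: WHERE removes rows before the groups are formed, and HAVING removes whole groups afterwards. This is a sketch under assumptions (the question's sample data, Python's sqlite3 module, and an invented filter on Person.name and threshold of 2), not anything taken from the original posts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE City   ("key" INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Person ("key" INTEGER PRIMARY KEY, name TEXT, cityKey INTEGER);
INSERT INTO City   VALUES (1, 'Albany'), (2, 'Berkeley'), (3, 'Chico');
INSERT INTO Person VALUES (1, 'Alice', 1), (2, 'Bob', 2),
                          (3, 'Charles', 2), (4, 'David', 1);
""")

# Same query twice; only the WHERE clause differs.
QUERY = """
SELECT COUNT(Person."key"), City."key", City.name
FROM Person
JOIN City ON Person.cityKey = City."key"
{where}
GROUP BY City."key", City.name
HAVING COUNT(Person."key") >= 2
ORDER BY City."key"
"""

# No WHERE: both cities have two people, so both groups pass HAVING.
no_where = conn.execute(QUERY.format(where="")).fetchall()

# WHERE drops Alice before grouping, so Albany's group has one row left
# and is then rejected by HAVING.
with_where = conn.execute(
    QUERY.format(where="WHERE Person.name <> 'Alice'")
).fetchall()

print(no_where)    # [(2, 1, 'Albany'), (2, 2, 'Berkeley')]
print(with_where)  # [(2, 2, 'Berkeley')]
```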

同尘 2024-09-13 12:28:20

Your query would only work on a permissive database such as MySQL, because you group on Person.cityKey but select city.key. Stricter databases would require you to use an aggregate like min(city.key), or to add City.key to the group by clause.

Because the combination of city name and city key is unique, the following are equivalent:

select    count(person.key), min(city.key), min(city.name)
...
group by  person.citykey

Or:

select    count(person.key), city.key, city.name
...
group by  person.citykey, city.key, city.name

Or:

select    count(person.key), city.key, max(city.name)
...
group by  city.key

All rows in the group will have the same city name and key, so it doesn't matter if you use the max or min aggregate.
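
To see the three variants side by side, the sketch below runs them against the question's sample data with Python's sqlite3 module. The FROM / LEFT JOIN part elided as "..." above is assumed to be the one from the question, and the ORDER BY and the labels are additions for comparison only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE City   ("key" INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Person ("key" INTEGER PRIMARY KEY, name TEXT, cityKey INTEGER);
INSERT INTO City   VALUES (1, 'Albany'), (2, 'Berkeley'), (3, 'Chico');
INSERT INTO Person VALUES (1, 'Alice', 1), (2, 'Bob', 2),
                          (3, 'Charles', 2), (4, 'David', 1);
""")

# Assumed to match the question's FROM / LEFT JOIN.
FROM_JOIN = 'FROM Person LEFT JOIN City ON Person.cityKey = City."key"'

variants = {
    "aggregate both columns":
        f'SELECT COUNT(Person."key"), MIN(City."key"), MIN(City.name) '
        f'{FROM_JOIN} GROUP BY Person.cityKey ORDER BY 2',
    "group by every column":
        f'SELECT COUNT(Person."key"), City."key", City.name '
        f'{FROM_JOIN} GROUP BY Person.cityKey, City."key", City.name ORDER BY 2',
    "group by key, aggregate name":
        f'SELECT COUNT(Person."key"), City."key", MAX(City.name) '
        f'{FROM_JOIN} GROUP BY City."key" ORDER BY 2',
}

# Run each variant and collect its rows (ORDER BY 2 sorts by the city key).
results = {label: conn.execute(sql).fetchall() for label, sql in variants.items()}

for label, rows in results.items():
    print(label, rows)  # each prints [(2, 1, 'Albany'), (2, 2, 'Berkeley')]
```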

P.S. If you'd like to count only distinct persons, even if they have multiple rows, try:

count(DISTINCT person.key)

instead of

count(person.key)
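
The difference only shows up when a join repeats a person. The sketch below adds a hypothetical Phone table (not part of the question) so that Alice appears twice after the join, then compares the two counts using Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person ("key" INTEGER PRIMARY KEY, name TEXT, cityKey INTEGER);
INSERT INTO Person VALUES (1, 'Alice', 1), (2, 'Bob', 2),
                          (3, 'Charles', 2), (4, 'David', 1);

-- Hypothetical one-to-many table: Alice has two phone numbers.
CREATE TABLE Phone (personKey INTEGER, number TEXT);
INSERT INTO Phone VALUES (1, '555-0100'), (1, '555-0101'), (2, '555-0103'),
                         (3, '555-0104'), (4, '555-0102');
""")

# COUNT counts joined rows; COUNT(DISTINCT ...) counts each person once.
rows = conn.execute("""
SELECT Person.cityKey,
       COUNT(Person."key"),
       COUNT(DISTINCT Person."key")
FROM Person
JOIN Phone ON Phone.personKey = Person."key"
GROUP BY Person.cityKey
ORDER BY Person.cityKey
""").fetchall()

print(rows)  # [(1, 3, 2), (2, 2, 2)]
```

City 1 reports three joined rows but only two distinct people, because Alice's two phone numbers duplicate her row.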