我可以在 SQL 中执行 max(count(*)) 吗?

发布于 2024-08-24 19:05:24 字数 579 浏览 13 评论 0原文

这是我的代码:

select yr,count(*)
from movie
join casting on casting.movieid=movie.id
join actor on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr;

这是问题:

那是“约翰·特拉沃尔塔”最忙碌的几年。显示他每年制作的电影数量。

这是表结构:

movie(id, title, yr, score, votes, director)
actor(id, name)
casting(movieid, actorid, ord)

这是我得到的输出:

yr      count(*)
1976    1
1977    1
1978    1
1981    1
1994    1
-- etc.

我需要获取 count(*) 为 max 的行。 我该怎么做?

Here's my code:

select yr,count(*)
from movie
join casting on casting.movieid=movie.id
join actor on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr;

Here's the question:

Which were the busiest years for 'John Travolta'. Show the number of movies he made for each year.

Here's the table structure:

movie(id, title, yr, score, votes, director)
actor(id, name)
casting(movieid, actorid, ord)

This is the output I am getting:

yr      count(*)
1976    1
1977    1
1978    1
1981    1
1994    1
-- etc.

I need to get the rows for which count(*) is max.
How do I do this?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(12

秉烛思 2024-08-31 19:05:25

使用:

  SELECT m.yr, 
         COUNT(*) AS num_movies
    FROM MOVIE m
    JOIN CASTING c ON c.movieid = m.id
    JOIN ACTOR a ON a.id = c.actorid
                AND a.name = 'John Travolta'
GROUP BY m.yr
ORDER BY num_movies DESC, m.yr DESC

num_movies DESC 排序会将最高值放在结果集的顶部。如果许多年的计数相同,m.yr 会将最近的一年放在顶部...直到下一个 num_movies 值发生变化。

我可以使用 MAX(COUNT(*)) 吗?


不可以,您不能在同一 SELECT 子句中将聚合函数分层。内部聚合必须在子查询中执行。 IE:

SELECT MAX(y.num)
  FROM (SELECT COUNT(*) AS num
          FROM TABLE x) y

Use:

  SELECT m.yr, 
         COUNT(*) AS num_movies
    FROM MOVIE m
    JOIN CASTING c ON c.movieid = m.id
    JOIN ACTOR a ON a.id = c.actorid
                AND a.name = 'John Travolta'
GROUP BY m.yr
ORDER BY num_movies DESC, m.yr DESC

Ordering by num_movies DESC will put the highest values at the top of the resultset. If numerous years have the same count, the m.yr will place the most recent year at the top... until the next num_movies value changes.

Can I use a MAX(COUNT(*)) ?


No, you can not layer aggregate functions on top of one another in the same SELECT clause. The inner aggregate would have to be performed in a subquery. IE:

SELECT MAX(y.num)
  FROM (SELECT COUNT(*) AS num
          FROM TABLE x) y
如何视而不见 2024-08-31 19:05:25

只需按 count(*) desc 排序,您就会得到最高的(如果您将其与 limit 1 结合使用)

Just order by count(*) desc and you'll get the highest (if you combine it with limit 1)

七分※倦醒 2024-08-31 19:05:25

这个问题很旧,但在 dba.SE 上的一个新问题中引用了。我觉得还没有提供最好的解决方案。另外,还有新的、更快的选项。

标题中的问题

我可以在 SQL 中执行 max(count(*)) 吗?

,您可以通过在 窗口函数

SELECT m.yr
     , count(*) AS movie_count
     , max(count(*)) OVER () AS max_ct
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC;

db<>fiddle 此处

这是标准 SQL。 Postgres 在 8.4 版本(2009-07-01 发布,在这个问题提出之前)引入了它。其他 RDBMS 应该也有同样的能力。
考虑 SELECT 查询中的事件顺序:

可能的缺点:窗口函数不会聚合行。您将获得聚合步骤后剩余的所有行。在某些查询中很有用,但对于这个查询来说并不理想。

要获取计数最高的一行,您可以使用ORDER BY ct DESC FETCH FIRST 1 ROW ONLY

SELECT c.yr, count(*) AS ct
FROM   actor   a
JOIN   casting c ON c.actorid = a.id
WHERE  a.name = 'John Travolta'
GROUP  BY c.yr
ORDER  BY ct DESC
FETCH  FIRST 1 ROW ONLY;

仅使用基本的 SQL 功能,可用在任何中等程度的 RDBMS 中。大多数流行的 RDBMS 也支持使用 LIMITTOPROWNUMFETCH FIRST 替代语法。请参阅:

或者您可以使用 DISTINCT ON 获得最高计数的每组一行(仅 Postgres ):

实际问题

我需要获取 count(*) 为 max 的行。

计数最高的行可能不止一行。

SQL Server 拥有 WITH TIES 功能已经有一段时间了 - 使用非标准语法:

SELECT TOP 1 WITH TIES
       m.yr, count(*) AS movie_count
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC;  -- can't sort by year for this

db<>fiddle 此处

PostgreSQL 13 添加了 WITH TIES 使用标准 SQL 语法:

SELECT m.yr, count(*) AS movie_count
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC  -- can't sort by year for this
FETCH  FIRST 1 ROWS WITH TIES;

db<>fiddle 此处

这应该是最快的查询。进一步阅读:

要按附加条件对结果进行排序(或者对于旧版本的 Postgres 或其他没有 WITH TIES 的 RDBMS),请使用窗口函数 rank() 在子查询中:

SELECT yr, movie_count
FROM  (
   SELECT m.yr, count(*) AS movie_count
        , rank() OVER (ORDER BY count(*) DESC) AS rnk
   FROM   casting c
   JOIN   movie   m ON c.movieid = m.id
   WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
   GROUP  BY m.yr
   ) sub
WHERE  rnk = 1
ORDER  BY yr;  -- optionally sort by year

现在所有主要的 RDBMS 都支持窗口函数。

This question is old, but was referenced in a new question on dba.SE. I feel the best solutions haven't been provided. Plus, there are new, faster options.

Question in the title

Can I do a max(count(*)) in SQL?

Yes, you can achieve that by nesting an aggregate function in a window function:

SELECT m.yr
     , count(*) AS movie_count
     , max(count(*)) OVER () AS max_ct
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC;

db<>fiddle here

That's standard SQL. Postgres introduced it with version 8.4 (released 2009-07-01, before this question was asked. Other RDBMS should be capable of the same.
Consider the sequence of events in a SELECT query:

Possible downside: window functions do not aggregate rows. You get all rows left after the aggregate step. Useful in some queries, but not ideal for this one.

To get one row with the highest count, you can use ORDER BY ct DESC FETCH FIRST 1 ROW ONLY:

SELECT c.yr, count(*) AS ct
FROM   actor   a
JOIN   casting c ON c.actorid = a.id
WHERE  a.name = 'John Travolta'
GROUP  BY c.yr
ORDER  BY ct DESC
FETCH  FIRST 1 ROW ONLY;

Using only basic SQL features, available in any halfway decent RDBMS. Most popular RDBMS (also) support alternative syntax for FETCH FIRST with LIMIT, TOP or ROWNUM. See:

Or you can get one row per group with the highest count with DISTINCT ON (only Postgres):

Actual Question

I need to get the rows for which count(*) is max.

There may be more than one row with the highest count.

SQL Server has had the feature WITH TIES for some time - with non-standard syntax:

SELECT TOP 1 WITH TIES
       m.yr, count(*) AS movie_count
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC;  -- can't sort by year for this

db<>fiddle here

PostgreSQL 13 added WITH TIES with standard SQL syntax:

SELECT m.yr, count(*) AS movie_count
FROM   casting c
JOIN   movie   m ON c.movieid = m.id
WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
GROUP  BY m.yr
ORDER  BY count(*) DESC  -- can't sort by year for this
FETCH  FIRST 1 ROWS WITH TIES;

db<>fiddle here

This should be the fastest possible query. Further reading:

To sort results by additional criteria (or for older versions of Postgres or other RDBMS without WITH TIES), use the window function rank() in a subquery:

SELECT yr, movie_count
FROM  (
   SELECT m.yr, count(*) AS movie_count
        , rank() OVER (ORDER BY count(*) DESC) AS rnk
   FROM   casting c
   JOIN   movie   m ON c.movieid = m.id
   WHERE  c.actorid = (SELECT id FROM actor WHERE name = 'John Travolta')
   GROUP  BY m.yr
   ) sub
WHERE  rnk = 1
ORDER  BY yr;  -- optionally sort by year

All major RDBMS support window functions nowadays.

老旧海报 2024-08-31 19:05:25
SELECT * from 
(
SELECT yr as YEAR, COUNT(title) as TCOUNT
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr
order by TCOUNT desc
) res
where rownum < 2
SELECT * from 
(
SELECT yr as YEAR, COUNT(title) as TCOUNT
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr
order by TCOUNT desc
) res
where rownum < 2
徒留西风 2024-08-31 19:05:25

它来自这个网站 - http://sqlzoo.net/3.htm
2 种可能的解决方案:

使用 TOP 1 ORDER BY ... DESC:

SELECT yr, COUNT(title) 
FROM actor 
JOIN casting ON actor.id=actorid
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY yr
HAVING count(title)=(SELECT TOP 1 COUNT(title) 
FROM casting 
JOIN movie ON movieid=movie.id 
JOIN actor ON actor.id=actorid
WHERE name='John Travolta'
GROUP BY yr
ORDER BY count(title) desc)

使用 MAX:

SELECT yr, COUNT(title) 
FROM actor  
JOIN casting ON actor.id=actorid    
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY yr
HAVING 
    count(title)=
        (SELECT MAX(A.CNT) 
            FROM (SELECT COUNT(title) AS CNT FROM actor 
                JOIN casting ON actor.id=actorid
                JOIN movie ON movie.id=movieid
                    WHERE name = 'John Travolta'
                    GROUP BY (yr)) AS A)

it's from this site - http://sqlzoo.net/3.htm
2 possible solutions:

with TOP 1 a ORDER BY ... DESC:

SELECT yr, COUNT(title) 
FROM actor 
JOIN casting ON actor.id=actorid
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY yr
HAVING count(title)=(SELECT TOP 1 COUNT(title) 
FROM casting 
JOIN movie ON movieid=movie.id 
JOIN actor ON actor.id=actorid
WHERE name='John Travolta'
GROUP BY yr
ORDER BY count(title) desc)

with MAX:

SELECT yr, COUNT(title) 
FROM actor  
JOIN casting ON actor.id=actorid    
JOIN movie ON movie.id=movieid
WHERE name = 'John Travolta'
GROUP BY yr
HAVING 
    count(title)=
        (SELECT MAX(A.CNT) 
            FROM (SELECT COUNT(title) AS CNT FROM actor 
                JOIN casting ON actor.id=actorid
                JOIN movie ON movie.id=movieid
                    WHERE name = 'John Travolta'
                    GROUP BY (yr)) AS A)
洋洋洒洒 2024-08-31 19:05:25

使用带有限制的 max 只会给出第一行,但如果有两行或更多行具有相同的最大电影数,那么您将丢失一些数据。如果您有可用的 rank() 函数,下面是一种执行此操作的方法。

SELECT
    total_final.yr,
    total_final.num_movies
    FROM
    ( SELECT 
        total.yr, 
        total.num_movies, 
        RANK() OVER (ORDER BY num_movies desc) rnk
        FROM (
               SELECT 
                      m.yr, 
                      COUNT(*) AS num_movies
               FROM MOVIE m
               JOIN CASTING c ON c.movieid = m.id
               JOIN ACTOR a ON a.id = c.actorid
               WHERE a.name = 'John Travolta'
               GROUP BY m.yr
             ) AS total
    ) AS total_final 
   WHERE rnk = 1

Using max with a limit will only give you the first row, but if there are two or more rows with the same number of maximum movies, then you are going to miss some data. Below is a way to do it if you have the rank() function available.

SELECT
    total_final.yr,
    total_final.num_movies
    FROM
    ( SELECT 
        total.yr, 
        total.num_movies, 
        RANK() OVER (ORDER BY num_movies desc) rnk
        FROM (
               SELECT 
                      m.yr, 
                      COUNT(*) AS num_movies
               FROM MOVIE m
               JOIN CASTING c ON c.movieid = m.id
               JOIN ACTOR a ON a.id = c.actorid
               WHERE a.name = 'John Travolta'
               GROUP BY m.yr
             ) AS total
    ) AS total_final 
   WHERE rnk = 1
雄赳赳气昂昂 2024-08-31 19:05:25

下面的代码给你答案。它本质上是通过使用 ALL 来实现 MAX(COUNT(*)) 的。它的优点是它使用非常基本的命令和操作。

SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr HAVING COUNT(title) >= ALL
  (SELECT COUNT(title)
   FROM actor
   JOIN casting ON actor.id = casting.actorid
   JOIN movie ON casting.movieid = movie.id
   WHERE name = 'John Travolta'
   GROUP BY yr)

The following code gives you the answer. It essentially implements MAX(COUNT(*)) by using ALL. It has the advantage that it uses very basic commands and operations.

SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr HAVING COUNT(title) >= ALL
  (SELECT COUNT(title)
   FROM actor
   JOIN casting ON actor.id = casting.actorid
   JOIN movie ON casting.movieid = movie.id
   WHERE name = 'John Travolta'
   GROUP BY yr)
岁月静好 2024-08-31 19:05:25

取决于您使用的数据库...

select yr, count(*) num from ...
order by num desc

我的大部分经验都是在 Sybase 中,它使用一些与其他数据库不同的语法。但在本例中,您要命名计数列,以便可以对其进行降序排序。您可以更进一步,将结果限制为前 10 行(以找到他最繁忙的 10 年)。

Depending on which database you're using...

select yr, count(*) num from ...
order by num desc

Most of my experience is in Sybase, which uses some different syntax than other DBs. But in this case, you're naming your count column, so you can sort it, descending order. You can go a step further, and restrict your results to the first 10 rows (to find his 10 busiest years).

一梦浮鱼 2024-08-31 19:05:25
create view sal as
select yr,count(*) as ct from
(select title,yr from movie m, actor a, casting c
where a.name='JOHN'
and a.id=c.actorid
and c.movieid=m.id)group by yr

-----已创建视图-----

select yr from sal
where ct =(select max(ct) from sal)


2013年

create view sal as
select yr,count(*) as ct from
(select title,yr from movie m, actor a, casting c
where a.name='JOHN'
and a.id=c.actorid
and c.movieid=m.id)group by yr

-----VIEW CREATED-----

select yr from sal
where ct =(select max(ct) from sal)

YR
2013

夜声 2024-08-31 19:05:25

感谢最后一个答案

SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr HAVING COUNT(title) >= ALL
  (SELECT COUNT(title)
   FROM actor
   JOIN casting ON actor.id = casting.actorid
   JOIN movie ON casting.movieid = movie.id
   WHERE name = 'John Travolta'
   GROUP BY yr)

,我遇到了同样的问题:我只需要知道其计数与最大计数匹配的记录(可能是一个或多个记录)。

我必须了解有关“ALL 子句”的更多信息,这正是我正在寻找的那种简单的解决方案。

Thanks to the last answer

SELECT yr, COUNT(title)
FROM actor
JOIN casting ON actor.id = casting.actorid
JOIN movie ON casting.movieid = movie.id
WHERE name = 'John Travolta'
GROUP BY yr HAVING COUNT(title) >= ALL
  (SELECT COUNT(title)
   FROM actor
   JOIN casting ON actor.id = casting.actorid
   JOIN movie ON casting.movieid = movie.id
   WHERE name = 'John Travolta'
   GROUP BY yr)

I had the same problem: I needed to know just the records which their count match the maximus count (it could be one or several records).

I have to learn more about "ALL clause", and this is exactly the kind of simple solution that I was looking for.

热风软妹 2024-08-31 19:05:25
     select top 1 yr,count(*)  from movie
join casting on casting.movieid=movie.id
join actor on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr order by 2 desc
     select top 1 yr,count(*)  from movie
join casting on casting.movieid=movie.id
join actor on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr order by 2 desc
乄_柒ぐ汐 2024-08-31 19:05:25

您可以将 topwith ties 一起使用,其中包含具有最大 count(*) 值的所有年份,如下所示:

select top (1) with ties yr, count(*)
from movie
   join casting 
      on casting.movieid=movie.id
   join actor 
      on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr;
order by count(*) desc

如果最大值为 6,您将获得计数值为 6 的所有年份。

you can use the top along with with ties, which will include all of the years having the maximum count(*) value, something like this:

select top (1) with ties yr, count(*)
from movie
   join casting 
      on casting.movieid=movie.id
   join actor 
      on casting.actorid = actor.id
where actor.name = 'John Travolta'
group by yr;
order by count(*) desc

If the maximum is say 6, you'll get all of the years for which the count value is 6.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文