I have been practising SQL and came across behaviour I couldn't explain. (I am also the one who asked this question: "Over() function does not cover all rows in the table", but this is a different problem.)
Suppose I have a table like this.
MovieRating table:
| movie_id | user_id | rating | created_at |
|----------|---------|--------|------------|
| 1        | 1       | 3      | 2020-01-12 |
| 1        | 2       | 4      | 2020-02-11 |
| 1        | 3       | 2      | 2020-02-12 |
| 1        | 4       | 1      | 2020-01-01 |
| 2        | 1       | 5      | 2020-02-17 |
| 2        | 2       | 2      | 2020-02-01 |
| 2        | 3       | 2      | 2020-03-01 |
| 3        | 1       | 3      | 2020-02-22 |
| 3        | 2       | 4      | 2020-02-25 |
What I am trying to do is rank the movies by rating, for which I have this SQL query:
SELECT
    movie_id,
    rank() over(partition by movie_id order by avg(rating) desc) as rank_rate
FROM
    MovieRating
From my previous question, I learnt that the over() function operates on a window selected by the query, basically the rows this query returns:
SELECT movie_id FROM MovieRating
So I would expect to see at least 3 rows here, for ids 1, 2 and 3.
However, the result is just one row:
{"headers": ["movie_id", "rank_rate"], "values": [[1, 1]]}
Why is that? Is something wrong with my understanding of how the over() function works?
You need an aggregation query, and then use the RANK() window function on its results. See the demo.
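The demo's query is not reproduced on this page, so the following is only a hedged sketch of what "aggregate first, then rank" could look like. It runs the SQL through Python's built-in sqlite3 module instead of MySQL (an assumption made to keep the example self-contained; window functions need SQLite 3.25 or newer). The avg_rating and per_movie names and the subquery layout are illustrative, not the answerer's exact query.

```python
import sqlite3

# Sample rows from the question: (movie_id, user_id, rating, created_at)
ROWS = [
    (1, 1, 3, "2020-01-12"), (1, 2, 4, "2020-02-11"),
    (1, 3, 2, "2020-02-12"), (1, 4, 1, "2020-01-01"),
    (2, 1, 5, "2020-02-17"), (2, 2, 2, "2020-02-01"),
    (2, 3, 2, "2020-03-01"),
    (3, 1, 3, "2020-02-22"), (3, 2, 4, "2020-02-25"),
]

# In-memory SQLite database standing in for the MySQL table (assumption)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MovieRating "
    "(movie_id INTEGER, user_id INTEGER, rating INTEGER, created_at TEXT)"
)
conn.executemany("INSERT INTO MovieRating VALUES (?, ?, ?, ?)", ROWS)

# Step 1 (inner query): one row per movie with its average rating.
# Step 2 (outer query): rank those per-movie rows against each other.
ranked = conn.execute("""
    SELECT movie_id,
           avg_rating,
           RANK() OVER (ORDER BY avg_rating DESC) AS rank_rate
    FROM (
        SELECT movie_id, AVG(rating) AS avg_rating
        FROM MovieRating
        GROUP BY movie_id
    ) AS per_movie
    ORDER BY rank_rate
""").fetchall()

for row in ranked:
    print(row)
```

Note that the sketch drops PARTITION BY movie_id from the window: after grouping there is one row per movie, so partitioning by movie_id would put every row in its own partition and give each of them rank 1.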
Your query is an aggregation query without a GROUP BY clause, which means that it operates on the whole table and not on each movie_id. Such queries return only 1 row with the result of the aggregation.
When you apply the RANK() window function, it operates on that single row and not on the table.
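To make the "one row for the whole table" point concrete, here is a small sketch that compares an aggregate with and without GROUP BY. It again uses Python's sqlite3 module as a stand-in for MySQL, and the table is trimmed to the two columns that matter; both are assumptions made for brevity.

```python
import sqlite3

# (movie_id, rating) pairs from the question's table; other columns omitted
ROWS = [(1, 3), (1, 4), (1, 2), (1, 1), (2, 5), (2, 2), (2, 2), (3, 3), (3, 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MovieRating (movie_id INTEGER, rating INTEGER)")
conn.executemany("INSERT INTO MovieRating VALUES (?, ?)", ROWS)

# No GROUP BY: the whole table is treated as a single group, so one row.
whole_table = conn.execute(
    "SELECT AVG(rating) FROM MovieRating"
).fetchall()

# With GROUP BY: one group, and therefore one row, per movie_id.
per_movie = conn.execute(
    "SELECT movie_id, AVG(rating) FROM MovieRating "
    "GROUP BY movie_id ORDER BY movie_id"
).fetchall()

print(len(whole_table), "row without GROUP BY")
print(len(per_movie), "rows with GROUP BY movie_id")
```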
I think you mean to get one row for each movie, with its average rating.
You should use GROUP BY, not a window function: https://www.db-fiddle.com/f/o9qLFbJEwhaHDWoTS9Qfwp/1
The reason you only got one row is that using an aggregate function like AVG() implicitly turns the query into an aggregating query. The result of such a query is one row per group.
https://dev.mysql.com/doc/refman/8.0/en/aggregate-functions.html says:
In other words, the whole table is considered one "group" if you use AVG() but don't specify a GROUP BY expression. Because the whole table is a single group, the result is one row.
Windows defined by window functions are not the same as groups defined by aggregate functions. Window functions are applied after the rows have been reduced by aggregation. Since there was only one group, and therefore only one row in your result, the rank was 1.
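The difference between windows and groups can be seen by counting rows. The sketch below is an illustration under assumptions (Python's sqlite3 in place of MySQL, SQLite 3.25 or newer, a table trimmed to two columns, and a made-up movie_avg alias): a window function used on its own keeps every input row, while GROUP BY collapses each group into one row.

```python
import sqlite3

# (movie_id, rating) pairs from the question's table; other columns omitted
ROWS = [(1, 3), (1, 4), (1, 2), (1, 1), (2, 5), (2, 2), (2, 2), (3, 3), (3, 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MovieRating (movie_id INTEGER, rating INTEGER)")
conn.executemany("INSERT INTO MovieRating VALUES (?, ?)", ROWS)

# Window only: AVG() is computed per partition, but no rows are collapsed.
windowed = conn.execute("""
    SELECT movie_id,
           rating,
           AVG(rating) OVER (PARTITION BY movie_id) AS movie_avg
    FROM MovieRating
    ORDER BY movie_id
""").fetchall()

# Aggregation: each movie_id group is reduced to a single row.
grouped = conn.execute(
    "SELECT movie_id, AVG(rating) FROM MovieRating GROUP BY movie_id"
).fetchall()

print(len(windowed), "rows from the window query")
print(len(grouped), "rows from the GROUP BY query")
```

Any window function placed in the aggregating query would only see the rows that survive the grouping step, which is why the original query, with its single implicit group, could only ever produce rank 1.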