mysql中的组内排序
我有一个面板数据集:即 times
、ids
和 values
。我想根据每个日期的价值进行排名。我可以通过运行非常简单地实现排序:
select * from tbl order by date, value
我遇到的问题是,一旦表以这种方式排序,我如何检索每个组的行号(也就是说,对于每个日期,我希望有一个名为排名从 1 到 N)。
示例:
输入:
Date, ID, Value
d1, id1, 2
d1, id2, 1
d2, id1, 10
d2, id2, 11
输出:
Date, ID, Value, Rank
d1, id2, 1, 1
d1, id1, 2, 2
d2, id1, 10, 1
d2, id2, 11, 2
I have a panel data set: that is, times
, ids
, and values
. I would like to do a ranking based on value for each date. I can achieve the sort very simply by running:
select * from tbl order by date, value
The issue I have is once the table is sorted in this way, how do I retrieve the row number of each group (that is, for each date I would like there to be a column called ranking that goes from 1 to N).
Example:
Input:
Date, ID, Value
d1, id1, 2
d1, id2, 1
d2, id1, 10
d2, id2, 11
Output:
Date, ID, Value, Rank
d1, id2, 1, 1
d1, id1, 2, 2
d2, id1, 10, 1
d2, id2, 11, 2
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
如果没有窗口函数,您可以订购 tbl 并使用用户变量自己计算分区(“日期”值)的排名:
更新
那么,该查询在做什么?
我们使用用户变量来“循环”排序的结果集,根据结果集的哪个连续段(在
@partition
)我们在。在查询A中,我们初始化了两个用户变量。在查询B中,我们按照所需的顺序获取表中的记录:首先按日期,然后按值。 A 和 B 一起创建一个派生表,
tbl_ordered
,看起来像这样:记住,我们并不真正关心列< code>dummy.rank 和
dummy.partition
— 它们只是我们初始化变量@rank
和@partition
的意外情况>。在查询C中,我们循环访问派生表的记录。我们所做的或多或少与以下伪代码的作用相同:
最后,查询 D 投影 C 中的所有列,除了包含
@partition
的列(我们将其命名为dummy
并且不需要显示)。Absent window functions, you can order
tbl
and use user variables to compute rank over your partitions ("date" values) yourself:Update
So, what is that query doing?
We are using user variables to "loop" through a sorted result set, incrementing or resetting a counter (
@rank
) depending upon which contiguous segment of the result set (tracked in@partition
) we're in.In query A we initialize two user variables. In query B we get the records of your table in the order we need: first by date and then by value. A and B together make a derived table,
tbl_ordered
, that looks something like this:Remember, we don't really care about the columns
dummy.rank
anddummy.partition
— they're just accidents of how we initialize the variables@rank
and@partition
.In query C we loop through the derived table's records. What we're doing is more-or-less what the following pseudocode does:
Finally, query D projects all columns from C except for the column holding
@partition
(which we nameddummy
and do not need to display).我知道这是一个老问题,但这里有一个简短的答案:
I know this is an old question but here is a shorter answer:
这能解决问题吗?
Would this do the trick?