Mysql限制每组的行数奇怪的结果

发布于 2024-12-02 19:42:13 字数 1398 浏览 1 评论 0原文

我想获取每个 symbolid 的最新 4 个日期。我改编了代码 此处如下:

set @num := 0, @symbolid := '';
select symbolid, date,
@num := if(@symbolid = symbolid, @num + 1, 1) as row_number,
@symbolid := symbolid as dummy
from projections
group by symbolid, date desc
having row_number < 5

并得到以下结果:

symbolid                date   row_number   dummy       
1       '2011-09-01 00:00:00'       1       1
1       '2011-08-31 00:00:00'       3       1
1       '2011-08-30 00:00:00'       5       1
2       '2011-09-01 00:00:00'       1       2
2       '2011-08-31 00:00:00'       3       2
2       '2011-08-30 00:00:00'       5       2
3       '2011-09-01 00:00:00'       1       3
3       '2011-08-31 00:00:00'       3       3
3       '2011-08-30 00:00:00'       5       3
4       '2011-09-01 00:00:00'       1       4
...

明显的问题是,为什么每个symbolid我只得到3行,以及为什么它们被编号1,3,5?一些细节:

  1. 我尝试了强制索引和不强制索引(如此处所示),并且两种方式都得到了相同的结果。
  2. 日期是正确的,即列表正确显示了每个符号的前 3 个日期,但 row_number 值关闭
  3. 当我不使用“having”语句时,行号是正确的,即,最近的日期是 1,下一个最近的日期是 2,等等

显然,row_number 计算字段受到“having”子句的影响,但我不知道如何修复它。

我意识到我可以将“having”更改为“having row_number < 7”(6 与 5 相同),但它非常难看,并且想知道如何使其“行为”。

I wanted to get the latest 4 dates for each symbolid. I adapted the code here as follows:

set @num := 0, @symbolid := '';
select symbolid, date,
@num := if(@symbolid = symbolid, @num + 1, 1) as row_number,
@symbolid := symbolid as dummy
from projections
group by symbolid, date desc
having row_number < 5

and get the following results:

symbolid                date   row_number   dummy       
1       '2011-09-01 00:00:00'       1       1
1       '2011-08-31 00:00:00'       3       1
1       '2011-08-30 00:00:00'       5       1
2       '2011-09-01 00:00:00'       1       2
2       '2011-08-31 00:00:00'       3       2
2       '2011-08-30 00:00:00'       5       2
3       '2011-09-01 00:00:00'       1       3
3       '2011-08-31 00:00:00'       3       3
3       '2011-08-30 00:00:00'       5       3
4       '2011-09-01 00:00:00'       1       4
...

The obvious question is, why did I only get 3 rows per symbolid, and why are they numbered 1,3,5? A few details:

  1. I tried both forcing an index and not (as seen here), and got the same results both ways.
  2. The dates are correct, i.e., the listing correctly shows the top 3 dates per symbolid, but the row_number value is off
  3. When I don't use the "having" statement, the row numbers are correct, i.e., the most recent date is 1, the next most recent is 2, etc

Obviously the row_number computed field is being affected by the "having" clause, but I don't know how to fix it.

I realize that I could just change the "having" to "having row_number < 7" (6 gives the same as 5), but it's very ugly and would like to know what to do to make it "behave".

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

两仪 2024-12-09 19:42:13

我不是 100% 确定它为什么会这样(也许是因为逻辑上 SELECT 是在 ORDER BY 之前处理的),但它应该按预期工作:

SELECT * 
FROM 
(
    select symbolid, date,
    @num := if(@symbolid = symbolid, @num + 1, 1) as row_number,
    @symbolid := symbolid as dummy
    from projections
    INNER JOIN (SELECT @symbolid:=0)c
    INNER JOIN (SELECT @num:=0)d
    group by symbolid, date desc

) a
WHERE row_number < 5

I'm not 100% sure why it behaves this way (maybe it's because logically SELECT is processed prior to ORDER BY), but it should work as expected:

SELECT * 
FROM 
(
    select symbolid, date,
    @num := if(@symbolid = symbolid, @num + 1, 1) as row_number,
    @symbolid := symbolid as dummy
    from projections
    INNER JOIN (SELECT @symbolid:=0)c
    INNER JOIN (SELECT @num:=0)d
    group by symbolid, date desc

) a
WHERE row_number < 5
又爬满兰若 2024-12-09 19:42:13

用户定义的变量不能很好地工作,(请参阅此处)

作为一般规则,您永远不应该为用户变量赋值并在同一语句中读取该值。您可能会得到预期的结果,但这并不能保证。涉及用户变量的表达式的求值顺序是未定义的,并且可能会根据给定语句中包含的元素而改变;此外,不保证 MySQL 服务器版本之间的顺序相同。在 SELECT @a, @a:=@a+1, ... 中,您可能认为 MySQL 会先计算 @a,然后再进行赋值。但是,更改语句(例如,通过添加 GROUP BY、HAVING 或 ORDER BY 子句)可能会导致 MySQL 选择具有不同评估顺序的执行计划。

这是我的建议

select symbolid, 
substring_index(group_concat(date order by date desc), ',', 4) as last_4_dates
from projections
group by symbolid

这种方法的缺点是它会将日期分组折叠,
并且需要先爆炸才能真正使用它。

The user defined variables does not work well, (refer here)

As a general rule, you should never assign a value to a user variable and read the value within the same statement. You might get the results you expect, but this is not guaranteed. The order of evaluation for expressions involving user variables is undefined and may change based on the elements contained within a given statement; in addition, this order is not guaranteed to be the same between releases of the MySQL Server. In SELECT @a, @a:=@a+1, ..., you might think that MySQL will evaluate @a first and then do an assignment second. However, changing the statement (for example, by adding a GROUP BY, HAVING, or ORDER BY clause) may cause MySQL to select an execution plan with a different order of evaluation.

Here is my proposal

select symbolid, 
substring_index(group_concat(date order by date desc), ',', 4) as last_4_dates
from projections
group by symbolid

The drawback of this approach is it will group collapse the date,
and you need to explode before you can actually use it.

纸伞微斜 2024-12-09 19:42:13

最终代码:

set @num := 0, @symbolid := '';
select d.* from
(
select symbolid, date,
  @num := if(@symbolid = symbolid, @num + 1, 1) as row_number,
  @symbolid := symbolid as dummy
from projections
order by symbolid, date desc
 ) d
where d.row_number < 5

Final code:

set @num := 0, @symbolid := '';
select d.* from
(
select symbolid, date,
  @num := if(@symbolid = symbolid, @num + 1, 1) as row_number,
  @symbolid := symbolid as dummy
from projections
order by symbolid, date desc
 ) d
where d.row_number < 5
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文