Joining to a single row of a table in MySQL

Posted 2024-09-07 06:02:17

I have two tables, players and scores.

I want to generate a report that looks something like this:

player    first score             points
foo       2010-05-20              19
bar       2010-04-15              29
baz       2010-02-04              13

Right now, my query looks something like this:

select p.name        player,
       min(s.date)   first_score,
       s.points      points    
from  players p    
join  scores  s on  s.player_id = p.id    
group by p.name, s.points

I need the s.points that is associated with the row that min(s.date) returns. Is that happening with this query? That is, how can I be certain I'm getting the correct s.points value for the joined row?

Side note: I imagine this is somehow related to MySQL's lack of dense ranking. What's the best workaround here?


Comments (2)

葵雨 2024-09-14 06:02:17

This is the greatest-n-per-group problem that comes up frequently on Stack Overflow.

Here's my usual answer:

select
  p.name        player,
  s.date        first_score,
  s.points      points

from  players p

join  scores  s
  on  s.player_id = p.id

left outer join scores  s2
  on  s2.player_id = p.id
      and s2.date < s.date

where
  s2.player_id is null

;

In other words, given score s, try to find a score s2 for the same player, but with an earlier date. If no earlier score is found, then s is the earliest one.
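To see the anti-join in action, here is a minimal sketch that runs the same query against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL only so the example is self-contained. The table definitions and sample rows are assumptions, since the thread never shows the schema.

```python
import sqlite3

# Hypothetical schema and sample data; the original post does not show the tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE scores  (id INTEGER PRIMARY KEY, player_id INTEGER,
                          date TEXT, points INTEGER);
    INSERT INTO players (id, name) VALUES (1, 'foo'), (2, 'bar');
    INSERT INTO scores (player_id, date, points) VALUES
        (1, '2010-05-20', 19),
        (1, '2010-06-01', 42),
        (2, '2010-04-15', 29),
        (2, '2010-07-09', 7);
""")

# Keep a score row s only when no row s2 exists for the same player
# with an earlier date (the left join finds nothing, so s2.* is NULL).
first_scores = conn.execute("""
    SELECT p.name, s.date, s.points
    FROM players p
    JOIN scores s
      ON s.player_id = p.id
    LEFT OUTER JOIN scores s2
      ON s2.player_id = p.id
     AND s2.date < s.date
    WHERE s2.player_id IS NULL
    ORDER BY p.name
""").fetchall()

print(first_scores)
# [('bar', '2010-04-15', 29), ('foo', '2010-05-20', 19)]
```

Each player appears once, and the points value comes from the same row as the earliest date.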


Re your comment about ties: You have to have a policy for which one to use in case of a tie. One possibility is if you use auto-incrementing primary keys, the one with the least value is the earlier one. See the additional term in the outer join below:

select
  p.name        player,
  s.date        first_score,
  s.points      points

from  players p

join  scores  s
  on  s.player_id = p.id

left outer join scores  s2
  on  s2.player_id = p.id
      and (s2.date < s.date or (s2.date = s.date and s2.id < s.id))

where
  s2.player_id is null

;

Basically you need to add tiebreaker terms until you get down to a column that's guaranteed to be unique, at least for the given player. The primary key of the table is often the best solution, but I've seen cases where another column was suitable.
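The effect of the tiebreaker can be checked with a small sketch, again using SQLite through Python's sqlite3 as a stand-in for MySQL. The schema and the sample rows (two scores for one player on the same date) are assumptions made for illustration.

```python
import sqlite3

# Hypothetical data: player 'foo' has two scores on the same earliest date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE scores  (id INTEGER PRIMARY KEY, player_id INTEGER,
                          date TEXT, points INTEGER);
    INSERT INTO players (id, name) VALUES (1, 'foo');
    INSERT INTO scores (id, player_id, date, points) VALUES
        (1, 1, '2010-05-20', 19),
        (2, 1, '2010-05-20', 23),
        (3, 1, '2010-06-01', 42);
""")

QUERY = """
    SELECT p.name, s.date, s.points
    FROM players p
    JOIN scores s
      ON s.player_id = p.id
    LEFT OUTER JOIN scores s2
      ON s2.player_id = p.id
     AND {condition}
    WHERE s2.player_id IS NULL
    ORDER BY s.id
"""

# Date comparison only: both rows dated 2010-05-20 survive.
date_only = conn.execute(
    QUERY.format(condition="s2.date < s.date")
).fetchall()

# Date comparison plus primary-key tiebreaker: only the lower id survives.
with_tiebreaker = conn.execute(
    QUERY.format(
        condition="(s2.date < s.date OR (s2.date = s.date AND s2.id < s.id))"
    )
).fetchall()

print(date_only)        # [('foo', '2010-05-20', 19), ('foo', '2010-05-20', 23)]
print(with_tiebreaker)  # [('foo', '2010-05-20', 19)]
```

Without the extra term the player is listed twice. With it, the row with the smaller id wins the tie.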

Regarding the comments I shared with @OMG Ponies, remember that this type of query benefits hugely from the right index.
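The answer does not say which index it has in mind. One plausible candidate, offered here as an assumption, is a composite index on the columns the self-join compares: player_id, then date, then id. The sketch below creates it in SQLite through Python's sqlite3; the index name idx_scores_player_date is hypothetical, and the same CREATE INDEX statement is valid in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (id INTEGER PRIMARY KEY, player_id INTEGER,
                         date TEXT, points INTEGER);

    -- Assumed index: covers the join key and both comparison columns.
    CREATE INDEX idx_scores_player_date ON scores (player_id, date, id);
""")

# List the indexes now defined on the scores table.
index_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'scores'"
    )
]
print(index_names)
```

Whether the optimizer actually uses such an index depends on the engine and the data, so it is worth confirming with EXPLAIN on the real tables.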

Most RDBMSs won't even let you include non-aggregate columns in your SELECT clause when using GROUP BY. MySQL allows it when the ONLY_FULL_GROUP_BY SQL mode is disabled, but the values of those non-aggregate columns then come from an arbitrary row in each group. This is only safe if you actually have the same value in that column for all the rows of the group. So it's nice that MySQL doesn't restrict us, but it's an important thing to understand.

A whole chapter is devoted to this in SQL Antipatterns.
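For comparison, the query from the question avoids the arbitrary-row behaviour by putting s.points in the GROUP BY, but that changes the grouping: it returns one row per distinct (player, points) pair rather than one row per player. A minimal sketch, using SQLite through Python's sqlite3 with an assumed schema and sample rows:

```python
import sqlite3

# Hypothetical schema and data; the thread does not show the real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE scores  (id INTEGER PRIMARY KEY, player_id INTEGER,
                          date TEXT, points INTEGER);
    INSERT INTO players (id, name) VALUES (1, 'foo');
    INSERT INTO scores (player_id, date, points) VALUES
        (1, '2010-05-20', 19),
        (1, '2010-06-01', 42);
""")

# The question's query: grouping by name AND points.
rows = conn.execute("""
    SELECT p.name, MIN(s.date) AS first_score, s.points
    FROM players p
    JOIN scores s ON s.player_id = p.id
    GROUP BY p.name, s.points
    ORDER BY p.name, s.points
""").fetchall()

print(rows)
# [('foo', '2010-05-20', 19), ('foo', '2010-06-01', 42)]
```

The player shows up once for every distinct points value, so MIN(s.date) is the earliest date within each (player, points) group, not the player's first score overall.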
