Join single row from a table in MySQL
I have two tables, players and scores.
I want to generate a report that looks something like this:
player  first score  points
foo     2010-05-20   19
bar     2010-04-15   29
baz     2010-02-04   13
Right now, my query looks something like this:
select p.name      player,
       min(s.date) first_score,
       s.points    points
from players p
join scores s on s.player_id = p.id
group by p.name, s.points
I need the s.points that is associated with the row that min(s.date) returns. Is that happening with this query? That is, how can I be certain I'm getting the correct s.points value for the joined row?
Side note: I imagine this is somehow related to MySQL's lack of dense ranking. What's the best workaround here?
This is the greatest-n-per-group problem that comes up frequently on Stack Overflow.
Here's my usual answer:
In other words, given score s, try to find a score s2 for the same player, but with an earlier date. If no earlier score is found, then s is the earliest one.
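The self-join described above can be sketched roughly as follows. This is a minimal illustration, not necessarily the exact query from the answer: the column names (id, player_id, date, points) follow the question, the sample rows are made up, and Python's built-in sqlite3 module is used only so the SQL can be run without a MySQL server. The SQL itself uses nothing SQLite-specific.

```python
import sqlite3

# In-memory database with hypothetical sample data modeled on the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table players (id integer primary key, name text);
    create table scores (
        id integer primary key,
        player_id integer references players(id),
        date text,
        points integer
    );
    insert into players (id, name) values (1, 'foo'), (2, 'bar'), (3, 'baz');
    insert into scores (player_id, date, points) values
        (1, '2010-05-20', 19), (1, '2010-06-01', 40),
        (2, '2010-04-15', 29), (2, '2010-04-20', 5),
        (3, '2010-02-04', 13);
""")

# For each score s, look for a score s2 of the same player with an earlier date.
# The outer join leaves s2 NULL when no such row exists, so the WHERE clause
# keeps only the earliest score per player.
FIRST_SCORE_SQL = """
    select p.name player, s.date first_score, s.points points
    from players p
    join scores s on s.player_id = p.id
    left outer join scores s2
      on s2.player_id = p.id and s2.date < s.date
    where s2.player_id is null
    order by p.name
"""

rows = conn.execute(FIRST_SCORE_SQL).fetchall()
for row in rows:
    print(row)
```

Because date and points are read from the same row s, the points value always belongs to the earliest score, which is what the GROUP BY version in the question cannot guarantee.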
Re your comment about ties: You have to have a policy for which one to use in case of a tie. One possibility is if you use auto-incrementing primary keys, the one with the least value is the earlier one. See the additional term in the outer join below:
Basically you need to add tiebreaker terms until you get down to a column that's guaranteed to be unique, at least for the given player. The primary key of the table is often the best solution, but I've seen cases where another column was suitable.
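A sketch of the tiebreaker idea, under the assumption that scores has an auto-incrementing primary key named id (the name is hypothetical, as is the sample data). It runs the join with and without the extra term on a player who has two scores on the same date, again using sqlite3 only to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table players (id integer primary key, name text);
    create table scores (
        id integer primary key,
        player_id integer,
        date text,
        points integer
    );
    insert into players (id, name) values (1, 'foo');
    -- Two scores share the earliest date, so the date alone cannot pick one.
    insert into scores (id, player_id, date, points) values
        (1, 1, '2010-05-20', 19),
        (2, 1, '2010-05-20', 7),
        (3, 1, '2010-06-01', 40);
""")

QUERY_TEMPLATE = """
    select p.name player, s.date first_score, s.points points
    from players p
    join scores s on s.player_id = p.id
    left outer join scores s2
      on s2.player_id = p.id and {earlier}
    where s2.player_id is null
    order by s.id
"""

# Without a tiebreaker, both rows on the earliest date survive.
date_only = "s2.date < s.date"
# With the tiebreaker, the lower id wins among rows on the same date.
date_then_id = "(s2.date < s.date or (s2.date = s.date and s2.id < s.id))"

tied_rows = conn.execute(QUERY_TEMPLATE.format(earlier=date_only)).fetchall()
first_rows = conn.execute(QUERY_TEMPLATE.format(earlier=date_then_id)).fetchall()

print("date only:", tied_rows)
print("date then id:", first_rows)
```

If id does not reflect insertion order in your schema, substitute whichever column is unique per player and matches your tie policy.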
Regarding the comments I shared with @OMG Ponies, remember that this type of query benefits hugely from the right index.
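The thread does not say which index is meant. One plausible candidate, offered here as an assumption, is a composite index on scores (player_id, date), since both columns appear in the outer join condition. The sketch below creates such an index (the name scores_player_date is made up) and prints SQLite's query plan. In MySQL you would inspect the plan with EXPLAIN instead, and the output will look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table players (id integer primary key, name text);
    create table scores (
        id integer primary key,
        player_id integer,
        date text,
        points integer
    );
    insert into players (id, name) values (1, 'foo'), (2, 'bar');
    insert into scores (player_id, date, points) values
        (1, '2010-05-20', 19), (1, '2010-06-01', 40),
        (2, '2010-04-15', 29);

    -- Assumed index: covers the lookup "same player, earlier date".
    create index scores_player_date on scores (player_id, date);
""")

FIRST_SCORE_SQL = """
    select p.name player, s.date first_score, s.points points
    from players p
    join scores s on s.player_id = p.id
    left outer join scores s2
      on s2.player_id = p.id and s2.date < s.date
    where s2.player_id is null
    order by p.name
"""

index_names = [
    name for (name,) in
    conn.execute("select name from sqlite_master where type = 'index'")
]
rows = conn.execute(FIRST_SCORE_SQL).fetchall()

# Print the plan so you can check whether the optimizer picks up the index.
for step in conn.execute("explain query plan " + FIRST_SCORE_SQL):
    print(step)
print(rows)
```

The index changes only how the rows are found, not which rows come back, so it is safe to compare results before and after adding it.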
Most RDBMSs won't even let you include non-aggregate columns in your SELECT clause when using GROUP BY. In MySQL, unless the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5), you'll end up with values from arbitrary rows for your non-aggregate columns. This is useful if you actually have the same value in a particular column for all the rows in a group. So it's nice that MySQL doesn't restrict us, though it's an important thing to understand.
A whole chapter is devoted to this in SQL Antipatterns.