How to get SQL rows by the max of one column, grouped by another column
I need to read out users from an old database to use for statistics in a new system, but I do not have the original user table. However, there is a statistics table with one row of totals per year, which also contains all the user information I need. In addition, it only contains users who were active, which is what I need.
The table has the following relevant columns: (the statistics columns are not relevant here)
- UserID
- FirstName
- LastName
- Email
- Year
I want UserID to be distinct, so it is the only column that I can have in GROUP BY.
I will run MAX on Year, to get the values from the most recent year.
FirstName, LastName and Email need to come from the same row as MAX(Year). In other words, people may have changed both names and emails over the years, and I only want the latest values, as they are the only ones that are relevant.
My best suggestion for a SQL query goes like this:
SELECT UserID, FirstName, LastName, Email, MAX(Year) AS Year
FROM myTable
GROUP BY UserID
ORDER BY LastName, FirstName
The only problem is that SQL Server 2008 will not let me do anything like that, because every selected column must either be wrapped in an aggregate function like MAX or be part of GROUP BY. The columns FirstName, LastName and Email can't go in GROUP BY, because that would produce too many records. Putting MAX on all of them seems to work, but then each MAX is evaluated independently, so I have no way of knowing which row each value actually comes from. I do not know for sure that it will be a problem, but I don't have time to look through 100 000 rows to check.
So in short, I want the whole row of five columns, where MAX applies to only one column and GROUP BY to another. Does anybody have a good solution, or is it actually safe to use MAX on all non-grouping columns?
Comments (2)
A few answers...
Correlated sub-query...
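As a minimal sketch of the correlated sub-query approach (not the answerer's exact query), the following uses the table and column names from the question and assumes there is at most one row per UserID and Year. The aliases t and t2 are illustrative.

```sql
-- Keep only the rows whose Year equals the latest Year for that same user.
-- Assumption: (UserID, Year) identifies at most one row; otherwise a user
-- with several rows in the latest year is returned several times.
SELECT t.UserID, t.FirstName, t.LastName, t.Email, t.Year
FROM myTable AS t
WHERE t.Year = (
    SELECT MAX(t2.Year)              -- latest year recorded for this user
    FROM myTable AS t2
    WHERE t2.UserID = t.UserID       -- correlation with the outer row
)
ORDER BY t.LastName, t.FirstName;
```

Because the name and email columns are read from the matching row itself, they cannot be mixed across different years the way independent MAX calls can.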
Join on derived aggregate...
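A sketch of the derived-aggregate join, again using the names from the question: the inner query is the GROUP BY the asker already had, reduced to the two columns it can legally return, and the outer join fetches the rest of the row. The aliases latest and MaxYear are illustrative.

```sql
-- Step 1 (derived table): one row per user with that user's latest Year.
-- Step 2 (join): pull the full row that matches both UserID and that Year.
-- Assumption: at most one row per (UserID, Year); ties would be duplicated.
SELECT t.UserID, t.FirstName, t.LastName, t.Email, t.Year
FROM myTable AS t
INNER JOIN (
    SELECT UserID, MAX(Year) AS MaxYear
    FROM myTable
    GROUP BY UserID
) AS latest
    ON latest.UserID = t.UserID
   AND latest.MaxYear = t.Year
ORDER BY t.LastName, t.FirstName;
```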
Ordered CTE using ROW_NUMBER()...
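A sketch of the ROW_NUMBER() approach, which should be available on SQL Server 2005 and later. The CTE name ranked and the column alias rn are illustrative.

```sql
-- Number each user's rows from newest to oldest, then keep row number 1.
WITH ranked AS (
    SELECT UserID, FirstName, LastName, Email, Year,
           ROW_NUMBER() OVER (PARTITION BY UserID
                              ORDER BY Year DESC) AS rn
    FROM myTable
)
SELECT UserID, FirstName, LastName, Email, Year
FROM ranked
WHERE rn = 1                         -- the most recent row per user
ORDER BY LastName, FirstName;
```

Unlike the two join-style queries, this returns exactly one row per user even if a user has several rows in the latest year; which of the tied rows is kept is arbitrary unless a tie-breaking column is added to the ORDER BY inside OVER.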
Do you have only one record per user per year? If so, you can use a good old join:
Sure, you can use constructs from newer SQL versions; I have just gotten used to the older (= more generic) possibilities :).
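The answer's query is not shown here. One join-only form that avoids newer constructs, offered as a sketch and not necessarily what the answerer wrote, is a self anti-join: each row is joined to any newer row for the same user, and only the rows that find no newer match are kept. The aliases t and newer are illustrative, and UserID is assumed to be non-NULL.

```sql
-- A row is the latest for its user when no row with a greater Year exists.
-- Assumption: one row per (UserID, Year), as the answer itself asks.
SELECT t.UserID, t.FirstName, t.LastName, t.Email, t.Year
FROM myTable AS t
LEFT JOIN myTable AS newer
    ON newer.UserID = t.UserID
   AND newer.Year > t.Year
WHERE newer.UserID IS NULL           -- no newer row was found
ORDER BY t.LastName, t.FirstName;
```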