T-SQL - 帮助进行多对多的 MAX 操作
这与我之前提出的一个问题密切相关。
我在帖子和位置之间存在多对多关系。
连接表称为PostLocations,除了FK 之外什么也没有。 (LocationId、PostId)
我正在尝试撤回每个位置的置顶帖子。
这是我的查询(在我上一个问题的答案中给出):
SELECT pl.LocationId, p.postid, p.UniqueUri, p.Content, MAX(s.BaseScore) as topscore
from dbo.PostLocations pl
inner join dbo.posts p on pl.PostId = p.PostId
inner join dbo.reviews r on p.postid = r.postid
inner join dbo.scores s on r.scoreid = s.scoreid
group by pl.locationid, p.postid, p.UniqueUri, p.Content
但问题是,因为 PostLocations 可能有这样的条目:
LocationId PostId
1 213213
2 498324
1 230943
所以我上面的查询返回 LocationId 1 两次,因为它在连接表中有两条记录。我只想要每个位置 1 条记录 - 每个位置 ID 的顶部帖子。
我也尝试过这个:
SELECT l.LocationId, p.postid, p.UniqueUri, p.Content, MAX(s.BaseScore) as topscore
from dbo.PostLocations pl
inner join dbo.Locations l on pl.LocationId = l.LocationId
inner join dbo.posts p on pl.PostId = p.PostId
inner join dbo.reviews r on p.postid = r.postid
inner join dbo.scores s on r.scoreid = s.scoreid
group by l.locationid, p.postid, p.UniqueUri, p.Content
相同的结果 - 这就是返回的结果:
LocationId PostId UniqueUri Content TopScore
1 213213 some-post pew pew 2.00
2 498324 anot-post blah bl 4.50
1 230943 sadjsa-as asijd a 3.5
这是应该返回的结果:
LocationId PostId UniqueUri Content TopScore
1 230943 sadjsa-as asijd a 3.5
2 498324 anot-post blah bl 4.50
因为 LocationId 1 有 2 个帖子,但 PostId 230943 得分最高,所以这是返回的结果。
对我所缺少的有什么想法吗?
This is closely related to a previous question i asked.
I have a many-to-many relationship between Post and Location.
The join table is called PostLocations, and has nothing but the FK's. (LocationId, PostId)
I'm trying to pull back the top post for each location.
This is the query i have (which was given in the answer to my previous question):
SELECT pl.LocationId, p.postid, p.UniqueUri, p.Content, MAX(s.BaseScore) as topscore
from dbo.PostLocations pl
inner join dbo.posts p on pl.PostId = p.PostId
inner join dbo.reviews r on p.postid = r.postid
inner join dbo.scores s on r.scoreid = s.scoreid
group by pl.locationid, p.postid, p.UniqueUri, p.Content
But the problem is, because PostLocations could have entries like this:
LocationId PostId
1 213213
2 498324
1 230943
So my above query is returning LocationId 1 twice, because it has two records in the join table. I only want 1 record per location - the top post per locationid.
I've also tried this:
SELECT l.LocationId, p.postid, p.UniqueUri, p.Content, MAX(s.BaseScore) as topscore
from dbo.PostLocations pl
inner join dbo.Locations l on pl.LocationId = l.LocationId
inner join dbo.posts p on pl.PostId = p.PostId
inner join dbo.reviews r on p.postid = r.postid
inner join dbo.scores s on r.scoreid = s.scoreid
group by l.locationid, p.postid, p.UniqueUri, p.Content
Same result - this is what comes back:
LocationId PostId UniqueUri Content TopScore
1 213213 some-post pew pew 2.00
2 498324 anot-post blah bl 4.50
1 230943 sadjsa-as asijd a 3.5
This is what should come back:
LocationId PostId UniqueUri Content TopScore
1 230943 sadjsa-as asijd a 3.5
2 498324 anot-post blah bl 4.50
Because LocationId 1 has 2 posts, but PostId 230943 has the highest score so that is the one returned.
Any ideas on what i'm missing?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
如果您使用的是 SQL Server 2005 或更高版本,您可以执行以下操作:
If you are using SQL Server 2005 or later, you can do something like:
当您根据唯一的 postID 进行分组时,每个帖子都属于自己的一组。
我不确定是否有更好的方法来做到这一点,但是我过去所做的都是沿着这些思路进行的
我们创建一个子查询来查找给定位置的最高分,然后像之前和最后一样进行连接join 确保基础分数是我们之前找到的最高分数。
这意味着,如果给定位置有两个相等的最高分,我们将返回两行,但是在所有其他情况下,我们将每个位置返回一行,可以修改它以在给定两个相等的最高分的情况下选择任意帖子,但我还没有这样做过。
我很想看看是否有任何其他解决方案可以解决这个问题,因为对于额外连接的数量来说,这是一个计算成本相当高的问题解决方案。
编辑 - 回应您的评论,因为 postid 是我们可以依靠它来发布最新帖子的主键。
As you're grouping on postID which is unique every single post falls into its own group of one.
I'm not sure if there is a better way to do this however what I've done in the past runs along these lines
We make a subquery to find the top score for a given location then do our joins as before and on the last join ensure that the basescore is the topscore we found earlier.
This means that if we have two equal top scores for a given location we will return both rows, however in all other cases we will return a single row per location, it can be modified to pick an arbitrary post given two equal top scores but I haven't done so.
I'm interested to see if there are any other solutions to this problem as with the number of extra joins this is quite a computationally expensive solution to the problem.
Edit - in response to your comment as postid is the primary key we can rely on it to be greatest for the latest post.