T-SQL - 帮助进行多对多的 MAX 操作

发布于 2024-10-08 05:36:35 字数 1813 浏览 0 评论 0原文

这与我之前提出的一个问题密切相关。

我在帖子位置之间存在多对多关系。

连接表称为PostLocations,除了FK 之外什么也没有。 (LocationId、PostId)

我正在尝试撤回每个位置的置顶帖子

这是我的查询(在我上一个问题的答案中给出):

SELECT pl.LocationId, p.postid, p.UniqueUri, p.Content, MAX(s.BaseScore) as topscore
from dbo.PostLocations pl
inner join dbo.posts p on pl.PostId = p.PostId
inner join dbo.reviews r on p.postid = r.postid
inner join dbo.scores s on r.scoreid = s.scoreid
group by pl.locationid, p.postid, p.UniqueUri, p.Content

但问题是,因为 PostLocations 可能有这样的条目:

LocationId   PostId
1            213213
2            498324
1            230943

所以我上面的查询返回 LocationId 1 两次,因为它在连接表中有两条记录。我只想要每个位置 1 条记录 - 每个位置 ID 的顶部帖子。

我也尝试过这个:

SELECT l.LocationId, p.postid, p.UniqueUri, p.Content, MAX(s.BaseScore) as topscore
from dbo.PostLocations pl
inner join dbo.Locations l on pl.LocationId = l.LocationId
inner join dbo.posts p on pl.PostId = p.PostId
inner join dbo.reviews r on p.postid = r.postid
inner join dbo.scores s on r.scoreid = s.scoreid
group by l.locationid, p.postid, p.UniqueUri, p.Content

相同的结果 - 这就是返回的结果:

LocationId   PostId   UniqueUri   Content   TopScore
1            213213   some-post   pew pew   2.00
2            498324   anot-post   blah bl   4.50
1            230943   sadjsa-as   asijd a   3.5

这是应该返回的结果:

LocationId   PostId   UniqueUri   Content   TopScore
1            230943   sadjsa-as   asijd a   3.5
2            498324   anot-post   blah bl   4.50

因为 LocationId 1 有 2 个帖子,但 PostId 230943 得分最高,所以这是返回的结果。

对我所缺少的有什么想法吗?

This is closely related to a previous question i asked.

I have a many-to-many relationship between Post and Location.

The join table is called PostLocations, and has nothing but the FK's. (LocationId, PostId)

I'm trying to pull back the top post for each location.

This is the query i have (which was given in the answer to my previous question):

SELECT pl.LocationId, p.postid, p.UniqueUri, p.Content, MAX(s.BaseScore) as topscore
from dbo.PostLocations pl
inner join dbo.posts p on pl.PostId = p.PostId
inner join dbo.reviews r on p.postid = r.postid
inner join dbo.scores s on r.scoreid = s.scoreid
group by pl.locationid, p.postid, p.UniqueUri, p.Content

But the problem is, because PostLocations could have entries like this:

LocationId   PostId
1            213213
2            498324
1            230943

So my above query is returning LocationId 1 twice, because it has two records in the join table. I only want 1 record per location - the top post per locationid.

I've also tried this:

SELECT l.LocationId, p.postid, p.UniqueUri, p.Content, MAX(s.BaseScore) as topscore
from dbo.PostLocations pl
inner join dbo.Locations l on pl.LocationId = l.LocationId
inner join dbo.posts p on pl.PostId = p.PostId
inner join dbo.reviews r on p.postid = r.postid
inner join dbo.scores s on r.scoreid = s.scoreid
group by l.locationid, p.postid, p.UniqueUri, p.Content

Same result - this is what comes back:

LocationId   PostId   UniqueUri   Content   TopScore
1            213213   some-post   pew pew   2.00
2            498324   anot-post   blah bl   4.50
1            230943   sadjsa-as   asijd a   3.5

This is what should come back:

LocationId   PostId   UniqueUri   Content   TopScore
1            230943   sadjsa-as   asijd a   3.5
2            498324   anot-post   blah bl   4.50

Because LocationId 1 has 2 posts, but PostId 230943 has the highest score so that is the one returned.

Any ideas on what i'm missing?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

美男兮 2024-10-15 05:36:35

如果您使用的是 SQL Server 2005 或更高版本,您可以执行以下操作:

With RankedLocations As
    (
    Select PL.LocationId
        , S.BaseScore
        , P.PostID
        , P.UniqueUri
        , P.Content
        , Row_Number() Over( Partition By PL.LocationId Order By S.BaseScore Desc ) As ScoreRank
    From dbo.PostLocations As PL
        Join dbo.Posts As P
            On P.PostId = PL.PostId
        Join dbo.Reviews As R
            On R.PostId = P.PostId
        Join dbo.Scores As S
            On S.ScoreId = R.ScoreId
    )
Select LocationId, BaseScore, PostID, UniqueUri, Content
From RankedLocations
Where ScoreRank = 1

If you are using SQL Server 2005 or later, you can do something like:

With RankedLocations As
    (
    Select PL.LocationId
        , S.BaseScore
        , P.PostID
        , P.UniqueUri
        , P.Content
        , Row_Number() Over( Partition By PL.LocationId Order By S.BaseScore Desc ) As ScoreRank
    From dbo.PostLocations As PL
        Join dbo.Posts As P
            On P.PostId = PL.PostId
        Join dbo.Reviews As R
            On R.PostId = P.PostId
        Join dbo.Scores As S
            On S.ScoreId = R.ScoreId
    )
Select LocationId, BaseScore, PostID, UniqueUri, Content
From RankedLocations
Where ScoreRank = 1
紫瑟鸿黎 2024-10-15 05:36:35

当您根据唯一的 postID 进行分组时,每个帖子都属于自己的一组。

我不确定是否有更好的方法来做到这一点,但是我过去所做的都是沿着这些思路进行的

Select l.LocationId, p.postid, p.UniqueUri, p.Content, s.basescore as topscore
from
    dbo.Locations l inner join
    (select 
        pl.locationid, max(s.BaseScore) as topscore
    from
        dbo.postlocations pl 
        inner join dbo.posts p on pl.PostId = p.PostId
        inner join dbo.reviews r on p.postid = r.postid
        inner join dbo.scores s on r.scoreid = s.scoreid
    group by
        pl.locationid) as topPost on l.locationid = topPost.locationid
    inner join dbo.postlocations pl on pl.locationid = l.locationid
    inner join dbo.posts p on pl.PostId = p.PostId
    inner join dbo.reviews r on p.postid = r.postid
    inner join dbo.scores s on r.scoreid = s.scoreid and s.basescore = toppost.topscore

我们创建一个子查询来查找给定位置的最高分,然后像之前和最后一样进行连接join 确保基础分数是我们之前找到的最高分数。

这意味着,如果给定位置有两个相等的最高分,我们将返回两行,但是在所有其他情况下,我们将每个位置返回一行,可以修改它以在给定两个相等的最高分的情况下选择任意帖子,但我还没有这样做过。

我很想看看是否有任何其他解决方案可以解决这个问题,因为对于额外连接的数量来说,这是一个计算成本相当高的问题解决方案。

编辑 - 回应您的评论,因为 postid 是我们可以依靠它来发布最新帖子的主键。

Select l.LocationId, p.postid, p.UniqueUri, p.Content, bar.basescore as topscore
from
    dbo.Locations l inner join
    (select 
        l.LocationId, max(p.postid) as postid ,max(s.basescore) as basescore
    from
        (select 
            pl.locationid, max(s.BaseScore) as topscore
        from
            dbo.postlocations pl 
            inner join dbo.posts p on pl.PostId = p.PostId
            inner join dbo.reviews r on p.postid = r.postid
            inner join dbo.scores s on r.scoreid = s.scoreid

        group by
            pl.locationid) as topPost on l.locationid = topPost.locationid
        inner join dbo.postlocations pl on pl.locationid = l.locationid
        inner join dbo.posts p on pl.PostId = p.PostId
        inner join dbo.reviews r on p.postid = r.postid
        inner join dbo.scores s on r.scoreid = s.scoreid and s.basescore = toppost.topscore
    group by l.locationid) as bar on l.locationid = bar.locationid
    inner join posts p on bar.postid = p.postid

As you're grouping on postID which is unique every single post falls into its own group of one.

I'm not sure if there is a better way to do this however what I've done in the past runs along these lines

Select l.LocationId, p.postid, p.UniqueUri, p.Content, s.basescore as topscore
from
    dbo.Locations l inner join
    (select 
        pl.locationid, max(s.BaseScore) as topscore
    from
        dbo.postlocations pl 
        inner join dbo.posts p on pl.PostId = p.PostId
        inner join dbo.reviews r on p.postid = r.postid
        inner join dbo.scores s on r.scoreid = s.scoreid
    group by
        pl.locationid) as topPost on l.locationid = topPost.locationid
    inner join dbo.postlocations pl on pl.locationid = l.locationid
    inner join dbo.posts p on pl.PostId = p.PostId
    inner join dbo.reviews r on p.postid = r.postid
    inner join dbo.scores s on r.scoreid = s.scoreid and s.basescore = toppost.topscore

We make a subquery to find the top score for a given location then do our joins as before and on the last join ensure that the basescore is the topscore we found earlier.

This means that if we have two equal top scores for a given location we will return both rows, however in all other cases we will return a single row per location, it can be modified to pick an arbitrary post given two equal top scores but I haven't done so.

I'm interested to see if there are any other solutions to this problem as with the number of extra joins this is quite a computationally expensive solution to the problem.

Edit - in response to your comment as postid is the primary key we can rely on it to be greatest for the latest post.

Select l.LocationId, p.postid, p.UniqueUri, p.Content, bar.basescore as topscore
from
    dbo.Locations l inner join
    (select 
        l.LocationId, max(p.postid) as postid ,max(s.basescore) as basescore
    from
        (select 
            pl.locationid, max(s.BaseScore) as topscore
        from
            dbo.postlocations pl 
            inner join dbo.posts p on pl.PostId = p.PostId
            inner join dbo.reviews r on p.postid = r.postid
            inner join dbo.scores s on r.scoreid = s.scoreid

        group by
            pl.locationid) as topPost on l.locationid = topPost.locationid
        inner join dbo.postlocations pl on pl.locationid = l.locationid
        inner join dbo.posts p on pl.PostId = p.PostId
        inner join dbo.reviews r on p.postid = r.postid
        inner join dbo.scores s on r.scoreid = s.scoreid and s.basescore = toppost.topscore
    group by l.locationid) as bar on l.locationid = bar.locationid
    inner join posts p on bar.postid = p.postid
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文