选择 X 组中最大的 N 条记录

发布于 2024-12-07 13:17:59 字数 982 浏览 0 评论 0原文

因此,我一直在浏览“best-n-per-group”标签下的所有问题,要么我不明白我正在阅读的内容,要么到目前为止没有任何内容符合我的需求。 这个链接也提供了很多有用的信息,但仍然没有答案。

所以我有一个包含以下字段的表:

  • id (unique int)
  • user_id (int)
  • category (varchar)
  • score (int)
  • interest (int)

我相信我的问题偏离了常见的每组最大 n 问题,因为我不需要每个组都有最大的N。我需要 X 组的最大 N 条记录。

因此,我需要获得兴趣最高的 X 个类别(简单,按类别分组、按兴趣排序、限制 X)。然后我需要获取每个类别得分最高的 N 条记录。

我的查询将如何查找类似的内容?

抱歉,如果这确实与其他每组最伟大的问题重复,我只是不明白它们是如何工作的。如果是这样,请随意关闭此内容。

更新:

根据 @tehshrike 的敦促,这里有更多信息。

事实上,我的用户对许多不同的类别都有兴趣。我想生成原始用户感兴趣的类别中具有最高分数的用户列表。

因此,可能的情况是我正在寻找的是:

1 最感兴趣的 4 个类别得分最高的 15 位用户。


用户 strong>特定 user_id 查找顶部X 个类别(对该特定用户最感兴趣)并针对这些类别(X 个类别)查找前 N 个用户(对这些类别得分最高)。

So I've been running through all the questions under the greatest-n-per-group tag, and either I don't understand what I'm reading, or nothing has fit my needs so far. This link has also provided a lot of useful information, but still no answer.

So I've got a table with the following fields:

  • id (unique int)
  • user_id (int)
  • category (varchar)
  • score (int)
  • interest (int)

I believe my problem strays from the common greatest-n-per-group question, in that I don't need the greatest N for every group. I need the greatest N records for X groups.

So, I need to get X categories with the highest interest (simple, GROUP BY category ORDER BY interest, LIMIT X). And then I need to get N records with the highest score for each of those categories.

How would my query look for something like this?

Sorry if this is really a duplicate of every other greatest-n-per-group question, and I just don't understand how they work. Feel free to close this if so.

Update:

Per @tehshrike's prodding, here's a bit more information.

Really what's happening is that my user's have interest in lots of different categories. I would like to generate a list of users with the highest score in the categories that the original user is interested in.

So one possible thing that I'm looing for is:

The 15 users with the highest score in the 4 categories that user 1 is most interested in.


For a specific user_id find Top X categories (with highest interest for that specific user) and for those (X categories) find Top N users (with highest score for those categories).

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

够运 2024-12-14 13:17:59
SELECT catX.category
       catX.interest
       t1.user_id
       t1.score
FROM 
    ( SELECT category 
           , interest 
      FROM tableX 
      WHERE user_id = @user_id_we_are_interested_in     --- specific user 
      ORDER BY interest DESC
      LIMIT @X                         --- top @X categories per specific user 
    ) AS catX 
  JOIN 
    tableX AS t1 
      ON t1.category = catX.category 
  LEFT JOIN 
    tableX AS t2 
      ON  t2.category = t1.category 
      AND t2.score > t1.score 
  GROUP BY t1.category
         , t1.user_id
  HAVING COUNT(t2.score) < @N                      --- top @N users per category 
  ORDER BY catX.interest DESC 
         , t1.score DESC 
SELECT catX.category
       catX.interest
       t1.user_id
       t1.score
FROM 
    ( SELECT category 
           , interest 
      FROM tableX 
      WHERE user_id = @user_id_we_are_interested_in     --- specific user 
      ORDER BY interest DESC
      LIMIT @X                         --- top @X categories per specific user 
    ) AS catX 
  JOIN 
    tableX AS t1 
      ON t1.category = catX.category 
  LEFT JOIN 
    tableX AS t2 
      ON  t2.category = t1.category 
      AND t2.score > t1.score 
  GROUP BY t1.category
         , t1.user_id
  HAVING COUNT(t2.score) < @N                      --- top @N users per category 
  ORDER BY catX.interest DESC 
         , t1.score DESC 
守不住的情 2024-12-14 13:17:59

您可能想要编写 groupwise-max 查询,然后通过加入类似这样的内容来限制它:(

JOIN
(
  SELECT category
  FROM your_table
  ORDER BY interest
  LIMIT 10
) AS just_get_these_categories ON just_get_these_categories.category = your_table.category

假设类别是表中的候选键)

编辑:根据进一步的问题,它看起来像您想要的是不可能的。 (请参阅:聊天日志

You probably want to write your groupwise-max query, and then limit it down by JOINing on something like this:

JOIN
(
  SELECT category
  FROM your_table
  ORDER BY interest
  LIMIT 10
) AS just_get_these_categories ON just_get_these_categories.category = your_table.category

(This is assuming that category is a candidate key in your table)

Edit: based on further questions, it looks like what you want is not possible. (See: chatlog)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文