选择相关表中前 N 个结果的项目

发布于 2024-09-08 12:34:22 字数 433 浏览 3 评论 0原文

假设我有一个游戏,其中提出一个问题,人们发布回答并进行评分,前 10 名回答获胜。我有一个存储所有这些信息的 SQL 数据库,因此我可能有诸如用户、问题和响应之类的表。 Responses 表具有外键 user_id 和 Question_id 以及属性 Total_score。

显然,对于特定问题,我可以检索带有顺序和限制的前 10 个响应:

SELECT * FROM Responses WHERE question_id=? ORDER BY total_score DESC LIMIT 10;

我正在寻找一种方法,可以为特定用户确定其所有获胜者响应的列表(在顶部) 10 针对他们的特定问题)。以编程方式单步执行每个响应并查看它是否包含在其问题的前 10 个中是很简单的,但我想对此进行优化,因此我不会执行 N+1 查询,其中 N 是用户已提交的响应数。

Say I have a game where a question is asked, people post responses which are scored, and the top 10 responses win. I have a SQL database that stores all of this information, so I might have tables such as Users, Questions, and Responses. The Responses table has foreign_keys user_id and question_id, and attribute total_score.

Obviously, for a particular Question I can retrieve the top 10 Responses with an order and limit:

SELECT * FROM Responses WHERE question_id=? ORDER BY total_score DESC LIMIT 10;

What I'm looking for is a way I can determine, for a particular User, a list of all their Responses that are winners (in the top 10 for their particular Question). It is simple programmatically to step through each Response and see if it is included in the top 10 for its Question, but I would like to optimize this so I am not doing N+1 queries where N is the number of Responses the User has submitted.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

我是男神闪亮亮 2024-09-15 12:34:22

如果您使用 Oracle、Microsoft SQL Server、DB2 或 PostgreSQL,这些数据库支持窗口函数。将用户的回答加入到同一问题的其他回答中。然后按问题分区,按分数降序排列。使用每个分区中的行号将集合限制为前 10 名中的行号。同时传递给定用户的 user_id,以便您可以从前 10 名中挑选他们,因为您只对给定用户的响应感兴趣。

SELECT *
FROM (
  SELECT r1.user_id AS given_user, r2.*,
    ROW_NUMBER() OVER (PARTITION BY r2.question_id ORDER BY r2.total_score DESC) AS rownum
  FROM Responses r1 JOIN Responses r2 ON r1.question_id = r2.question_id
  WHERE r1.user_id = ?
) t
WHERE rownum <= 10 AND user_id = given_user;

但是,如果您使用 MySQL 或 SQLite 或其他不支持窗口函数的数据库,则可以使用以下不同的解决方案:

查询用户的响应,并使用联接来匹配分数更高(或更早)的相应问题的其他响应平局的情况下PK)。按问题分组,并计算得分较高的回复数量。如果计数小于 10,则用户的回答属于每个问题的前 10 名。

SELECT r1.*
FROM Responses r1
LEFT OUTER JOIN Responses r2 ON r1.question_id = r2.question_id 
  AND (r1.total_score < r2.total_score 
    OR r1.total_score = r2.total_score AND r1.response_id > r2.response_id)
WHERE r1.user_id = ?
GROUP BY r1.question_id
HAVING COUNT(*) < 10;

If you use Oracle, Microsoft SQL Server, DB2, or PostgreSQL, these databases support windowing functions. Join the user's responses to other responses to the same question. Then partition by question and order by score descending. Use the row number within each partition to restrict the set to those in the top 10. Also pass along the user_id of the given user so you can pick them out of the top 10, since you're only interested in the given user's responses.

SELECT *
FROM (
  SELECT r1.user_id AS given_user, r2.*,
    ROW_NUMBER() OVER (PARTITION BY r2.question_id ORDER BY r2.total_score DESC) AS rownum
  FROM Responses r1 JOIN Responses r2 ON r1.question_id = r2.question_id
  WHERE r1.user_id = ?
) t
WHERE rownum <= 10 AND user_id = given_user;

However, if you use MySQL or SQLite or other databases that don't support windowing functions, you can use this different solution:

Query for the user's responses, and use a join to match other responses to the respective questions with greater score (or earlier PK in the case of ties). Group by question, and count the number of responses that have higher score. If the count is fewer than 10, then the user's response is among the top 10 per question.

SELECT r1.*
FROM Responses r1
LEFT OUTER JOIN Responses r2 ON r1.question_id = r2.question_id 
  AND (r1.total_score < r2.total_score 
    OR r1.total_score = r2.total_score AND r1.response_id > r2.response_id)
WHERE r1.user_id = ?
GROUP BY r1.question_id
HAVING COUNT(*) < 10;
久伴你 2024-09-15 12:34:22

尝试嵌入 select 语句。我今天无法访问数据库工具,因此无法确认语法/输出。只需进行适当的更改即可捕获您需要的所有列。您还可以向主查询添加问题并连接响应。

select *
  from users
     , responses
 where users.user_id=responses.user_id
   and responses.response_id in (SELECT z.response_id 
                                   FROM Responses z 
                                  WHERE z.user_id = users.user_id 
                                 ORDER BY total_score DESC 
                                 LIMIT 10)

Try an embedded select statement. I don't have access to a DB tool today so I can't confirm the syntax/output. Just make the appropriate changes to capture all the columns you need. You can also add questions to the main query and join off of responses.

select *
  from users
     , responses
 where users.user_id=responses.user_id
   and responses.response_id in (SELECT z.response_id 
                                   FROM Responses z 
                                  WHERE z.user_id = users.user_id 
                                 ORDER BY total_score DESC 
                                 LIMIT 10)
反差帅 2024-09-15 12:34:22

或者您可以通过添加另一个字段(例如“IsTopPost”)来真正优化它。当有人投票时,您必须更新热门帖子,但您的查询很简单:

SELECT * FROM Responses WHERE user_id=? and IsTopPost = 1

Or you can really optimize it by adding another field like "IsTopPost". You would have to update the top posts when someone votes, but your query would be simple:

SELECT * FROM Responses WHERE user_id=? and IsTopPost = 1
酷炫老祖宗 2024-09-15 12:34:22

我认为这样的事情应该可以解决问题:

SELECT 
    user_id, question_id, response_id 
FROM 
    Responses AS r1 
WHERE 
    user_id = ?
AND
    response_id IN (SELECT response_id 
                    FROM Responses AS r2 
                    WHERE r2.question_id = r1.question_id 
                    ORDER BY total_score DESC LIMIT 10)

实际上,对于每个 Question_id,执行一个子查询来确定该 Question_id 的前 10 个响应。

您可能需要考虑添加一列,将某些响应标记为“获胜者”。这样,您只需选择这些行即可使数据库免于一遍又一遍地计算前 10 行。

I think something like this should do the trick:

SELECT 
    user_id, question_id, response_id 
FROM 
    Responses AS r1 
WHERE 
    user_id = ?
AND
    response_id IN (SELECT response_id 
                    FROM Responses AS r2 
                    WHERE r2.question_id = r1.question_id 
                    ORDER BY total_score DESC LIMIT 10)

Effectively, for each question_id, a subquery is performed which determines the top 10 responses for that question_id.

You may want to consider adding a column which marks certain Responses as 'winners'. That way, you can simply select those rows and save the database from having to calculate the top 10's over and over again.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文