mysql:选择最后 10 条消息,并为每条消息选择最后 3 条回复
为了简单起见,让我们将消息表精简到最小,使用一些示例数据,
message_id reply_to createdate
1 0 123
2 0 124
3 0 123
4 1 154
5 1 165
reply_to 是消息的回复的 message_id,
所以我正在寻找一个 sql 语句/过程/函数/其他表设计,让我选择最后一个10条消息,对于最后3条回复中的每一条,我不介意更改表结构,甚至不介意为最后3条回复保留某种记录,
只需选择最后10条消息
SELECT * FROM message ORDER BY createdate LIMIT 10;
,对于每条消息,回复都是
SELECT * FROM message WHERE reply_to = :message_id: ORDER BY createdate LIMIT 3;
我的尝试到目前为止:
- 消息表上的三重外连接作为回复
- 普通连接,但 mysql 不允许
- 使用 HAVING COUNT(DISTINCTreply_to) <= 3 进行连接限制,但当然 HAVING 最后被评估,
我无法得到其中任何
一个最后一个选项 atm 是有一个单独的表来跟踪每条消息的最后 3 个回复
message_reply: 消息 ID、r_1、r_2、r_3
然后使用触发器更新该表 因此,消息表中的新行(即回复)更新了 message_reply 表
UPDATE message_reply SET r_3 = r_2, r_2 = r_1, r_1 = NEW.reply_to WHERE message_id = NEW.message_id
然后我可以查询这些记录的消息表
有人有更好的建议,甚至是有效的 SQL 语句吗?
谢谢
编辑:
添加了解释结果
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived4> ALL NULL NULL NULL NULL 3
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 10 Using where; Using join buffer
1 PRIMARY r eq_ref PRIMARY,message_id,message_id_2 PRIMARY 4 func 1
4 DERIVED NULL NULL NULL NULL NULL NULL NULL No tables used
5 UNION NULL NULL NULL NULL NULL NULL NULL No tables used
6 UNION NULL NULL NULL NULL NULL NULL NULL No tables used
NULL UNION RESULT <union4,5,6> ALL NULL NULL NULL NULL NULL
2 DERIVED m ALL NULL NULL NULL NULL 299727
3 DEPENDENT SUBQUERY r ref reply_to,reply_to_2 reply_to_2 4 testv4.m.message_id 29973
编辑2:
好吧,我尝试了message_reply表方法,这也是我构建
表的方法:
message_reply: message_id, r_1, r_2, r_3
构建触发器:
DELIMITER |
CREATE TRIGGER i_message AFTER INSERT ON message
FOR EACH ROW BEGIN
IF NEW.reply_to THEN
INSERT INTO message_replies (message_id, r_1) VALUES (NEW.reply_to, NEW.message_id)
ON DUPLICATE KEY UPDATE r_3 = r_2, r_2 = r_1, r_1 = NEW.message_id;
ELSE
INSERT INTO message_replies (message_id) VALUES (NEW.message_id);
END IF;
END;
|
DELIMITER ;
并选择消息:
SELECT m.*,r1.*,r2.*,r3.* FROM message_replies mr
LEFT JOIN message m ON m.message_id = mr.message_id
LEFT JOIN message r1 ON r1.message_id = mr.r_1
LEFT JOIN message r2 ON r2.message_id = mr.r_2
LEFT JOIN message r3 ON r3.message_id = mr.r_3
当然通过触发器对我进行预处理,这是最快的方法。
使用多组 100k 插入件进行测试,以查看触发器的性能影响 与没有 trigger 的情况下处理 100k 行相比,花费了 0.4 秒的时间 插入的总时间约为 12 秒(在 myIsam 表上)
For simplicity lets strip down the messages table to its minimum, with some sample data
message_id reply_to createdate
1 0 123
2 0 124
3 0 123
4 1 154
5 1 165
the reply_to is the message_id wich the message is a reply to
so im looking for a sql-statement/procedure/function/other table design that lets me select the last 10 messages and for each of those the last 3 replies, i dont mind changing the table structure or even keeping some sort of a record for the last 3 replies
just selecting the last 10 messages is
SELECT * FROM message ORDER BY createdate LIMIT 10;
and for each of those messages the replies are
SELECT * FROM message WHERE reply_to = :message_id: ORDER BY createdate LIMIT 3;
my attempts so far are:
- a triple outer join over the message table as replies
- a plain join but mysql doesnt allow limits in joins
- useing HAVING COUNT(DISTINCT reply_to) <= 3, but ofcourse HAVING is evaluated last
i couldnt get either of those working
my last option atm is to have a separate table to track the last 3 replies per message
message_reply:
message_id, r_1, r_2, r_3
and then updateing that table useing triggers
so a new row in the message table wich is a reply updates the message_reply table
UPDATE message_reply SET r_3 = r_2, r_2 = r_1, r_1 = NEW.reply_to WHERE message_id = NEW.message_id
then i could just query the message table for those records
anyone have a better suggestion or even a working SQL statement?
thanks
EDIT:
added EXPLAIN results
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived4> ALL NULL NULL NULL NULL 3
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 10 Using where; Using join buffer
1 PRIMARY r eq_ref PRIMARY,message_id,message_id_2 PRIMARY 4 func 1
4 DERIVED NULL NULL NULL NULL NULL NULL NULL No tables used
5 UNION NULL NULL NULL NULL NULL NULL NULL No tables used
6 UNION NULL NULL NULL NULL NULL NULL NULL No tables used
NULL UNION RESULT <union4,5,6> ALL NULL NULL NULL NULL NULL
2 DERIVED m ALL NULL NULL NULL NULL 299727
3 DEPENDENT SUBQUERY r ref reply_to,reply_to_2 reply_to_2 4 testv4.m.message_id 29973
EDIT 2:
Well i tried the message_reply table method also this is what i did
build the table:
message_reply: message_id, r_1, r_2, r_3
build the trigger:
DELIMITER |
CREATE TRIGGER i_message AFTER INSERT ON message
FOR EACH ROW BEGIN
IF NEW.reply_to THEN
INSERT INTO message_replies (message_id, r_1) VALUES (NEW.reply_to, NEW.message_id)
ON DUPLICATE KEY UPDATE r_3 = r_2, r_2 = r_1, r_1 = NEW.message_id;
ELSE
INSERT INTO message_replies (message_id) VALUES (NEW.message_id);
END IF;
END;
|
DELIMITER ;
and select the messages:
SELECT m.*,r1.*,r2.*,r3.* FROM message_replies mr
LEFT JOIN message m ON m.message_id = mr.message_id
LEFT JOIN message r1 ON r1.message_id = mr.r_1
LEFT JOIN message r2 ON r2.message_id = mr.r_2
LEFT JOIN message r3 ON r3.message_id = mr.r_3
Ofcourse with the trigger preprocessing it for me this is the fastest way.
tested with a few more sets of 100k inserts to see the performance hit for the trigger
it took a .4 sec longer to process the 100k rows as it did without the tirgger
total time to insert was about 12 sec (on myIsam tables)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
一个工作示例:
编辑 - (请参阅早期查询的修订版)
完整表创建和解释计划
注意:“datetable”表仅包含大约 10 年的所有日期。它仅用于生成行。
上面生成了 1m 条消息,以及一些有效的回复。查询:
时间:0.078秒
解释计划
A working example:
EDIT - (see revision for earlier query)
Full table creation and explain plan
Note: The table "datetable" just contains all dates for about 10 years. It is used just to generate rows.
The above generates 1m messages, and some valid replies. The query:
Time: 0.078 sec
Explain plan
我建议您构建额外的表,并使其根据需要执行尽可能多的步骤。有时为了形象化答案,您需要额外的步骤。最后,您可以将 SQL 编译为一个嵌套语句。
I would suggest you build your extra table, and make it work with as many steps as necessary. Sometimes to visualize the answer you need extra steps. At the end, you can compile the SQL into one nested statement.
注意:这个回答为OMG的评论提供了比较有用的信息,所以即使需要删除,也请暂时保留。
我按照你的指示进行操作,天啊,这就是我的想法
https://stackoverflow.com/questions/tagged/greatest-n-per-group+mysql< /a>
你可能误解了由于第一页结果中看起来最相似的 3 个问题(其中 2 个是我的答案),这些问题涉及整个表格的单个维度(每个类别前 n 个维度)。解决方案总是在表中提供 row_number ALL 条记录,按类别排序。
将其与针对问题域
top-n-category -> 的此问题提供的优化答案进行比较top-m-per-category
,您会意识到这个问题是不同的。无需
访问问题并礼貌地告知(如果没有回答)
,因为Note: This answer provides useful information for comparison for OMG's comments, so even if it needs to be deleted, please leave it up for a while.
I followed your instructions OMG, and this is what I came up with from
https://stackoverflow.com/questions/tagged/greatest-n-per-group+mysql
You may have misunderstood the question because of the 3 that looked most similar form the first page of results (2 of which are my answers), the questions deal with a single dimension (top n per category) for the entire table. The solutions offered invariably row_number ALL records in the table ordered by category.
Compare that to the optimized answer provided for this question for the problem domain
top-n-category -> top-m-per-category
and you will realize that this question is a different one.There is no need to
visit the questions and courteously inform if not answer
because