MySQL subquery really slow... Workaround?

Posted on 2024-11-06 23:46:34


I've tested the following seemingly simple query on MySQL 5.0, 5.1, 5.5 and found it to be extremely slow.

select * from entry where session_id in
    (select session_id from entry where created_at > [some timestamp])

Multiple entries can have the same session ID but different created_at timestamps.
The query is meant to grab every entry whose session_id has at least one entry with a created_at greater than the specified timestamp.

I've seen others report MySQL subquery performance issues with similar queries: MySQL treats the subquery as a dependent subquery and does a full table scan on the outer query. Suggested workarounds were something like:

select * from entry where session_id in
    (select session_id from
        (select session_id from entry where created_at > [some timestamp])
    as temp)

However, this hack doesn't work for me and makes it even slower.
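
One way to check whether MySQL really treats the inner query as dependent is to prefix the statement with EXPLAIN and look at the select_type column. The following is only a sketch: the timestamp literal is a placeholder standing in for [some timestamp], and the plan actually reported will depend on the server version and the indexes present.

```sql
-- Placeholder cutoff (an assumption); substitute the real timestamp.
EXPLAIN
SELECT *
FROM entry
WHERE session_id IN
    (SELECT session_id FROM entry WHERE created_at > '2011-01-01 00:00:00');
-- If the row for the inner SELECT shows select_type = DEPENDENT SUBQUERY,
-- the subquery is being re-evaluated for each row of the outer query.
```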

Any ideas on how to rewrite this query?


隱形的亼 2024-11-13 23:46:34

Depending on your data distribution, use either this:

SELECT  e.*
FROM    (
        SELECT  session_id, MAX(created_at)
        FROM    entry
        GROUP BY
                session_id
        HAVING  MAX(created_at) > $mytimestamp
        ) ed
JOIN    entry e
ON      e.session_id = ed.session_id

(create an index on (session_id, created_at)), or this:

SELECT  DISTINCT e.*
FROM    entry ed
JOIN    entry e
ON      e.session_id = ed.session_id
WHERE   ed.created_at > $mytimestamp

(create two separate indexes on created_at and session_id)
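
The index suggestions above could be created roughly as follows. The index names are made up for illustration; only the table and column names come from the queries.

```sql
-- For the GROUP BY / HAVING MAX(created_at) variant: one composite index.
CREATE INDEX idx_entry_session_created ON entry (session_id, created_at);

-- For the self-join variant: two separate single-column indexes.
CREATE INDEX idx_entry_created_at ON entry (created_at);
CREATE INDEX idx_entry_session_id ON entry (session_id);
```

Running EXPLAIN on each query afterwards shows whether the optimizer actually picks these indexes.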


疾风者 2024-11-13 23:46:34

I was having a problem with the double subquery trick too.
By the way, I just found that the following worked for me (based on your query):

select * from entry where session_id in
    (select (select session_id from entry where created_at > [some timestamp]))

In my case the original query could run for hours using a join or the "normal" double subquery trick; with the modified double subquery it took 0 seconds :)


夜吻♂芭芘 2024-11-13 23:46:34

How about:

SELECT DISTINCT e2.*
    FROM entry e1
        INNER JOIN entry e2
            ON e1.session_id = e2.session_id
    WHERE e1.created_at > [some timestamp]

If you don't already have them, indexes on created_at and session_id would probably be helpful as well.
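
To see why the DISTINCT matters in the self-join, here is a small worked example. The entry_demo table and its three columns are a hypothetical stand-in (the real entry table presumably has more columns), and the dates are arbitrary.

```sql
-- Hypothetical minimal schema, for illustration only.
CREATE TABLE entry_demo (
    id         INT PRIMARY KEY,
    session_id INT NOT NULL,
    created_at DATETIME NOT NULL
);

INSERT INTO entry_demo (id, session_id, created_at) VALUES
    (1, 10, '2011-01-01 00:00:00'),
    (2, 10, '2011-01-03 00:00:00'),
    (3, 10, '2011-01-04 00:00:00'),
    (4, 20, '2011-01-01 00:00:00');

-- Session 10 has two entries newer than the cutoff (ids 2 and 3), so
-- without DISTINCT each of its three rows would come back twice,
-- once per matching e1 row.
SELECT DISTINCT e2.*
FROM entry_demo e1
INNER JOIN entry_demo e2
    ON e1.session_id = e2.session_id
WHERE e1.created_at > '2011-01-02 00:00:00'
ORDER BY e2.id;
-- Returns ids 1, 2, 3 (all of session 10); session 20 is excluded.
```

Note that DISTINCT only reproduces the result of the IN query when rows are otherwise unique, for example through a primary key; a table containing fully identical rows would have them collapsed.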
