MySQL subquery really slow... Workaround?

Posted on 2024-11-06 23:46:34 · 630 characters · 0 views · 0 comments

I've tested the following seemingly simple query on MySQL 5.0, 5.1, and 5.5 and found it to be extremely slow.

select * from entry where session_id in
    (select session_id from entry where created_at > [some timestamp])

Multiple entries can have the same session ID but different created_at timestamps.
The query is meant to grab every entry whose session_id has at least one entry with a created_at greater than the specified timestamp.
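
To make the intended result concrete, here is a minimal sketch that runs the same IN query on a tiny data set. It uses SQLite through Python's sqlite3 module only to show which rows come back; it says nothing about MySQL performance. The id column, the integer timestamps, and the sample rows are assumptions for illustration and are not part of the original table.

```python
import sqlite3

# Hypothetical schema: only session_id and created_at come from the question.
# The id column and the integer timestamps are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (id INTEGER PRIMARY KEY, session_id TEXT, created_at INTEGER)"
)
conn.executemany(
    "INSERT INTO entry (id, session_id, created_at) VALUES (?, ?, ?)",
    [
        (1, "a", 10),  # old entry, but session "a" also has a newer one
        (2, "a", 50),  # newer than the cutoff
        (3, "b", 20),  # session "b" has nothing newer than the cutoff
        (4, "c", 60),  # newer than the cutoff
    ],
)

cutoff = 40  # stands in for [some timestamp]
rows = conn.execute(
    """
    SELECT id, session_id, created_at
    FROM entry
    WHERE session_id IN (SELECT session_id FROM entry WHERE created_at > ?)
    ORDER BY id
    """,
    (cutoff,),
).fetchall()

# Entries 1, 2 and 4 are returned; entry 3 is dropped with its whole session.
print(rows)
```

Entry 1 is older than the cutoff but is still returned, because another entry in session "a" is newer.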

I've seen others describe MySQL subquery performance problems with similar queries: MySQL treats the subquery as a dependent subquery and does a full table scan on the outer query. The suggested workarounds looked something like this:

select * from entry where session_id in
    (select session_id from
        (select session_id from entry where created_at > [some timestamp])
    as temp)

However, this hack doesn't work for me; it makes the query even slower.

Any ideas on how to rewrite this query?

Comments (3)

隱形的亼 2024-11-13 23:46:34

Depending on your data distribution, use this:

SELECT  e.*
FROM    (
        SELECT  session_id, MAX(created_at)
        FROM    entry
        GROUP BY
                session_id
        HAVING  MAX(created_at) > $mytimestamp
        ) ed
JOIN    entry e
ON      e.session_id = ed.session_id

(create an index on (session_id, created_at)), or this:

SELECT  DISTINCT e.*
FROM    entry ed
JOIN    entry e
ON      e.session_id = ed.session_id
WHERE   ed.created_at > $mytimestamp

(create two separate indexes on created_at and session_id)
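
As a quick sanity check that both rewrites return the same rows as the original IN query, here is a minimal sketch using SQLite through Python's sqlite3 module. It only compares result sets and does not reproduce MySQL's optimizer behaviour. The id column, the index names, the max_created alias, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: only session_id and created_at come from the thread.
    CREATE TABLE entry (id INTEGER PRIMARY KEY, session_id TEXT, created_at INTEGER);

    -- Composite index suggested for the GROUP BY / HAVING form.
    CREATE INDEX ix_entry_session_created ON entry (session_id, created_at);

    -- Two separate indexes suggested for the self-join form.
    CREATE INDEX ix_entry_created ON entry (created_at);
    CREATE INDEX ix_entry_session ON entry (session_id);

    INSERT INTO entry (id, session_id, created_at) VALUES
        (1, 'a', 10), (2, 'a', 50), (3, 'a', 70), (4, 'b', 20), (5, 'c', 60);
    """
)

QUERIES = {
    # The original query from the question.
    "in_subquery": """
        SELECT e.* FROM entry e
        WHERE e.session_id IN (SELECT session_id FROM entry WHERE created_at > :ts)
    """,
    # First rewrite: one row per qualifying session, joined back to entry.
    "group_having": """
        SELECT e.*
        FROM (
            SELECT session_id, MAX(created_at) AS max_created
            FROM entry
            GROUP BY session_id
            HAVING MAX(created_at) > :ts
        ) ed
        JOIN entry e ON e.session_id = ed.session_id
    """,
    # Second rewrite: self-join, de-duplicated with DISTINCT.
    "self_join": """
        SELECT DISTINCT e.*
        FROM entry ed
        JOIN entry e ON e.session_id = ed.session_id
        WHERE ed.created_at > :ts
    """,
}

# Sort each result so the comparison does not depend on row order.
results = {
    name: sorted(conn.execute(sql, {"ts": 40}).fetchall())
    for name, sql in QUERIES.items()
}
for name, rows in results.items():
    print(name, rows)
```

Which form is faster on a real MySQL server depends on the data distribution, as noted above, so it is worth comparing EXPLAIN output for both on your own table.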

疾风者 2024-11-13 23:46:34

I was having a problem with the double subquery trick too.
By the way, I just found that the following worked for me (based on your query):

select * from entry where session_id in
    (select (select session_id from entry where created_at > [some timestamp]))

In my case, the original query could run for hours whether I used a join or the "normal" double subquery trick; with the modified double subquery it took 0 seconds :)

夜吻♂芭芘 2024-11-13 23:46:34

How about:

SELECT DISTINCT e2.*
    FROM entry e1
        INNER JOIN entry e2
            ON e1.session_id = e2.session_id
    WHERE e1.created_at > [some timestamp]

If you don't already have them, indexes on created_at and session_id would probably be helpful as well.
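
The DISTINCT in this join matters whenever a session has more than one entry newer than the timestamp. A minimal sketch of that effect, using SQLite through Python's sqlite3 module purely to count rows; the id column and the sample data are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: only session_id and created_at come from the thread.
    CREATE TABLE entry (id INTEGER PRIMARY KEY, session_id TEXT, created_at INTEGER);

    -- Session 'a' has two entries newer than the cutoff of 40 used below.
    INSERT INTO entry (id, session_id, created_at) VALUES
        (1, 'a', 10), (2, 'a', 50), (3, 'a', 70), (4, 'b', 20);
    """
)

JOIN_SQL = """
    SELECT {distinct} e2.*
    FROM entry e1
    INNER JOIN entry e2 ON e1.session_id = e2.session_id
    WHERE e1.created_at > ?
"""

# Without DISTINCT, every e2 row repeats once per matching e1 row.
without_distinct = conn.execute(JOIN_SQL.format(distinct=""), (40,)).fetchall()
with_distinct = conn.execute(JOIN_SQL.format(distinct="DISTINCT"), (40,)).fetchall()

print(len(without_distinct), len(with_distinct))
```

DISTINCT compares whole rows, so this join only matches the IN query exactly when no two rows in entry are completely identical, which a unique key such as the assumed id column guarantees.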
