用户独有的单词 - 不在其中

发布于 2024-10-18 08:24:20 字数 898 浏览 0 评论 0原文

我这里有一个非常简单的表结构。只是与简单 user_id 相关的单词列表。

Word Table: 
    word - varchar(50)
    user_id - integer

我需要找到一个用户使用但其他用户未使用的单词。目前我正在这样做,它在 Postgresql (9.0.3) 上以 200k 字(~.3-.5 秒)工作正常,但在具有相同数据(5+ 分钟)的 MySQL (5.1.54) 上完全失败仍在运行)。所有使用的列都已建立索引。

SELECT  
    word, count(word) as count
FROM 
    words
WHERE         
    word not in (select word from words where user_id <> 99 group by word)
    and user_id = 99
GROUP BY word
ORDER BY count desc LIMIT 20

1)有人知道更好的方法吗?

2)有人知道为什么它在MySql上完全失败吗?

编辑:这修复了 MySQL 上的问题,从 5 分钟以上缩短到 10-20 毫秒 - 感谢 Borealid

SELECT  
    word, count(word) as count
FROM 
    words
WHERE         
    word not in (select distinct word from words where user_id <> 99)
    and user_id = 99
GROUP BY word
ORDER BY count desc LIMIT 20

谢谢。

I have a very simple table structure here. Just a list of words related to a simple user_id.

Word Table: 
    word - varchar(50)
    user_id - integer

I need to find words used by one user that are not used by other users. Currently I'm doing this and it works alright on Postgresql (9.0.3) at 200k words (~.3-.5 seconds) and completely falls over on MySQL (5.1.54) with the same data (5+ mins and it is still running). All used columns are indexed.

SELECT  
    word, count(word) as count
FROM 
    words
WHERE         
    word not in (select word from words where user_id <> 99 group by word)
    and user_id = 99
GROUP BY word
ORDER BY count desc LIMIT 20

1) Anyone know of a better way to do this?

2) Anyone know why it is completely failing on MySql?

EDIT: This fixes the issue on MySQL, from 5 mins+ to 10-20ms - Thanks Borealid

SELECT  
    word, count(word) as count
FROM 
    words
WHERE         
    word not in (select distinct word from words where user_id <> 99)
    and user_id = 99
GROUP BY word
ORDER BY count desc LIMIT 20

Thanks.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

夜唯美灬不弃 2024-10-25 08:24:20

尝试 NOT EXISTS():

SELECT  
    w1.word, 
    COUNT(w1.word) as count
FROM 
    words w1
WHERE         
    NOT EXISTS (
        SELECT 1 
        FROM 
            words w2 
        WHERE 
            w2.user_id <> 99 
        AND 
            w1.word = w2.word 
        )
AND 
    w1.user_id = 99
GROUP BY 
    w1.word
ORDER BY 
    count DESC 
LIMIT 20;

确保您在 user_id 和单词(或组合)上有索引,使用解释来查看查询计划以及最适合您的内容。

======
编辑:还可以尝试使用 IS NULL 的 LEFT JOIN 解决方案:

SELECT
    w1.word,
    COUNT(w1.word) AS count
FROM
    words w1
        LEFT JOIN words w2 ON (w1.word = w2.word AND w1.user_id <> w2.user_id)
WHERE
    w1.user_id = 99
AND
    w2.word IS NULL
GROUP BY
    w1.word
ORDER BY 
    count DESC 
LIMIT 20;

尝试在两列上建立索引:

CREATE INDEX idx_word_user ON words ( word, user_id);

Try NOT EXISTS():

SELECT  
    w1.word, 
    COUNT(w1.word) as count
FROM 
    words w1
WHERE         
    NOT EXISTS (
        SELECT 1 
        FROM 
            words w2 
        WHERE 
            w2.user_id <> 99 
        AND 
            w1.word = w2.word 
        )
AND 
    w1.user_id = 99
GROUP BY 
    w1.word
ORDER BY 
    count DESC 
LIMIT 20;

Make sure you have an index on the user_id and word (or a combination), use explain to see the queryplan and what works best for you.

======
Edit: Also try the LEFT JOIN solution using a IS NULL:

SELECT
    w1.word,
    COUNT(w1.word) AS count
FROM
    words w1
        LEFT JOIN words w2 ON (w1.word = w2.word AND w1.user_id <> w2.user_id)
WHERE
    w1.user_id = 99
AND
    w2.word IS NULL
GROUP BY
    w1.word
ORDER BY 
    count DESC 
LIMIT 20;

Try an index on both columns:

CREATE INDEX idx_word_user ON words ( word, user_id);
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文