这是我在 MySQL 中尝试执行的最佳查询吗?

发布于 2024-09-10 16:59:28 字数 750 浏览 0 评论 0原文

我正在运行许多查询,将不断变化的数据合并到主表中,其中一个查询(如下)似乎运行得很慢。

设置如下:products 表和products_temp 表具有相同的结构。新数据进入 products_temp 表,然后我运行类似于下面的查询,将新数据与主 products 表合并。

INSERT INTO products ( name, brand, price, feeds_id, img_url, referral_url, productid, isbn, ean, upc )
SELECT name, brand, price, feeds_id, img_url, referral_url, productid, isbn, ean, upc
FROM products_temp 
WHERE feeds_id = 449
AND productid NOT IN (
    SELECT productid
    FROM products
    WHERE feeds_id = 449
)

这两个表都有 feeds_id 索引,但我感觉这没有任何区别。

例如,products 可能包含超过 350 万行,products_temp 可能包含 50,000 行以合并 products

所以我的问题是这需要多长时间?我能多快完成?

I'm running a number of queries that merge constantly-changing data into a master table and one of the queries (below) seems to be running quite slowly.

The set up is as follows: products table and products_temp table have identical structures. New data goes into the products_temp table, then I run queries similar to the one below to merge the new data with the master products table.

INSERT INTO products ( name, brand, price, feeds_id, img_url, referral_url, productid, isbn, ean, upc )
SELECT name, brand, price, feeds_id, img_url, referral_url, productid, isbn, ean, upc
FROM products_temp 
WHERE feeds_id = 449
AND productid NOT IN (
    SELECT productid
    FROM products
    WHERE feeds_id = 449
)

Both of these tables have indexes on the feeds_id but I have a feeling that isn't making any difference.

As an example products may contain over 3.5 million rows and products_temp may contain 50,000 to merge products.

So my question really is how long should that take? How quick can I make it?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

猫瑾少女 2024-09-17 16:59:28

确实,这种技术称为影子表技巧

It is, this technique is called Shadow Table trick.

独﹏钓一江月 2024-09-17 16:59:28

您可以在主表中的 feeds_id 上删除索引,并在 (feeds_id, productid) 上添加唯一键。因此,您将能够使用 INSERT IGNORE 进行合并。注意索引中字段的顺序 - feeds_id 必须在前面,这样您就可以使用该索引按 feeds_id 进行搜索。

NOT IN 可能会导致速度变慢。根据括号内的内容,查询可能会陷入“准备”状态。

如果您仍然遇到速度缓慢的情况,请使用 EXPLAIN 或分析功能。

You may drop index on feeds_id and add a unique key on (feeds_id, productid) in the main table. Thus, you will be able to use INSERT IGNORE for merging. Pay attention to the order of fields in index - feeds_id must be first, so you can perform a search by feeds_id using this index.

NOT IN may cause a slowdown. Depending on what's inside the parenthesis, the query may get stuck in the 'preparing' state.

If you still experience slowdowns, use EXPLAIN or the profiling feature.

心安伴我暖 2024-09-17 16:59:28

尝试重构查询并将其设置为 LEFT JOIN 检查右侧的 NULL

INSERT INTO products ( name, brand, price, feeds_id,
img_url, referral_url, productid, isbn, ean, upc )
SELECT A.name, A.brand, A.price,
A.feeds_id, A.img_url, A.referral_url,
A.productid, A.isbn, A.ean, A.upc
FROM
(SELECT * FROM products_temp A WHERE feeds_id = 449) A
LEFT JOIN
(SELECT productid FROM products WHERE feeds_id = 449) B
USING (productid)
WHERE B.productid IS NULL;

并确保您有此索引

ALTER TABLE products_temp ADD INDEX feeds_id (feeds_id);

Try refactoring the query and set it up as a LEFT JOIN checking for NULL on the right side

INSERT INTO products ( name, brand, price, feeds_id,
img_url, referral_url, productid, isbn, ean, upc )
SELECT A.name, A.brand, A.price,
A.feeds_id, A.img_url, A.referral_url,
A.productid, A.isbn, A.ean, A.upc
FROM
(SELECT * FROM products_temp A WHERE feeds_id = 449) A
LEFT JOIN
(SELECT productid FROM products WHERE feeds_id = 449) B
USING (productid)
WHERE B.productid IS NULL;

also make sure you have this index

ALTER TABLE products_temp ADD INDEX feeds_id (feeds_id);
绝情姑娘 2024-09-17 16:59:28

你应该尽量避免 WHERE x not in (select xxx)。 mysql 查询优化器在子查询方面存在很多问题,例如会忽略索引。

you should greatly avoid WHERE x not in (select xxx). The mysql query optimizer is very bugged with subqueries and will ignore indexes for example.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文