这是我在 MySQL 中尝试执行的最佳查询吗?
我正在运行许多查询,将不断变化的数据合并到主表中,其中一个查询(如下)似乎运行得很慢。
设置如下:products
表和products_temp
表具有相同的结构。新数据进入 products_temp
表,然后我运行类似于下面的查询,将新数据与主 products
表合并。
INSERT INTO products ( name, brand, price, feeds_id, img_url, referral_url, productid, isbn, ean, upc )
SELECT name, brand, price, feeds_id, img_url, referral_url, productid, isbn, ean, upc
FROM products_temp
WHERE feeds_id = 449
AND productid NOT IN (
SELECT productid
FROM products
WHERE feeds_id = 449
)
这两个表都有 feeds_id
索引,但我感觉这没有任何区别。
例如,products
可能包含超过 350 万行,products_temp
可能包含 50,000 行以合并 products
。
所以我的问题是这需要多长时间?我能多快完成?
I'm running a number of queries that merge constantly-changing data into a master table and one of the queries (below) seems to be running quite slowly.
The set up is as follows: products
table and products_temp
table have identical structures. New data goes into the products_temp
table, then I run queries similar to the one below to merge the new data with the master products
table.
INSERT INTO products ( name, brand, price, feeds_id, img_url, referral_url, productid, isbn, ean, upc )
SELECT name, brand, price, feeds_id, img_url, referral_url, productid, isbn, ean, upc
FROM products_temp
WHERE feeds_id = 449
AND productid NOT IN (
SELECT productid
FROM products
WHERE feeds_id = 449
)
Both of these tables have indexes on the feeds_id
but I have a feeling that isn't making any difference.
As an example products
may contain over 3.5 million rows and products_temp
may contain 50,000 to merge products
.
So my question really is how long should that take? How quick can I make it?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
确实,这种技术称为影子表技巧。
It is, this technique is called Shadow Table trick.
您可以在主表中的
feeds_id
上删除索引,并在 (feeds_id
,productid
) 上添加唯一键。因此,您将能够使用INSERT IGNORE
进行合并。注意索引中字段的顺序 -feeds_id
必须在前面,这样您就可以使用该索引按feeds_id
进行搜索。NOT IN
可能会导致速度变慢。根据括号内的内容,查询可能会陷入“准备”状态。如果您仍然遇到速度缓慢的情况,请使用
EXPLAIN
或分析功能。You may drop index on
feeds_id
and add a unique key on (feeds_id
,productid
) in the main table. Thus, you will be able to useINSERT IGNORE
for merging. Pay attention to the order of fields in index -feeds_id
must be first, so you can perform a search byfeeds_id
using this index.NOT IN
may cause a slowdown. Depending on what's inside the parenthesis, the query may get stuck in the 'preparing' state.If you still experience slowdowns, use
EXPLAIN
or the profiling feature.尝试重构查询并将其设置为 LEFT JOIN 检查右侧的 NULL
并确保您有此索引
Try refactoring the query and set it up as a LEFT JOIN checking for NULL on the right side
also make sure you have this index
你应该尽量避免 WHERE x not in (select xxx)。 mysql 查询优化器在子查询方面存在很多问题,例如会忽略索引。
you should greatly avoid WHERE x not in (select xxx). The mysql query optimizer is very bugged with subqueries and will ignore indexes for example.