SELECT * FROM table WHERE x IN (...几百个整数...)
表大约有 800 万行。 X 有一个非唯一索引。
显示索引,显示表中键名称 X 上有一个非唯一索引,“seq_in_index”为 1,排序规则 A,基数 7850780,sub_part NULL,packed NULL,index_type BTREE。
不过,此查询可能需要 5 秒才能运行。 整数列表来自另一个系统,我不允许将它们存储在表中,因为它们代表社交网络上的友谊。
有没有比大量 IN 语句更快的方法?
Table has about 8 million rows. There is a non-unique index for X.
Showing indexes, it shows that in the table there is a non-unique index on key name X with "seq_in_index" of 1, collation A, cardinality 7850780, sub_part NULL, packed NULL, index_type BTREE.
Still, this query can take 5 seconds to run. The list of ints comes from another system, and I am not allowed to store them in a table, because they represent friendships on a social network.
Is there a faster way than a massive IN statement?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您可以将 ID 列表转换为临时表(或表变量,如果 MySql 支持的话)并与其连接。
该表的生存时间与查询一样长,因此您实际上并没有在表中存储任何内容。
You can convert your list of IDs into a temp-table (or table-var if MySql supports them) and join with it.
The table would only live as long as the query so you're not actually storing anything in a table.
您可以尝试将它们存储在临时表中。 该表不会永久存储在数据库中,我认为生成的联接(假设您也为临时表建立索引)会更快,因为它将能够并行处理索引而不必进行索引查找对于每个 int IN 子句。 当然,如果 MySQL 知道它将使用索引,它可能会优化 IN 子句并执行相同的操作,因此它实际上可能不会为您带来任何好处。 不过我会尝试一下,看看是否更快。
You could try storing them in a temporary table. This table wouldn't be stored in the database permanently and I think the resulting join (assuming that you index the temporary table as well) would be faster since it would be able to process the indices in parallel and not have to do an index lookup for each int the IN clause. Of course, MySQL may optimize the IN clause and do the same thing if it knows that it will be using an index so it may not actually gain you anything. I would give a try though and see if it is faster.
正如其他人所建议的,临时表是最合适的解决方案。
但请注意,根据基数和临时表/in() 条件中的行数,优化器可能仍会诉诸使用顺序扫描,因为顺序读取可能比大量随机查找快得多。索引。
此时考虑重新设计关系可能是适当的。
As suggested by others, a temporary table is the most appropriate solution.
Be aware though, that depending on cardinality and the number of rows in your temporary table/in() condition the optimizer may still resort to using a sequential scan because of the fact that sequential reads can be a lot faster than lots of random seeks in the index.
At this point it may be appropriate to consider redesigning the relations.