Proper indexing when using the OR operator

Posted 2024-11-30 01:45:57

I have a query like this:

SELECT fields FROM table
WHERE field1='something' OR field2='something' 
OR field3='something' OR field4='something'

What would be the correct way to index such a table for this query?

A query like this takes an entire second to run! I have one index with all four of those fields in it, so I'd think MySQL would do something like this:

Go through each row in the index thinking this:
Is field1 something? How about field2? field3? field4? Ok, nope, go to the next row.

能否归途做我良人 2024-12-07 01:45:57

You misunderstand how indexes work.

Think of a telephone book (the equivalent of a two-column index on last name first, first name last). If I ask you to find all people in the telephone book whose last name is "Smith," you can benefit from the fact that the names are ordered that way; you can assume that the Smiths are organized together. But if I ask you to find all the people whose first name is "John" you get no benefit from the index. Johns can have any last name, and so they are scattered throughout the book and you end up having to search the hard way, from cover to cover.

Now if I ask you to find all people whose last name is "Smith" OR whose first name is "John", you can find the Smiths easily as before, but that doesn't help you at all to find the Johns. They're still scattered throughout the book and you have to search for them the hard way.

It's the same with multi-column indexes in SQL. The index is sorted by the first column, then sorted by the second column in cases of ties in the first column, then sorted by the third column in cases of ties in both the first two columns, etc. It is not sorted by all columns simultaneously. So your multi-column index doesn't help to make your search terms more efficient, except for the left-most column in the index.
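
To make the phone book analogy concrete, here is a small sketch in MySQL syntax. The table `people`, its columns, and the index name `idx_last_first` are hypothetical, made up purely for illustration. The comments describe the plans one would typically expect, not guaranteed output; the optimizer's choice depends on version and data.

```sql
-- Hypothetical table standing in for the phone book.
CREATE TABLE people (
  id         INT AUTO_INCREMENT PRIMARY KEY,
  last_name  VARCHAR(50) NOT NULL,
  first_name VARCHAR(50) NOT NULL,
  phone      VARCHAR(20) NOT NULL
);

-- Two-column index: sorted by last_name, then by first_name within ties.
CREATE INDEX idx_last_first ON people (last_name, first_name);

-- Condition on the leftmost column: the index can seek to the 'Smith' range.
EXPLAIN SELECT * FROM people WHERE last_name = 'Smith';

-- Leftmost column plus the next one: both narrow the same index range.
EXPLAIN SELECT * FROM people WHERE last_name = 'Smith' AND first_name = 'John';

-- No condition on the leftmost column: the index cannot be used to seek,
-- so expect a scan of the whole table (or of the whole index).
EXPLAIN SELECT * FROM people WHERE first_name = 'John';

-- OR across both columns: the Johns still have to be found the hard way.
EXPLAIN SELECT * FROM people WHERE last_name = 'Smith' OR first_name = 'John';
```

Comparing the `type`, `key`, and `rows` columns of the four plans should show the difference between a range lookup and a scan.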

Back to your original question.

What would be the correct way to index such a table for this query?

Create a separate, single-column index on each column. One of these indexes will be a better choice than the others, based on MySQL's estimation of how many I/O operations the index will incur if it is used.
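
As a minimal sketch of that advice, assuming the table is called `my_table` (a placeholder, since `TABLE` is a reserved word in MySQL) and using made-up index names:

```sql
-- One single-column index per searched column.
-- Table and index names are hypothetical; adjust to your schema.
ALTER TABLE my_table
  ADD INDEX idx_field1 (field1),
  ADD INDEX idx_field2 (field2),
  ADD INDEX idx_field3 (field3),
  ADD INDEX idx_field4 (field4);
```

Adding all four in a single ALTER TABLE statement is a convenience; four separate CREATE INDEX statements would give the same result.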

Modern versions of MySQL also have some smarts about index merging, so the query may use more than one index on a given table and then merge the results. Otherwise MySQL tends to be limited to using one index per table in a given query.
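
Whether index merge is actually used can be checked with EXPLAIN. The sketch below assumes a placeholder table `my_table` with a single-column index on each of the four fields; the index names in the comment are hypothetical.

```sql
EXPLAIN
SELECT * FROM my_table
WHERE field1 = 'something' OR field2 = 'something'
   OR field3 = 'something' OR field4 = 'something';
-- If index merge is chosen, the `type` column typically reads index_merge
-- and `Extra` contains something like:
--   Using union(idx_field1,idx_field2,idx_field3,idx_field4); Using where
-- If `type` is ALL, the optimizer decided a full table scan was cheaper.

-- Index merge can be disabled through optimizer_switch,
-- so it is worth confirming the current setting:
SELECT @@optimizer_switch;
```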

Another trick that a lot of people use successfully is to do a separate query for each of your indexed columns (which should use the respective index) and then UNION the results.

SELECT fields FROM table WHERE field1='something' 
UNION
SELECT fields FROM table WHERE field2='something' 
UNION
SELECT fields FROM table WHERE field3='something' 
UNION
SELECT fields FROM table WHERE field4='something'
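
One thing to keep in mind with this rewrite: UNION removes duplicate rows from the combined result. That is what stops a row matching two conditions from appearing twice, but it also collapses rows that are distinct in the table yet identical in the selected columns, which the OR query would return separately. A sketch that sidesteps this, assuming a placeholder table `my_table` with a hypothetical primary key column `id`:

```sql
-- Selecting the primary key keeps distinct rows distinct, while UNION
-- still removes the same row when it is matched by more than one branch.
SELECT id, field1, field2, field3, field4 FROM my_table WHERE field1 = 'something'
UNION
SELECT id, field1, field2, field3, field4 FROM my_table WHERE field2 = 'something'
UNION
SELECT id, field1, field2, field3, field4 FROM my_table WHERE field3 = 'something'
UNION
SELECT id, field1, field2, field3, field4 FROM my_table WHERE field4 = 'something';
```

UNION ALL would skip the de-duplication step, but then a row matching several conditions would be returned once per matching branch, so it is not a drop-in replacement for the OR query.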

One final observation: if you find yourself searching for the same 'something' across four fields, you should reconsider whether all four fields are actually the same thing, and whether you're guilty of designing a table that violates First Normal Form with repeating groups (http://en.wikipedia.org/wiki/First_normal_form#Repeating_groups). If so, perhaps field1 through field4 belong in a single column in a child table. Then it becomes a lot easier to index and query:

SELECT fields FROM table INNER JOIN child_table ON table.pk = child_table.fk
WHERE child_table.field = 'something'
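
For illustration, the child table might look something like the sketch below. The parent name `my_table`, the column types, and the index name are all assumptions, not taken from the question; the parent is assumed to have an INT primary key named `pk`.

```sql
-- Hypothetical child table: one row per (parent row, value) pair,
-- replacing the repeating columns field1..field4 on the parent.
CREATE TABLE child_table (
  fk    INT NOT NULL,           -- references the parent's primary key
  field VARCHAR(100) NOT NULL,  -- a value formerly stored in field1..field4
  PRIMARY KEY (fk, field),      -- each value appears at most once per parent
  INDEX idx_field (field),      -- the single index the search needs
  FOREIGN KEY (fk) REFERENCES my_table (pk)
);

-- The four-way OR becomes a single indexed equality lookup.
SELECT p.*
FROM my_table AS p
INNER JOIN child_table AS c ON p.pk = c.fk
WHERE c.field = 'something';
```

Because the primary key on (fk, field) allows a given value only once per parent row, the join returns each matching parent row at most once.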


南街女流氓 2024-12-07 01:45:57

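In addition to the previous comments:
Some RDBMSs, such as MySQL and PostgreSQL, can use index merging if the optimizer thinks it is a good idea.
So you can create a separate index for each field, or create some composite indexes, for example on (field1, field2) and (field3, field4). In the end, you should try several different solutions and pick the one with the best EXPLAIN plan.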
