UTF-8 vs Latin1 in MySQL: indexes not used with UTF-8
I have tried creating my MySQL tables with both the UTF-8 and Latin1 character sets. With Latin1 my indexes are used; with UTF-8 they are not used when selecting or limiting records. Is there something I am missing about character sets that causes this?
Cheers
Ke
Comments (3)
Indexes can be used only when the expression's collation matches that of the indexed column.
If the expression's COERCIBILITY is lower than that of the column (which is 2), the column is converted to the expression's collation, and the index is not used.
Normally, literals have a COERCIBILITY of 4 and user variables a COERCIBILITY of 3, so this should not be a problem.
However, if you mix different collations in a JOIN or UNION, the conversion order is not guaranteed. In this case you should give an explicit collation to the column you are converting (most probably, you want to convert latin1 to UTF8), and it should be the collation of the column you are converting to.
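The advice about giving the converted column an explicit collation can be sketched roughly as follows. This is only an illustration, not the code from the answer: the tables utf_table and latin_table, their columns, and the utf8_general_ci collation of the UTF-8 column are all hypothetical.

```sql
-- Hypothetical schema: utf_table.utf_column is utf8 (utf8_general_ci, indexed),
-- latin_table.latin_column is latin1.

-- Inspect coercibility: a column reports 2, a string literal reports 4.
SELECT COERCIBILITY(utf_column) FROM utf_table LIMIT 1;
SELECT COERCIBILITY('abc');

-- Convert the latin1 side explicitly and give it the collation of the
-- utf8 column, so the indexed utf8 column is compared as stored.
EXPLAIN
SELECT  *
FROM    latin_table
JOIN    utf_table
ON      utf_table.utf_column =
        CONVERT(latin_table.latin_column USING utf8) COLLATE utf8_general_ci;
```

Running the query under EXPLAIN is one way to check whether the index on utf_column is actually chosen. On recent servers utf8mb4 is the more common choice than utf8, in which case the character set and collation names would change accordingly.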
I understand this now: the tables I was joining did not use the same character set.
Once I changed them, the indexes worked fine.
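The post does not say how the tables were changed. One possible way to bring a table onto the same character set as the one it is joined to is sketched below, with latin_table, utf8, and utf8_general_ci as assumed names; converting a table rewrites its data, so it is worth trying on a copy first.

```sql
-- Show the current collation of every column (see the Collation column).
SHOW FULL COLUMNS FROM latin_table;

-- Convert all character columns of the hypothetical table to utf8,
-- matching the (assumed) collation of the table it is joined with.
ALTER TABLE latin_table
    CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Confirm the columns now report the new collation.
SHOW FULL COLUMNS FROM latin_table;
```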
The indexes themselves are stored in the same encoding as the columns they index. Comparing a UTF-8 value to a latin1 value cannot use the index, because both would first have to be converted to the same encoding: index lookups work at the byte level, and ß in latin1 has a different byte sequence than in UTF-8.
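The byte-level difference for ß can be checked directly in MySQL. In this small sketch the character is written as a hex literal with a character set introducer, so the result does not depend on the client's connection encoding.

```sql
-- X'C39F' is the two-byte UTF-8 encoding of 'ß'.
SELECT HEX(_utf8 X'C39F')                       AS utf8_bytes,   -- C39F
       HEX(CONVERT(_utf8 X'C39F' USING latin1)) AS latin1_bytes; -- DF
```

The same character is stored as two bytes in UTF-8 and one byte in latin1, which is why an index built on one encoding cannot be searched with a value in the other without a conversion.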