Why did my query time nearly double when I changed the field to utf8?

Posted on 2024-07-12 01:28:21

I originally had my field set to latin1_swedish_ci. I then changed it to utf8_general_ci (both the field and the table) and found that my query went from ~1.8 seconds to ~3.3 seconds. I have an index on the field and have even recreated it (dropped, then added again). The field is used in an ORDER BY clause.

Any ideas whether something is wrong, or is this normal?

I'm running MySQL 5.0.

Comments (3)

桃扇骨 2024-07-19 01:28:21

latin1_swedish_ci is a collation of the latin1 character set, which is a one-octet-per-character encoding. Once you know the collation (sort) order, comparing characters and whole strings is relatively trivial.

utf8_general_ci is a collation of the utf8 character set, which in MySQL 5.0 needs between one and three octets per character. Decoding the octet data in this encoding is harder, so comparisons take longer.
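The difference in width per character can be seen outside the database. The following is a minimal Python sketch, not MySQL's own logic: it assumes Python's `latin-1` codec as a rough stand-in for MySQL's latin1 character set, and the helper name `byte_widths` is made up for illustration.

```python
# Compare per-character byte widths in a single-byte encoding and in UTF-8.
# Assumption: Python's "latin-1" codec approximates MySQL's latin1 charset.

def byte_widths(text, encoding):
    """Return how many bytes each character of `text` occupies in `encoding`."""
    return [len(ch.encode(encoding)) for ch in text]

sample = "a\u00e9\u00f1"  # 'a', 'e' with acute accent, 'n' with tilde

latin1_widths = byte_widths(sample, "latin-1")
utf8_widths = byte_widths(sample, "utf-8")

print(latin1_widths)  # every character occupies one byte
print(utf8_widths)    # the accented characters occupy more than one byte
```

With a variable-width encoding, a comparison routine has to work out where each character ends before it can compare, which is the extra work the answer refers to.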

轻许诺言 2024-07-19 01:28:21

I don't use MySQL that often myself, but I might be able to give some insight into where the problem lies.

latin1_swedish_ci is a collation of the latin1 character set, a single-octet encoding, meaning that every character encoded with it takes up exactly one byte. Contrast this with utf8_general_ci, a collation of the utf8 character set, where each character takes from one to three bytes in MySQL 5.0.

This has the obvious disadvantage that utf8 characters take up more space, more memory, and, most importantly, more CPU time to identify. The most obvious advantage is that utf8 can represent far more of Unicode than any single-octet character set (in MySQL 5.0, the characters of the Basic Multilingual Plane).

Since this question is tagged 'query-optimization', you need to ask yourself whether you really need to represent the more 'exotic' characters, or whether the ones available in single-octet systems (such as the plain ASCII table) are enough for your needs, because by its nature utf8 will use more CPU and memory.
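One way to answer the "do I really need utf8" question is to check the actual data. The sketch below is only an illustration under stated assumptions: the values are invented sample data, the function name `unrepresentable` is hypothetical, and Python's `latin-1` codec is used as an approximation of MySQL's latin1 character set.

```python
# Find values that a single-byte encoding cannot store.
# Assumption: Python's "latin-1" codec approximates MySQL's latin1 charset.

def unrepresentable(values, encoding="latin-1"):
    """Return the values that cannot be encoded in `encoding`."""
    rejected = []
    for value in values:
        try:
            value.encode(encoding)
        except UnicodeEncodeError:
            rejected.append(value)
    return rejected

# Hypothetical sample data: two Western European names and one Chinese name.
names = ["Anna", "J\u00f6rg", "\u674e\u96f7"]

needs_utf8 = unrepresentable(names)
print(needs_utf8)  # only the values that do not fit in the single-byte encoding
```

If a check like this over the real column comes back empty and no such data is expected in the future, staying with the single-octet character set is a reasonable option.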

总攻大人 2024-07-19 01:28:21

What does your query look like?

Is it possible that you filter on that field and that the parameter you pass is not a utf8 value? In that case, the DBMS will have to convert between character sets, which will hurt performance.
