Can MySQL handle single UTF-8 character keys as well as it handles integers?
I'm working on a Chinese/Japanese learning web app where many tables are indexed by the characters (the "glyphs") of those languages.
I'm wondering if the integer codepoint value of the glyph would be better for performance than using a single utf8 character (for primary key and indexes)?
Using a single utf8 character would be very useful because I can see the unicode characters fine in the shell I'm using, and this makes debugging the SQL queries of this app easier.
In theory MySQL would treat a single utf8 character as a unique integer value similarly to a mediumint (3 bytes)... but I suspect MySQL will handle the column as a string instead.
Would there be performance issues due to MySQL treating my single utf8 char as a string?
Would you recommend sticking to the integer codepoint for indexes and primary keys, and perhaps using CONVERT() or another function to get the utf8 character in results?
Comments (1)
MySQL will store and index a UTF-8 character as a multi-byte string, yes. So I would expect an integer to be a faster key, though the difference in performance is unlikely to be significant.
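To make the size comparison concrete, here is a minimal Python sketch of the application-side conversion between a glyph and its integer code point. It is not MySQL code, and the helper names (glyph_to_codepoint, codepoint_to_glyph, utf8_width) are made up for illustration. It only shows that a typical CJK character takes three bytes in UTF-8, the same width as a MEDIUMINT, so the storage cost of the two key types is about the same.

```python
def glyph_to_codepoint(glyph: str) -> int:
    """Return the Unicode code point of a single character (hypothetical helper)."""
    if len(glyph) != 1:
        raise ValueError("expected exactly one character")
    return ord(glyph)


def codepoint_to_glyph(codepoint: int) -> str:
    """Return the character for an integer code point (hypothetical helper)."""
    return chr(codepoint)


def utf8_width(glyph: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(glyph.encode("utf-8"))


if __name__ == "__main__":
    glyph = "漢"  # U+6F22, a common kanji/hanzi in the Basic Multilingual Plane
    codepoint = glyph_to_codepoint(glyph)
    # Print the glyph, its code point in hex, and its UTF-8 byte width.
    print(glyph, hex(codepoint), utf8_width(glyph))
```

If the tables are keyed by the integer, the application could call something like codepoint_to_glyph on each result row instead of converting inside the query, which keeps the SQL itself simple.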
Another possible issue is that MySQL's utf8 character set doesn't support characters outside the Basic Multilingual Plane (i.e. it's limited to three bytes per character). Four-byte support was planned for MySQL 6.0 and eventually shipped as the separate utf8mb4 character set in MySQL 5.5.3. If you want to use some of the really obscure kanji in the Supplementary Ideographic Plane, plain utf8 would be no good.
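A quick way to check whether a given glyph is affected by the three-byte limit is to look at its code point. The Python sketch below uses hypothetical helper names and U+2000B (a CJK Extension B ideograph in the Supplementary Ideographic Plane) as a sample character. It also shows that an integer key would not have this problem: every Unicode code point, up to U+10FFFF, fits within the MEDIUMINT UNSIGNED range.

```python
MEDIUMINT_UNSIGNED_MAX = 16_777_215  # upper bound of MySQL's MEDIUMINT UNSIGNED
MAX_CODEPOINT = 0x10FFFF             # highest code point defined by Unicode


def fits_three_byte_utf8(glyph: str) -> bool:
    """True if the character is in the Basic Multilingual Plane,
    i.e. it encodes to at most three bytes in UTF-8."""
    return ord(glyph) <= 0xFFFF


def fits_mediumint(glyph: str) -> bool:
    """True if the character's code point fits in MEDIUMINT UNSIGNED."""
    return ord(glyph) <= MEDIUMINT_UNSIGNED_MAX


if __name__ == "__main__":
    # One BMP character and one Supplementary Ideographic Plane character.
    for glyph in ("漢", "\U0002000B"):
        print(
            hex(ord(glyph)),
            len(glyph.encode("utf-8")),   # UTF-8 byte width
            fits_three_byte_utf8(glyph),
            fits_mediumint(glyph),
        )
```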