MySQL 中不区分大小写的 unicode 排序规则
我有一个数据库,我们在其中存储用户名,每个名字的首字母大写——即 IsaacSparling。我正在尝试对我的 MySQL (v5.1.46) 数据库执行不区分大小写的自动完成。表的字符集为 UTF8,排序规则为 utf8_unicode_ci。我也针对 utf8_general_ci 排序规则进行了这些测试。
纯 ASCII 文本工作正常:(
mysql> select username from users where username like 'j%';
+----------------+
| username |
+----------------+
| J******** |
| J*********** |
| J************* |
+----------------+
3 rows in set (0.00 sec)
mysql> select username from users where username like 'J%';
+----------------+
| username |
+----------------+
| J******** |
| J*********** |
| J************* |
+----------------+
3 rows in set (0.00 sec)
名称经过编辑,但它们就在那里)。
然而,当我尝试对 ASCII 集之外的 unicode 字符执行相同操作时,却没有这样的运气:
mysql> select username from users where username like 'ø%';
Empty set (0.00 sec)
mysql> select username from users where username like 'Ø%';
+-------------+
| username |
+-------------+
| Ø********* |
+-------------+
1 row in set (0.00 sec)
一些调查使我发现了这一点: http://bugs.mysql.com/bug.php?id=19567 (tl;dr,这是 unicode 排序规则的一个已知错误,修复它是在 '新功能的优先级——即不会在任何合理的时间内完成)。
有没有人发现任何有效的解决方法,允许在 MySQL 中不区分大小写地搜索 unicode 字符?任何想法表示赞赏!
I've got a database where we store usernames with a capital first letter of each name -- ie, IsaacSparling. I'm trying to do case insensitive autocomplete against my MySQL (v5.1.46) db. Table has a charset of UTF8 and a collation of utf8_unicode_ci. I've done these tests against the utf8_general_ci collation as well.
Plain ASCII text works fine:
mysql> select username from users where username like 'j%';
+----------------+
| username |
+----------------+
| J******** |
| J*********** |
| J************* |
+----------------+
3 rows in set (0.00 sec)
mysql> select username from users where username like 'J%';
+----------------+
| username |
+----------------+
| J******** |
| J*********** |
| J************* |
+----------------+
3 rows in set (0.00 sec)
(names redacted, but they're there).
However, when I try to do the same for unicode characters outside the ASCII set, no such luck:
mysql> select username from users where username like 'ø%';
Empty set (0.00 sec)
mysql> select username from users where username like 'Ø%';
+-------------+
| username |
+-------------+
| Ø********* |
+-------------+
1 row in set (0.00 sec)
Some investigation has lead me to this: http://bugs.mysql.com/bug.php?id=19567 (tl;dr, this is a known bug with the unicode collations, and fixing it is at 'new feature' priority -- ie, won't be finished in any reasonable timeframe).
Has anybody discovered any effective workarounds that allow for case-insensitive searching for unicode characters in MySQL? Any thoughts appreciated!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
如果您关心的是能够按文本对字段值进行排序,而不关心它是大写还是小写,我认为您可以做的最好的事情是在寻址字段时而不是仅输入用户名>,输入
LOWER(username) username
,然后您可以通过该字段通过其名称调用它来完美地使用订单IF what you care about is being able to order the field values by the text without caring if it is in upper or lower case I think the best thing you can do is when addressing the field instead of typing just
username
, typeLOWER(username) username
and then you can perfectly use an order by that field calling it by its name您尝试过使用 CONVERT 吗?类似的东西
可能对你有用。
Have you tried using CONVERT? Something like
might work for you.
我刚刚使用查询“
我的字符集客户端设置为'utf8',但字符集连接和字符集结果设置为'latin1'”解决了同样的问题。因此,UPPER、LOWER、LIKE 函数没有按预期工作。
我只是
在连接后立即插入该行以获得不区分大小写的搜索工作。
I just resolved the same problem using the query
My character_set_client was set to 'utf8', but character_set_connection and character_set_results were set to 'latin1'. Thus, the functions UPPER, LOWER, LIKE did not work as expected.
I just inserted the line
right after connection to get the case-insensitive searching work.
对我来说工作正常,版本 5.1.42-community
也许你的 mysql 客户端没有正确发送 unicode 字符。我用 sqlYog 进行了测试,它在 utf8_unicode_ci 和 utf8_general_ci 排序规则下都工作得很好
Works fine for me with version 5.1.42-community
Maybe your mysql client did not send the unicode characters properly. I tested with sqlYog and it worked just fine with both utf8_unicode_ci and utf8_general_ci collations