How to handle quotes and apostrophes in MySQL string comparison so that they match (collation)

Posted on 2024-10-07 05:00:42

MySQL uses collations for string comparison because some characters should be treated as equal.

Example:

SELECT 'é' = 'e' COLLATE utf8_unicode_ci;
SELECT 'oe' = 'œ' COLLATE utf8_unicode_ci; 

Both return true.

Now, how can I do the same with quotes (') and apostrophes (’)?

These are not the same character: the proper character to use when writing “it’s” or “l’oiseau” (in French) is the apostrophe in both cases.

The fact is that neither utf8_general_ci nor utf8_unicode_ci treats them as equal.

The easy solution is to store everything with quotes and replace all the apostrophes when a user does a search, but that is wrong.

The real solution would be to create a custom collation based on utf8_unicode_ci and mark both as equivalent, but that requires editing XML config files and restarting the database, which isn’t always possible.

How would you do it?

Comments (1)

眼眸里的快感 2024-10-14 05:00:42

A custom collation seems to be the most appropriate, but if that is not possible, perhaps you could tailor your searches to use regular expressions. It's not exactly ideal, but may be of use in some situations. At least it allows you to store data in the correct format (without having to replace quotes), and just do the replacements on the search query itself:

INSERT INTO mytable VALUES
(1, 'Though this be madness, yet there is method in ''t'),
(2, 'Though this be madness, yet there is method in ’t'),
(3, 'There ’s daggers in men’s smiles'),
(4, 'There ’s daggers in men''s smiles');

SELECT * FROM mytable WHERE data REGEXP 'There [\'’]+s daggers in men[\'’]+s smiles';

+----+--------------------------------------+
| id | data                                 |
+----+--------------------------------------+
|  3 | There ’s daggers in men’s smiles     |
|  4 | There ’s daggers in men's smiles     |
+----+--------------------------------------+

SELECT * FROM mytable WHERE data REGEXP 'Though this be madness, yet there is method in [\'’]+t';

+----+-----------------------------------------------------+
| id | data                                                |
+----+-----------------------------------------------------+
|  1 | Though this be madness, yet there is method in 't   |
|  2 | Though this be madness, yet there is method in ’t   |
+----+-----------------------------------------------------+
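
The replacement on the search query can be done in application code before the query is sent. What follows is only a minimal sketch, assuming a Python client. The helper name build_apostrophe_pattern and the list of escaped metacharacters are illustrative assumptions and do not come from the answer above. It turns a user's search string into a pattern where every quote or apostrophe matches either character, and escapes the other regex metacharacters so they are taken literally.

```python
import re

# Characters treated as interchangeable: ASCII quote (U+0027) and
# right single quotation mark (U+2019).
APOSTROPHES = {"'", "\u2019"}
APOSTROPHE_CLASS = "['\u2019]+"

# Assumption: this set covers the metacharacters that need a backslash
# in the regex dialect of the target server.
REGEX_META = set("\\.^$*+?()[]{}|")


def build_apostrophe_pattern(search: str) -> str:
    """Return a regex in which each quote/apostrophe matches either form."""
    out = []
    for ch in search:
        if ch in APOSTROPHES:
            out.append(APOSTROPHE_CLASS)
        elif ch in REGEX_META:
            out.append("\\" + ch)  # match the metacharacter literally
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    pattern = build_apostrophe_pattern("There 's daggers in men's smiles")
    print(pattern)
    # Local sanity check with Python's own regex engine.
    print(bool(re.fullmatch(pattern, "There \u2019s daggers in men\u2019s smiles")))
    # Hypothetical use with a DB-API driver that uses %s placeholders:
    # cursor.execute("SELECT * FROM mytable WHERE data REGEXP %s", (pattern,))
```

Passing the pattern as a bound parameter avoids having to escape the quote inside an SQL string literal. The `+` after the character class mirrors the queries above. As far as I know, REGEXP in MySQL versions before 8.0.4 compares byte by byte, so the multi-byte ’ is matched as a run of bytes from the class rather than as a single character.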