MySQL diacritic-insensitive search?

Posted 2024-08-07 15:46:06 · 150 words · 5 views · 0 comments

How do I make a search diacritic-insensitive?

For example, this Persian string with diacritics

هواى بَر آفتابِ بارِز

is not treated as equal to the same string with the diacritics removed in MySQL:

هواى بر آفتاب بارز

Is there a way to tell MySQL to ignore the diacritics, or do I have to remove all the diacritics from my fields manually?

Comments (5)

亽野灬性zι浪 2024-08-14 15:46:06

It's a bit like the case-insensitivity problem.

SELECT * FROM blah WHERE UPPER(foo) = 'THOMAS'

Just strip the diacritics from both strings before comparing them.
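As an illustration of the "strip both sides, then compare" idea, here is a minimal Python sketch that could run in the application before the values reach MySQL. The helper names `strip_harakat` and `same_ignoring_harakat` are hypothetical, and the set of marks removed (U+064B to U+0652) is an assumption that matches the eight Arabic diacritics discussed elsewhere in this thread; other combining marks are left untouched.

```python
import unicodedata

# Arabic short-vowel marks U+064B..U+0652 (fathatan .. sukun).
# Assumption: only these eight marks should be ignored when comparing.
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_harakat(text: str) -> str:
    """Return text with the marks in HARAKAT removed."""
    # NFC keeps precomposed letters such as U+0622 (alef with madda) intact,
    # so only the marks listed above are dropped.
    text = unicodedata.normalize("NFC", text)
    return "".join(ch for ch in text if ch not in HARAKAT)


def same_ignoring_harakat(a: str, b: str) -> bool:
    """Compare two strings after stripping the marks from both."""
    return strip_harakat(a) == strip_harakat(b)


if __name__ == "__main__":
    with_marks = "\u0628\u064E\u0631"   # beh + fatha + reh
    without_marks = "\u0628\u0631"      # beh + reh
    print(same_ignoring_harakat(with_marks, without_marks))
```

One possible use is to store the stripped form in a separate column when rows are written and to strip the search term the same way, so the comparison in SQL stays a plain equality or LIKE.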

那一片橙海, 2024-08-14 15:46:06

I'm using utf8 (utf8_general_ci), and searching Arabic text without diacritics doesn't work: either the collation isn't diacritic-insensitive, or it is but doesn't work properly.

I looked at a character with and without a diacritic using HEX(), and it looks like MySQL is treating them as two distinct characters.

I'm thinking about using HEX() and REPLACE() (a lot of REPLACE calls) to search for words while filtering out the diacritics.

My solution for a diacritic-insensitive search of Arabic words:

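-- Strip the Arabic diacritics from the hex-encoded UTF-8 column value, then match.
-- ? is the diacritic-free search term, bound as a parameter; it is hex-encoded
-- here so that both sides of LIKE are hex strings.
SELECT arabic_word FROM Word
WHERE
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(HEX(REPLACE(
arabic_word, '-', '')), 'D98E', ''), 'D98B', ''), 'D98F', ''), 'D98C',
''), 'D991', ''), 'D992', ''), 'D990', ''), 'D98D', '') LIKE CONCAT('%', HEX(?), '%')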

The hexadecimal values are the UTF-8 encodings of the diacritics that we want to filter out.
It's ugly, but I didn't find another answer.
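To make the hex values in the query less opaque, the following Python sketch decodes each one from UTF-8 and prints its code point and Unicode name. The list `HEX_CODES` is copied from the query above; the helper `decode_hex` is a hypothetical name introduced only for this illustration.

```python
import unicodedata

# Hex strings exactly as they appear in the REPLACE chain of the query.
HEX_CODES = ["D98E", "D98B", "D98F", "D98C", "D991", "D992", "D990", "D98D"]


def decode_hex(hex_string: str) -> str:
    """Decode a hex-encoded UTF-8 byte sequence into a Python string."""
    return bytes.fromhex(hex_string).decode("utf-8")


if __name__ == "__main__":
    for code in HEX_CODES:
        ch = decode_hex(code)
        # Print the hex bytes, the code point, and the official Unicode name.
        print(code, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Each value decodes to a single combining mark in the range U+064B to U+0652, which is why removing these byte pairs from the hex form of the column removes the diacritics.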

可爱咩 2024-08-14 15:46:06

Have you already read through the MySQL Character Set Support documentation to check whether the answer to your question is already there? Collations in particular need to be understood.

My wild guess is that using utf8_general_ci could do the right thing for you.

爱本泡沫多脆弱 2024-08-14 15:46:06

Setting

set names 'utf8'

before making a query usually does the trick for Latin lookups. I'm not sure whether this works for Arabic as well.

流云如水 2024-08-14 15:46:06

The cleanest solution I've come up with is:

-- Both ? placeholders are bound to the search term rather than interpolating {$search} into the SQL.
SELECT arabic_word
FROM Word
WHERE ( arabic_word REGEXP ? OR SOUNDEX( arabic_word ) = SOUNDEX( ? ) );

I haven't checked the cost of the SOUNDEX function. I guess this could work for small tables, but not for large datasets.
