MongoDB: matching accented characters as their base characters

Posted 2024-12-09 05:33:52

In MongoDB "db.foo.find()" syntax, how can I tell it to match all letters and their accented versions?

For example, if I have a list of names in my database:
João
François
Jesús

How can I make a search for the strings "Joao", "Francois", or "Jesus" match the stored names?
I am hoping that I don't have to write a search like this every time:
db.names.find({name : /Fr[aã...][nñ][cç][all accented o characters][all accented i characters]s/ })

Comments (4)

As of MongoDB 3.2, you can use $text with $diacriticSensitive set to false. Note that a $text query requires a text index on the field being searched:

{
  $text:
    {
      $search: <string>,
      $language: <string>,
      $caseSensitive: <boolean>,
      $diacriticSensitive: <boolean>
    }
}

See more in the Mongo docs: https://docs.mongodb.com/manual/reference/operator/query/text/
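
As a minimal sketch of how the template above might be filled in for the question's data: the helper name buildNameTextQuery, the collection name `names`, and the field name `name` are assumptions for illustration, not part of the answer. The helper only builds the filter document; the shell commands that would use it are shown as comments because they need a running MongoDB instance.

```javascript
// Hypothetical helper: builds a $text filter that ignores case and diacritics.
function buildNameTextQuery(term) {
  return {
    $text: {
      $search: term,
      $caseSensitive: false,
      $diacriticSensitive: false,
    },
  };
}

// Intended use in the mongo shell (assumes a `names` collection
// whose documents have a `name` field):
//   db.names.createIndex({ name: "text" })
//   db.names.find(buildNameTextQuery("Joao"))
console.log(JSON.stringify(buildNameTextQuery("Joao")));
```

Keep in mind that $text matches whole words rather than arbitrary substrings, so this approach suits lookups by full name parts such as "Joao" better than prefix or partial searches.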

计㈡愣 2024-12-16 05:33:52

I suggest adding an indexed field, for example NameSearchable, that holds a simplified version of each string, e.g.

  • João -> JOAO
  • François -> FRANCOIS
  • Jesús -> JESUS
  • Jürgen -> JUERGEN

The mapping used when inserting new items into the database can also be applied to the search term before querying. The original string, with its correct casing and accents, is preserved in its own field.

Most importantly, the query can make use of the index. Case-insensitive queries and regex queries cannot use indexes (with the exception of rooted regexes) and become prohibitively slow on large collections.

And since the simplified strings can be derived from the original strings, it is not a problem to add this field to existing collections.
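
One possible way to implement such a mapping is sketched below. The function name toSearchable and the GERMAN_EXPANSIONS table are assumptions made for illustration; the answer does not specify how the simplified strings are produced. The sketch first expands German umlauts (to reproduce the Jürgen -> JUERGEN example), then uses Unicode decomposition to drop the remaining accents.

```javascript
// Hypothetical expansion table so that "ü" becomes "ue" instead of plain "u".
const GERMAN_EXPANSIONS = { 'ä': 'ae', 'ö': 'oe', 'ü': 'ue', 'ß': 'ss' };

// Hypothetical helper: builds the value stored in a NameSearchable-style field.
function toSearchable(name) {
  return name
    .normalize('NFC')                                  // compose so "ü" is one code point
    .toLowerCase()
    .replace(/[äöüß]/g, ch => GERMAN_EXPANSIONS[ch])   // German-style transliteration
    .normalize('NFD')                                  // decompose: "ã" -> "a" + combining tilde
    .replace(/[\u0300-\u036f]/g, '')                   // drop the combining marks
    .toUpperCase();
}

for (const name of ['João', 'François', 'Jesús', 'Jürgen']) {
  console.log(name, '->', toSearchable(name));
}
```

In the mongo shell, the field could then be indexed with db.names.createIndex({ NameSearchable: 1 }) and queried with db.names.find({ NameSearchable: toSearchable("Joao") }), assuming the collection is called names. Whether to expand umlauts is a design choice: it matches German conventions but would also rewrite "ü" in other languages, so the table can be dropped if plain accent stripping is preferred.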

时光瘦了 2024-12-16 05:33:52

In this blog: http://tech.rgou.net/en/php/pesquisas-nao-sensiveis-ao-caso-e-acento-no-mongodb-e-php/

The author there uses the approach you describe. As far as I know, this is the only solution for the latest MongoDB version.

宛菡 2024-12-16 05:33:52

This looks more like fuzzy-match searching, which MongoDB does not currently support.
Here is what you can try:

1. Store variations of the name in a separate array element of each entry. The query can then check whether the search term exists within the variations array.

or

2. Store a Soundex string for each name in the same collection. For your search string, compute its Soundex string and query the database with it; you will get entries whose Soundex code matches that of your query.
You can then filter and verify those results further in your script.
Example:

Soundex code for François = F652, Soundex Code for Francois = F652

Soundex code for Jesús = J220, Soundex Code for Jesus = J220

More details here:
http://creativyst.com/Doc/Articles/SoundEx1/SoundEx1.htm#SoundExConverter
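
The codes in the example above can be reproduced with a small American Soundex routine. This is only a sketch: the function name soundex is hypothetical, and stripping accents via Unicode decomposition before encoding is an assumption about how accented letters should be treated.

```javascript
// Standard American Soundex digit groups; vowels, H, W and Y have no code.
const SOUNDEX_CODES = {
  B: '1', F: '1', P: '1', V: '1',
  C: '2', G: '2', J: '2', K: '2', Q: '2', S: '2', X: '2', Z: '2',
  D: '3', T: '3',
  L: '4',
  M: '5', N: '5',
  R: '6',
};

// Hypothetical helper: returns a 4-character Soundex code, or '' for empty input.
function soundex(name) {
  const letters = name
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')   // assumption: accents are simply dropped
    .toUpperCase()
    .replace(/[^A-Z]/g, '');
  if (letters.length === 0) return '';

  let code = letters[0];
  let previous = SOUNDEX_CODES[letters[0]] || '';
  for (const ch of letters.slice(1)) {
    if (code.length === 4) break;
    const digit = SOUNDEX_CODES[ch] || '';
    // Append only when the digit differs from the previous coded letter.
    if (digit && digit !== previous) code += digit;
    // H and W do not separate equal codes; vowels do.
    if (ch !== 'H' && ch !== 'W') previous = digit;
  }
  return code.padEnd(4, '0');
}

for (const name of ['François', 'Francois', 'Jesús', 'Jesus']) {
  console.log(name, '->', soundex(name));
}
```

The resulting code could be stored next to each name in a field of your choosing and compared for equality at query time. Because Soundex groups names by sound, it will also match names that differ by more than accents, which is why the extra filtering step in your script is still needed.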
