MongoDB compound index where one of the fields may be missing or null

Posted 2025-01-10 08:00:16

How can I create a compound index in MongoDB where one of the fields may be missing or null?

For example, given the documents below, I want to create a compound index on name + age. How can I still achieve this when age is missing or null in some documents?

{
  name: "Anurag",
  age: "21",
},
{
  name: "Nitin",
},

Comments (1)

久随 2025-01-17 08:00:16

You can create a partial index as follows:

  // Compound keys on name and age; only documents that have an age field are indexed
  db.contacts.createIndex(
    { name: 1, age: 1 },
    { partialFilterExpression: { age: { $exists: true } } }
  )

Explanation:

As per the documentation, partial indexes only index the documents in a collection that meet a specified filter expression. By indexing a subset of the documents in a collection, partial indexes have lower storage requirements and reduced performance costs for index creation and maintenance. For instance, if your collection has 100k documents but only 5 of them have the "age" field, the partial index will contain entries for only those 5 documents, which saves index storage space and improves performance. Note that $exists: true also matches documents in which age is explicitly set to null, so those documents are still indexed.
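To make the filter concrete, here is a minimal sketch in plain JavaScript (runnable in Node.js, no MongoDB server needed) that mimics which documents the filter { age: { $exists: true } } would admit into the index. The helper ageExists and the extra "Ravi" document with an explicit null are illustrative assumptions, not part of MongoDB's API. The sketch relies only on the documented behaviour that $exists: true matches any document containing the field, even when its value is null.

```javascript
// Hypothetical sample documents modelled on the question.
const contacts = [
  { name: "Anurag", age: "21" },
  { name: "Nitin" },            // age missing
  { name: "Ravi", age: null },  // age present but null (assumed extra case)
];

// Mimics { age: { $exists: true } }: true when the field is present,
// regardless of its value (null included).
function ageExists(doc) {
  return Object.prototype.hasOwnProperty.call(doc, "age");
}

// Names of the documents that would get an entry in the partial index.
const indexed = contacts.filter(ageExists).map((doc) => doc.name);
console.log(indexed);
```

In this model only the document without an age field ("Nitin") stays out of the index.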

For the query optimizer to choose this partial index, the query predicate must include a condition on the name field and a condition on the age field that can only match documents in which age exists (a non-null match).

The following example queries will be able to use the index:

 db.contacts.find({name:"John",age:{$gt:20}})
 db.contacts.find({name:"John",age:30})

The following example query is a "covered query", provided the index keys include both name and age:

 db.contacts.find({name:"John",age:30},{_id:0,name:1,age:1})

(This query is highly efficient because it returns the data directly from the index.)

The following example queries will not be able to use the index:

 db.contacts.find({name:"John"})                       // no condition on age
 db.contacts.find({name:"John",age:{$exists:false}})   // asks for documents without age
 db.contacts.find({name:"John",age:null})              // also matches documents without age
 db.contacts.find({age:20})                            // no condition on name
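
The eligibility rule stated above (a condition on name plus a condition on age that implies the field exists) can be approximated in a few lines of plain JavaScript. This is a deliberately simplified model, not MongoDB's query planner: the helper canUsePartialIndex is hypothetical, it assumes the index keys start with name, and it recognises only literal equality, $gt, $gte, $lt, $lte and $exists conditions on age. To check a real deployment, inspect the output of explain() on the query instead.

```javascript
// Simplified model (an assumption, not MongoDB's real planner) of whether a
// query can use an index keyed on name first, with the partial filter
// { age: { $exists: true } }.
function canUsePartialIndex(query) {
  // The leading index key must be constrained.
  if (!("name" in query)) return false;
  // Without a condition on age, the query may match documents
  // that the partial index does not contain.
  if (!("age" in query)) return false;

  const age = query.age;
  // age: null also matches documents where age is missing.
  if (age === null) return false;
  // A literal such as age: 30 can only match documents that have age.
  if (typeof age !== "object") return true;

  // Operator form: accept only operators that imply the field exists.
  if (age.$exists === false) return false;
  const rangeOps = ["$gt", "$gte", "$lt", "$lte"];
  return age.$exists === true ||
    rangeOps.some((op) => op in age && age[op] !== null);
}

const examples = [
  { name: "John", age: { $gt: 20 } },
  { name: "John", age: 30 },
  { name: "John" },
  { name: "John", age: { $exists: false } },
  { name: "John", age: null },
  { age: 20 },
];

for (const q of examples) {
  console.log(JSON.stringify(q), "->", canUsePartialIndex(q));
}
```

Under this model only the first two example queries are eligible; the others either leave name or age unconstrained or can match documents in which age is absent.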

Note that you should analyze whether you really need to search on the age field together with name. The name field is already highly selective, and this index will not be used when you search by age alone. If fetching a list of contacts of a certain age is a likely use case, a good option may be to create an additional sparse or partial index on the age field only.
