MongoDB preferred schema for embedded collections: documents vs. arrays

Posted on 2024-12-15 06:48:04 · 677 words · 0 views · 0 comments

I believe there are at least two ways to embed data in a MongoDB document. In a simplified case we could have something like this:

{
    'name' : 'bill',
    'lines': {
       'idk73716': {'name': 'Line A'},
       'idk51232': {'name': 'Line B'},
       'idk23321': {'name': 'Line C'}
    }
}

and as an array:

{
    'name' : 'bill',
    'lines': [
       {'id': 'idk73716', 'name': 'Line A'},
       {'id': 'idk51232', 'name': 'Line B'},
       {'id': 'idk23321', 'name': 'Line C'}
    ]
}

As you can see in this use case it's important to keep the id of each line.

I'm wondering what the pros and cons of these two schemas are. Especially when it comes to indexes, I have the feeling that the second may be easier to work with, since one could create an index on 'lines.id' or even 'lines.name' to search for an id or name across all documents. I didn't find any working solution to index the ids ('idk73716' and so on) in the first example.

Is it generally preferred to use the second approach if you have a use case like this?
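
To make the difference between the two shapes concrete, here is a minimal sketch that converts the keyed form into the array form. The helper name linesToArray is hypothetical and not part of any MongoDB API; the createIndex call in the closing comment is the standard shell method, and the collection name second_collection is an assumption.

```javascript
// Hypothetical helper: turn the keyed "lines" object into the array form,
// moving each key into an explicit "id" field.
function linesToArray(doc) {
  const lines = Object.entries(doc.lines).map(([id, line]) => ({ id, ...line }));
  return { ...doc, lines };
}

const keyed = {
  name: 'bill',
  lines: {
    idk73716: { name: 'Line A' },
    idk51232: { name: 'Line B' },
    idk23321: { name: 'Line C' },
  },
};

const asArray = linesToArray(keyed);
console.log(JSON.stringify(asArray, null, 2));

// With the array form, a single multikey index covers every line id (mongosh):
//   db.second_collection.createIndex({ 'lines.id': 1 })
```

In the keyed form every id is a distinct field path ('lines.idk73716', 'lines.idk51232', ...), so there is no single path like 'lines.id' to index.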

Comments (2)

七月上 2024-12-22 06:48:04

In your first approach you can't index the id fields, since the ids are used as keys. It acts like a key-value dictionary. This approach is useful if you have a known (and small) set of ids. Assume that in your first example the ids are known up front:

> db.your_collection.find()
 { "_id" : ObjectId("4ebbb6f974235464de49c3a5"), "name" : "bill", 
  "lines" : { 
             "idk73716" : { "name" : "Line A" },
             "idk51232" : { "name" : "Line B" } ,
             "idk23321":  { "name" : "Line C" }
            } 
  }

So to fetch the value stored under the id idk73716, you can run:

 db.your_collection.find({},{'lines.idk73716':1})
 { "_id" : ObjectId("4ebbb6f974235464de49c3a5"), "lines" : { "idk73716" : { "name" : "Line A" } } }

The empty {} is the query (it matches every document), and the second part, {'lines.idk73716':1}, is a field selector (projection).

Having ids as keys has the advantage that you can pick one particular field alone. Note that {'lines.idk73716':1} is only a field selector: it trims each returned document down to that one line, but it does not filter which documents are returned. A plain field selector cannot do the same in your second approach. Assume the second collection looks like this:

> db.second_collection.find()
{ "_id" : ObjectId("4ebbb9c174235464de49c3a6"), "name" : "bill", "lines" : [
    {
        "id" : "idk73716",
        "name" : "Line A"
    },
    {
        "id" : "idk51232",
        "name" : "Line B"
    },
    {
        "id" : "idk23321",
        "name" : "Line C"
    }
] }
> 

Suppose you have indexed the field lines.id and you want to query by id:

> db.second_collection.find({'lines.id' : 'idk73716' })

{ "_id" : ObjectId("4ebbb9c174235464de49c3a6"), "name" : "bill", "lines" : [
    {
        "id" : "idk73716",
        "name" : "Line A"
    },
    {
        "id" : "idk51232",
        "name" : "Line B"
    },
    {
        "id" : "idk23321",
        "name" : "Line C"
    }
] }
> 

From the output above, you can see that a plain query like this returns the whole array; it does not pick out only the matching embedded document, whereas the first approach can. This is the default behavior of MongoDB.

For example,

db.second_collection.find({'lines.id' : 'idk73716' },{'lines':1})

fetches all lines, not just idk73716:

{ "_id" : ObjectId("4ebbb9c174235464de49c3a6"), "lines" : [
    {
        "id" : "idk73716",
        "name" : "Line A"
    },
    {
        "id" : "idk51232",
        "name" : "Line B"
    },
    {
        "id" : "idk23321",
        "name" : "Line C"
    }
] }

Hope this helps

EDIT

Thanks to @Gates VP for pointing out:

db.your_collection.find({'lines.idk73716':{$exists:true}}). If you want to use the "ids as keys" version, the exists query will work, but it will not be indexable.

So we can still use $exists to query by id, but that query cannot use an index.

锦上情书 2024-12-22 06:48:04

Today we have the $elemMatch operator to achieve this, as discussed here: Retrieve only the queried element in an object array in MongoDB collection
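
As a rough sketch of what that looks like for the array schema above: the helper findLineArgs is a hypothetical name used only for illustration, and the collection name second_collection is taken from the first answer. The helper only builds the two arguments; the actual find call is shown in the comment.

```javascript
// Hypothetical helper: build find() arguments that match documents containing
// the given line id and project only the matching array element.
function findLineArgs(lineId) {
  const filter = { 'lines.id': lineId };
  const projection = { lines: { $elemMatch: { id: lineId } } };
  return { filter, projection };
}

const { filter, projection } = findLineArgs('idk73716');
console.log(JSON.stringify(filter), JSON.stringify(projection));

// In mongosh, against the array-style collection:
//   db.second_collection.find(filter, projection)
// The $elemMatch projection keeps only the first element of "lines"
// that matches the condition, instead of the whole array.
```

The filter can still use an index on 'lines.id', so this keeps the indexing advantage of the array form while returning just one line.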

But this question poses some interesting design choices, which I am also struggling to make today.
What should be the preferred choice from given two options if frequent CRUD is required in embedded documents?

I found that it is easy to perform CRUD on embedded documents with the $set/$unset operators when IDs are used as property names. And if the client can get hold of the ID to make edits, it is better than an array, IMO.
Here is another useful blog post by MongoDB about schema design and making these design decisions:

http://blog.mongodb.org/post/87200945828/6-rules-of-thumb-for-mongodb-schema-design-part-1
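
To illustrate the CRUD comparison, here is a minimal sketch of the update documents each schema would need. All four helper names are hypothetical; they only build plain objects, and the updateOne calls in the comments assume the collection names used in the first answer.

```javascript
// Keyed schema: the line id is part of the field path, so $set and $unset
// can address one embedded document directly.
function keyedRename(lineId, newName) {
  return { $set: { [`lines.${lineId}.name`]: newName } };
}
function keyedRemove(lineId) {
  return { $unset: { [`lines.${lineId}`]: '' } };
}

// Array schema: the element has to be matched in the filter first, then
// updated through the positional `$` operator, or removed with $pull.
function arrayRename(lineId, newName) {
  return {
    filter: { 'lines.id': lineId },
    update: { $set: { 'lines.$.name': newName } },
  };
}
function arrayRemove(lineId) {
  return { $pull: { lines: { id: lineId } } };
}

console.log(JSON.stringify(keyedRename('idk73716', 'Line A2')));
console.log(JSON.stringify(arrayRename('idk73716', 'Line A2')));

// Possible usage in mongosh:
//   db.your_collection.updateOne({ name: 'bill' }, keyedRename('idk73716', 'Line A2'))
//   const r = arrayRename('idk73716', 'Line A2')
//   db.second_collection.updateOne({ name: 'bill', ...r.filter }, r.update)
```

Both forms support in-place edits; the keyed form is simply shorter when the client already knows the id, at the cost of the indexing limitation discussed in the first answer.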
