MongoDB relationships: embed or reference?

Posted on 2024-10-25 03:23:23

I want to design a question structure with some comments. Which relationship should I use for comments: embed or reference?

A question with some comments, like on Stack Overflow, would have a structure like this:

Question
    title = 'aaa'
    content = 'bbb'
    comments = ???

At first, I thought of using embedded comments (I think embed is recommended in MongoDB), like this:

Question
    title = 'aaa'
    content = 'bbb'
    comments = [ { content = 'xxx', createdAt = 'yyy'}, 
                 { content = 'xxx', createdAt = 'yyy'}, 
                 { content = 'xxx', createdAt = 'yyy'} ]

It is clear, but I'm worried about this case: If I want to edit a specified comment, how do I get its content and its question? There is no _id to let me find one, nor question_ref to let me find its question. (Is there perhaps a way to do this without _id and question_ref?)

Do I have to use ref rather than embed? Do I then have to create a new collection for comments?

Comments (10)

葵雨 2024-11-01 03:23:24

I came across this small presentation while researching this question on my own. I was surprised at how well it was laid out, both the info and the presentation of it.

http://openmymind.net/Multiple-Collections-Versus-Embedded-Documents

It summarized:

As a general rule, if you have a lot of [child documents] or if they are large, a separate collection might be best.

Smaller and/or fewer documents tend to be a natural fit for embedding.

时光清浅 2024-11-01 03:23:24

If I want to edit a specified comment, how do I get its content and its question?

You can query by sub-document: db.question.find({'comments.content' : 'xxx'}).

This will return the whole Question document. To edit the specified comment, you then have to find the comment on the client, make the edit and save that back to the DB.

In general, if your document contains an array of objects, you'll find that those sub-objects will need to be modified client side.
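The client-side edit described above can be sketched in plain JavaScript. This is only an illustration on an in-memory object standing in for the document that the query would return; `editComment` is a hypothetical helper, and the final write back to the database is left out because it depends on the driver in use.

```javascript
// In-memory stand-in for the Question document returned by the query
// on 'comments.content' (hypothetical data).
const question = {
  title: 'aaa',
  content: 'bbb',
  comments: [
    { content: 'first', createdAt: '2024-01-01' },
    { content: 'xxx', createdAt: '2024-01-02' }
  ]
};

// Find the first comment with the given content and change it in place.
// Returns true if a comment was edited, false if none matched.
function editComment(doc, oldContent, newContent) {
  const comment = doc.comments.find(c => c.content === oldContent);
  if (comment === undefined) {
    return false;
  }
  comment.content = newContent;
  return true;
}

const changed = editComment(question, 'xxx', 'edited text');
// At this point the modified document would be saved back to the database.
console.log(changed, question.comments);
```

Matching on the comment text is fragile when two comments share the same content, which is one reason to give each comment its own identifier.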

晨敛清荷 2024-11-01 03:23:24

Yes, we can use references in the document and populate them from another collection, much like a join in SQL. MongoDB does not have SQL-style joins for mapping one-to-many relationships between documents; instead, we can use Mongoose's populate to cover this scenario.

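// Requires the mongoose package.
const mongoose = require('mongoose');
const Schema = mongoose.Schema;

// A person holds references to the stories they wrote.
const personSchema = new Schema({
  _id     : Number,
  name    : String,
  age     : Number,
  stories : [{ type: Schema.Types.ObjectId, ref: 'Story' }]
});

// A story points back to its creator and its fans by Person _id.
const storySchema = new Schema({
  _creator : { type: Number, ref: 'Person' },
  title    : String,
  fans     : [{ type: Number, ref: 'Person' }]
});

// Register the models under the names used in the `ref` options.
const Story  = mongoose.model('Story', storySchema);
const Person = mongoose.model('Person', personSchema);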

Population is the process of automatically replacing the specified paths in the document with the document(s) from other collection(s). We may populate a single document, multiple documents, plain objects, multiple plain objects, or all objects returned from a query.

For more information, see: http://mongoosejs.com/docs/populate.html
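To make the idea of population concrete, here is a hand-rolled sketch of what "replacing a path with the referenced documents" means. It is not Mongoose's implementation and does not use Mongoose at all: the two `Map` objects stand in for collections, the sample data is invented, and `populatePath` is a hypothetical helper.

```javascript
// Two in-memory "collections", keyed by _id (hypothetical data).
const people = new Map([
  [1, { _id: 1, name: 'Aaron', age: 100, stories: ['s1', 's2'] }],
  [2, { _id: 2, name: 'Bea', age: 30, stories: [] }]
]);
const stories = new Map([
  ['s1', { _id: 's1', _creator: 1, title: 'First story', fans: [2] }],
  ['s2', { _id: 's2', _creator: 1, title: 'Second story', fans: [] }]
]);

// Return a copy of doc in which the ids stored under `path` are replaced
// by the documents they refer to. Ids that do not resolve become null.
function populatePath(doc, path, collection) {
  const resolve = id => (collection.has(id) ? collection.get(id) : null);
  const value = doc[path];
  const populated = Array.isArray(value) ? value.map(resolve) : resolve(value);
  return { ...doc, [path]: populated };
}

const author = populatePath(people.get(1), 'stories', stories);
const story = populatePath(stories.get('s1'), '_creator', people);
console.log(author.stories.map(s => s.title), story._creator.name);
```

The stored documents keep only ids; the combined view is built at read time, which is the cost a reference design pays compared with embedding.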

老街孤人 2024-11-01 03:23:24

I know this is quite old, but if you are looking for the answer to the OP's question on how to return only the specified comment, you can use the positional $ (projection) operator like this:

db.question.find({'comments.content': 'xxx'}, {'comments.$': true})

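As a rough picture of what a positional projection on `comments.$` gives back, the sketch below emulates the result shape on an in-memory object: the `_id` plus a `comments` array holding only the first matching element. `projectFirstMatch` is a hypothetical helper and the data is invented; the real work is done by the server.

```javascript
// Emulates the shape returned by an inclusion projection on 'comments.$':
// the _id plus a comments array with only the first matching element.
function projectFirstMatch(doc, matches) {
  const first = doc.comments.find(matches);
  if (first === undefined) {
    return null; // the query would not have matched this document
  }
  return { _id: doc._id, comments: [first] };
}

// Hypothetical question document.
const question = {
  _id: 1,
  title: 'aaa',
  content: 'bbb',
  comments: [
    { content: 'abc', createdAt: 't1' },
    { content: 'xxx', createdAt: 't2' },
    { content: 'xxx', createdAt: 't3' }
  ]
};

const result = projectFirstMatch(question, c => c.content === 'xxx');
console.log(JSON.stringify(result));
```

Only the first match comes back, so duplicate comment texts cannot be told apart this way.
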
心如荒岛 2024-11-01 03:23:24

MongoDB gives you the freedom to be schema-less, and this feature can cause pain in the long term if the design is not thought through or planned well.

There are two options: embed or reference. I will not go through the definitions, as the answers above have defined them well.

When embedding, you should answer one question: is your embedded document going to grow, and if so, by how much? (Remember there is a limit of 16 MB per document.) So if you have something like comments on a post, what is the limit on the comment count if that post goes viral and people start adding comments? In such cases, a reference could be a better option (but even an array of references can grow and reach the 16 MB limit).

So how do you balance it? The answer is a combination of different patterns. Check these links and create your own mix and match based on your use case.

https://www.mongodb.com/blog/post/building-with-patterns-a-summary

https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-1
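To get a feel for how far a growing comments array is from the 16 MB limit, a back-of-the-envelope check like the one below can help. It is only a sketch: it measures the JSON text of the document, which is not the same as its BSON size, and the 200-character comment length and the helper names are assumptions made for the example.

```javascript
// Rough size estimate for a document, using its JSON text as a stand-in
// for BSON. Real BSON sizes differ, so treat this as an order-of-magnitude
// check only.
const MAX_DOC_BYTES = 16 * 1024 * 1024; // the 16 MB per-document limit

function approxDocBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), 'utf8');
}

// Hypothetical question with n embedded comments of 200 characters each.
function makeQuestion(n) {
  const comments = [];
  for (let i = 0; i < n; i++) {
    comments.push({ content: 'x'.repeat(200), createdAt: '2024-01-01T00:00:00Z' });
  }
  return { title: 'aaa', content: 'bbb', comments };
}

// Fraction of the limit that the document uses, by this rough measure.
function fractionOfLimit(doc) {
  return approxDocBytes(doc) / MAX_DOC_BYTES;
}

console.log(fractionOfLimit(makeQuestion(1000)).toFixed(4));
```

Running the same check with realistic comment sizes and counts from your own application gives a quick sense of whether embedding leaves comfortable headroom.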

也只是曾经 2024-11-01 03:23:24

If I want to edit a specified comment, how do I get its content and
its question?

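If you had kept track of the number of comments and the index of the comment you wanted to alter, you could use dot notation (SO example).

For example, you could do:

// Set the content of the first comment (index 0) on the matching question.
// $set changes only the named field and leaves the rest of the document intact.
db.questions.update(
    {
        "title": "aaa"
    },
    {
        "$set": { "comments.0.content": "new text" }
    }
)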

(as another way to edit the comments inside the question)
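To show what a dotted path such as `comments.0.content` addresses, here is a small in-memory emulation. It is not how the server applies an update; `setByPath` is a hypothetical helper that only handles paths whose intermediate segments already exist.

```javascript
// Walk a dotted path ("comments.0.content") through nested objects and
// arrays, then assign the value at the final segment.
function setByPath(doc, path, value) {
  const keys = path.split('.');
  let target = doc;
  for (const key of keys.slice(0, -1)) {
    if (target[key] === undefined || target[key] === null) {
      throw new Error('Missing path segment: ' + key);
    }
    target = target[key];
  }
  target[keys[keys.length - 1]] = value;
  return doc;
}

// Hypothetical question document.
const question = {
  title: 'aaa',
  content: 'bbb',
  comments: [
    { content: 'xxx', createdAt: 'yyy' },
    { content: 'zzz', createdAt: 'yyy' }
  ]
};

setByPath(question, 'comments.0.content', 'new text');
console.log(question.comments[0].content);
```

Because the path depends on the array index, it only stays valid as long as comments are never reordered or removed between reading the index and writing the update.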

腹黑女流氓 2024-11-01 03:23:23

This is more an art than a science. The Mongo Documentation on Schemas is a good reference, but here are some things to consider:

  • Put as much in as possible

    The joy of a document database is that it eliminates lots of joins. Your first instinct should be to place as much in a single document as you can. Because MongoDB documents have structure, and because you can efficiently query within that structure (this means that you can take the part of the document that you need, so document size shouldn't worry you much), there is no immediate need to normalize data like you would in SQL. In particular, any data that is not useful apart from its parent document should be part of the same document.

  • Separate data that can be referred to from multiple places into its own collection.

    This is not so much a "storage space" issue as it is a "data consistency" issue. If many records will refer to the same data it is more efficient and less error prone to update a single record and keep references to it in other places.

  • Document size considerations

    MongoDB imposes a 4 MB (16 MB as of version 1.8) size limit on a single document. In a world of gigabytes of data this sounds small, but it is also 30 thousand tweets or 250 typical Stack Overflow answers or 20 Flickr photos. On the other hand, this is far more information than one might want to present at one time on a typical web page. First consider what will make your queries easier. In many cases concern about document sizes will be premature optimization.

  • Complex data structures:

    MongoDB can store arbitrarily deeply nested data structures, but cannot search them efficiently. If your data forms a tree, forest or graph, you effectively need to store each node and its edges in a separate document. (Note that there are data stores specifically designed for this type of data that one should consider as well.)

    It has also been pointed out that it is impossible to return a subset of elements in a document. If you need to pick and choose a few bits of each document, it will be easier to separate them out.

  • Data Consistency

    MongoDB makes a trade-off between efficiency and consistency. The rule is that changes to a single document are always atomic, while updates to multiple documents should never be assumed to be atomic. There is also no way to "lock" a record on the server (you can build this into the client's logic using, for example, a "lock" field). When you design your schema, consider how you will keep your data consistent. Generally, the more that you keep in a document the better.

For what you are describing, I would embed the comments, and give each comment an id field with an ObjectID. The ObjectID has a timestamp embedded in it, so you can use that instead of createdAt if you like.
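The timestamp mentioned above sits in the first four bytes of an ObjectID, stored as seconds since the Unix epoch. The sketch below reads it from the 24-character hex form and looks up an embedded comment by its id. Plain hex strings are used here instead of a driver's ObjectID type so the example runs on its own; drivers usually expose the timestamp directly. The sample ids and helper names are made up for illustration.

```javascript
// Read the creation time from the hex form of an ObjectID:
// the first 8 hex characters are seconds since the Unix epoch.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected a 24-character hex string');
  }
  return new Date(parseInt(hex.slice(0, 8), 16) * 1000);
}

// Hypothetical question whose embedded comments each carry their own _id.
const question = {
  title: 'aaa',
  content: 'bbb',
  comments: [
    { _id: '507f1f77bcf86cd799439011', content: 'xxx' },
    { _id: '507f191e810c19729de860ea', content: 'yyy' }
  ]
};

// With an id on every comment, a specific one can be found unambiguously.
function findComment(doc, id) {
  return doc.comments.find(c => c._id === id);
}

const comment = findComment(question, '507f1f77bcf86cd799439011');
console.log(comment.content, objectIdTimestamp(comment._id).toISOString());
```

With per-comment ids, the "no _id to let me find one" concern from the question goes away while the comments stay embedded.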

短暂陪伴 2024-11-01 03:23:23

In general, embed is good if you have one-to-one or one-to-many relationships between entities, and reference is good if you have many-to-many relationships.

一页 2024-11-01 03:23:23

Actually, I'm quite curious why nobody spoke about the UML specifications. A rule of thumb is that if you have an aggregation, then you should use references. But if it is a composition, then the coupling is stronger, and you should use embedded documents.

And you will quickly understand why it is logical. If an object can exist independently of the parent, then you will want to access it even if the parent doesn't exist. As you just can't embed it in a non-existing parent, you have to make it live in its own data structure. And if a parent exists, just link them together by adding a ref of the object in the parent.

Not sure what the difference between the two relationships is?
Here is a link explaining them:
Aggregation vs Composition in UML
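A small in-memory sketch of that rule of thumb, under the assumption that comments are a composition (they live and die with their question) and users are an aggregation (they exist on their own and are only referenced). The `Map` objects stand in for collections, and the field name `authorId` is made up for the example.

```javascript
// Users are stored on their own and referenced by id (aggregation).
const users = new Map([
  ['u1', { _id: 'u1', name: 'alice' }]
]);

// Comments are embedded in their question (composition).
const questions = new Map([
  ['q1', {
    _id: 'q1',
    title: 'aaa',
    comments: [{ content: 'xxx', authorId: 'u1' }]
  }]
]);

// Deleting the parent removes its embedded comments with it,
// while the referenced user is untouched.
questions.delete('q1');

console.log(questions.size, users.has('u1'));
```

Had the comments lived in their own collection, deleting the question would have left them behind until something cleaned them up explicitly.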

书信已泛黄 2024-11-01 03:23:23

Well, I'm a bit late but still would like to share my way of schema creation.

I have schemas for everything that can be described by a word, as you would in classical OOP.

E.g.:

  • Comment
  • Account
  • User
  • Blogpost
  • ...

Every schema can be saved as a Document or Subdocument, so I declare this for each schema.

Document:

  • Can be used as a reference. (E.g. the user made a comment -> comment has a "made by" reference to user)
  • Is a "Root" in your application. (E.g. the blogpost -> there is a page about the blogpost)

Subdocument:

  • Can only be used once / is never a reference. (E.g. Comment is saved in the blogpost)
  • Is never a "Root" in your application. (The comment just shows up in the blogpost page but the page is still about the blogpost)