博客的 mongodb 架构设计

发布于 2024-10-20 20:56:38 字数 491 浏览 1 评论 0原文

您将如何为具有基于文档的数据库 (mongodb) 的类似博客的网站设计架构。该站点具有以下对象：用户、文章、评论。用户可以向文章添加评论。每个用户还可以为每个评论投票一次。

我希望能够有效地执行这些查询：
1. 获取文章 A、文章 A 的评论以及每个评论的投票数
2. 获取用户B对所有文章的所有评论
3.获取用户B投票的所有评论

我的第一个尝试是将文章和评论放在单独的集合中，并且评论可以包含投票的用户列表。这使得查询 1 和 2 变得简单。对于 3，我添加了投票收集，用于跟踪用户的投票。

有一些明显的缺点，例如重复用户投票数据，并且查询 1 将需要两次调用数据库。有更好的方法吗？

Article {
  "user_id"
}

Comment {
   "user_id",
   "article_id",
   [user_voted],
}

Vote {
    "user_id",
    "comment_id",
}

原文

How would you design the schema for a blog-like site with document-based databases (mongodb). The site has the following objects: User, Article, Comment. User can add Comments to Article. Each User can also vote exactly once per Comment.

I want to be able to do these queries efficiently:
1. get Article A, comments on Article A and # of votes per comments
2. get all comments by User B across all articles
3. get all comments User B voted for

My first attempt is to put articles and comments in separate collections and comment can contain a list of users that voted for it. This makes query 1 and 2 simple. And for 3, I added Vote collection which keep tracks of votes by users.

There's some obvious drawback such as duplicating user vote data and query 1 will take two calls to the database. Is there a better approach?

Article {
  "user_id"
}

Comment {
   "user_id",
   "article_id",
   [user_voted],
}

Vote {
    "user_id",
    "comment_id",
}

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

猫腻 2024-10-27 20:56:38

Article {
  "_id" : "A",
  "title" : "Hello World",
  "user_id" : 12345,
  "text" : 'My test article',

  "comments" : [
    { 'text' : 'blah', 'user_id' : 654321, 'votes' : [987654]},
    { 'text' : 'foo', 'user_id' : 987654, 'votes' : [12345, 654321] },
    ...
  ]
}

这里的基本前提是我将 Comments 嵌套在 Article 中。 投票仅适用于评论，因此它们已与每个评论一起存储为数组。在本例中，我刚刚存储了 user_id。如果您想存储更多信息（time_created 等），那么您可以对对象数组进行投票：

... 'votes' : [ { user_id : 987654, ts : 78946513 } ] ...

如何有效地执行查询：

获取文章 A、对文章 A 的评论以及每条评论的投票数

db.articles.find( { _id : 'A' } )

这只需一个查询即可获取所有内容。您可能需要执行一些客户端逻辑来计算每个评论的投票数，但这非常简单。

获取用户 B 对所有文章的所有评论

db.articles.ensureIndex( { "comments.user_id" : 1 } )
db.articles.find( { "comments.user_id" : 987654 } ) // returns all document fields

该索引将允许有效地搜索文档中的评论。

目前无法仅从子数组中提取匹配项。该查询实际上将返回带有该用户评论的所有文章。如果这可能是太多数据，您可以进行一些修剪。

db.articles.find( { "comments.user_id" : 987654 }, { "title" : 1, "comments.user_id" : 1 })

获取用户 B 投票的所有评论

db.articles.ensureIndex( { "comments.votes" : 1 } )
db.articles.find( { "comments.votes" : 987654 } )

再次强调，这将返回所有文章，而不仅仅是评论。

这里需要做出权衡。返回文章可能看起来我们带回了太多数据。但是，当您进行查询 #3 时，您打算向用户显示什么？

如果没有评论本身，获取“我投票的评论”列表并不是很有用。当然，如果没有文章本身（或者至少只有标题），评论就不是很有用。

大多数时候，查询 #3 会转化为从 Votes 到 Comments 再到 Articles 的联接。既然如此，那为什么不直接把文章带回来呢？

Article {
  "_id" : "A",
  "title" : "Hello World",
  "user_id" : 12345,
  "text" : 'My test article',

  "comments" : [
    { 'text' : 'blah', 'user_id' : 654321, 'votes' : [987654]},
    { 'text' : 'foo', 'user_id' : 987654, 'votes' : [12345, 654321] },
    ...
  ]
}

The basic premise here is that I've nested the Comments inside of the Article. The Votes only apply to a Comment, so they've been stored as an array with each Comment. In this case, I've just stored the user_id. If you want to store more information (time_created, etc.), then you can votes an array of objects:

... 'votes' : [ { user_id : 987654, ts : 78946513 } ] ...

How to perform your queries efficiently:

get Article A, comments on Article A and # of votes per comments

db.articles.find( { _id : 'A' } )

This gets everything with one query. You may have to do some client-side logic to count votes per comment, but this is pretty trivial.

get all comments by User B across all articles

db.articles.ensureIndex( { "comments.user_id" : 1 } )
db.articles.find( { "comments.user_id" : 987654 } ) // returns all document fields

The index will allow for efficiently searching the comments within a document.

There's currently no way to extract only the matches from a sub-array. This query will in fact return all of the articles with comments by that user. If this is potentially way too much data, you can do some trimming.

db.articles.find( { "comments.user_id" : 987654 }, { "title" : 1, "comments.user_id" : 1 })

get all comments User B voted for

db.articles.ensureIndex( { "comments.votes" : 1 } )
db.articles.find( { "comments.votes" : 987654 } )

Again, this will return all of the Articles, not just the comments.

There's a trade-off to be made here. Returning the article may seem like we're bringing back too much data. But what are you planning to display to the user when you make query #3?

Getting a list of "comments I've voted for" is not terribly useful without the comment itself. Of course the comment is not very useful without the article itself (or at least just the title).

Most of the time, query #3 devolves into a join from Votes to Comments to Articles. If that's the case, then why not just bring back the Articles to start with?

回复收藏 0 原文

~没有更多了~