How would you architect a blog using a document store (such as CouchDB, Redis, MongoDB, Riak, etc.)?
I'm slightly embarrassed to admit it, but I'm having trouble conceptualizing how to architect data in a non-relational world, especially given that most document/KV stores have slightly different features.
I'd like to learn from a concrete example, but I haven't been able to find anyone discussing how you would architect, for example, a blog using CouchDB/Redis/MongoDB/Riak/etc.
There are a number of questions which I think are important:
- Which bits of data should be denormalised (e.g. tags probably live with the document, but what about users)?
- How do you link between documents?
- What's the best way to create aggregate views, especially ones which require sorting (such as a blog index)?
Comments (3)
Ryan Bates made a screencast a couple of weeks ago about Mongoid, and he uses a blog application as the example: http://railscasts.com/episodes/238-mongoid. This might be a good place for you to get started.
Deciding which objects should be independent and which should be embedded in other objects is mostly a matter of balancing read cost against write cost. If a child object is independent, updating it means changing only one document, but reading the parent gives you only ids, and you need additional queries to fetch the data. If the child object is embedded, all the data is right there when you read the parent document, but changing it requires finding every document that holds a copy of that object.
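The trade-off can be sketched with plain Python dicts standing in for documents. This is only an illustration under assumptions: the collections (`users`, `posts_referenced`, `posts_embedded`) and the helper functions are hypothetical names, not the API of any of the stores mentioned.

```python
# Plain dicts stand in for collections in a hypothetical document store.
users = {"u1": {"name": "Alice"}}

# Option A: the author is an independent document, referenced by id.
posts_referenced = {
    "p1": {"title": "Hello", "author_id": "u1"},
    "p2": {"title": "Again", "author_id": "u1"},
}

# Option B: a copy of the author's data is embedded in every post.
posts_embedded = {
    "p1": {"title": "Hello", "author": {"id": "u1", "name": "Alice"}},
    "p2": {"title": "Again", "author": {"id": "u1", "name": "Alice"}},
}


def author_name_referenced(post_id):
    """Reading costs a second lookup to follow the stored id."""
    post = posts_referenced[post_id]
    return users[post["author_id"]]["name"]


def author_name_embedded(post_id):
    """Reading needs only the post document itself."""
    return posts_embedded[post_id]["author"]["name"]


def rename_referenced(user_id, new_name):
    """Updating touches exactly one document; returns the count touched."""
    users[user_id]["name"] = new_name
    return 1


def rename_embedded(user_id, new_name):
    """Updating must find and rewrite every post that holds a copy."""
    touched = 0
    for post in posts_embedded.values():
        if post["author"]["id"] == user_id:
            post["author"]["name"] = new_name
            touched += 1
    return touched
```

With two posts by the same author, `rename_referenced` rewrites one document while `rename_embedded` rewrites two; the gap grows with the number of copies.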
Linking between documents isn't much different from SQL: you store an ID which is used to find the appropriate record. The key difference is that instead of filtering the child table to find records by parent id, you typically keep a list of child ids in the parent document. For many-to-many relationships you would keep a list of ids on both sides rather than a table in the middle.
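A rough sketch of both kinds of link, again with dicts keyed by id standing in for collections. The document shapes (`comment_ids`, `category_ids`, `post_ids`) and the `categorize` helper are made up for illustration.

```python
# Dicts keyed by id stand in for collections in a hypothetical store.
posts = {
    "p1": {"title": "Intro to CouchDB", "comment_ids": ["c1", "c2"], "category_ids": []},
    "p2": {"title": "Redis tips", "comment_ids": [], "category_ids": []},
}
comments = {
    "c1": {"body": "Nice post"},
    "c2": {"body": "Thanks"},
}
categories = {
    "databases": {"post_ids": []},
    "tutorials": {"post_ids": []},
}


def comments_for(post_id):
    """One-to-many: follow the id list stored on the parent document."""
    return [comments[cid] for cid in posts[post_id]["comment_ids"]]


def categorize(post_id, category_id):
    """Many-to-many: record the link on both sides, with no join table."""
    if category_id not in posts[post_id]["category_ids"]:
        posts[post_id]["category_ids"].append(category_id)
    if post_id not in categories[category_id]["post_ids"]:
        categories[category_id]["post_ids"].append(post_id)


categorize("p1", "databases")
categorize("p1", "tutorials")
categorize("p2", "databases")
```

Note that `categorize` writes two documents per link. In a real store without multi-document transactions the two sides can drift apart if one write fails, so that is something to plan for.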
Query capabilities vary a lot between platforms, so there isn't one clear answer for how to approach this. As a general rule, however, you will usually set up views/indexes that are maintained as documents are written, rather than just storing the document and running ad-hoc queries later as you would with SQL.
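For the sorted blog index from the question, the write-time idea might look like the sketch below. It is not how any particular store implements views; `save_post`, `latest_titles`, and `blog_index` are hypothetical names, and it assumes publication dates are ISO-formatted strings so that string order matches date order.

```python
import bisect
from itertools import islice

posts = {}        # post id -> document
blog_index = []   # sorted list of (published, post_id), maintained on write


def save_post(post_id, title, published):
    """Store the document and update the index in the same write path."""
    old = posts.get(post_id)
    if old is not None:
        # Drop the stale index entry before inserting the new one.
        blog_index.remove((old["published"], post_id))
    posts[post_id] = {"title": title, "published": published}
    bisect.insort(blog_index, (published, post_id))


def latest_titles(limit):
    """Walk the index from the newest end; nothing is sorted at read time."""
    newest = islice(reversed(blog_index), limit)
    return [posts[post_id]["title"] for _, post_id in newest]


save_post("p1", "First post", "2010-11-01")
save_post("p2", "Second post", "2010-11-20")
save_post("p3", "Draft moved earlier", "2010-10-15")
```

Here `latest_titles(2)` reads the two newest entries directly from the index. The cost has moved to `save_post`, which is the point of the general rule above.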
First of all, I think you would want to remove Redis from the list, as it is a key-value store rather than a document store. Riak is also a key-value store, but it can act as a document store with a library like Ripple.
In brief, modelling an application with a document store means figuring out:
What data you would want to store (embed) inside another document. If a piece of data belongs to exactly one document, then it 'might' be a good option to embed it in that document.
{ article : { comments : [{ content: 'yada yada', timestamp: '20/11/2010' }] } }
Another caveat to consider is how large the embedded data will grow, because MongoDB caps the size of a single document, and embedded documents count toward the parent's limit (16MB in current releases; older releases had a smaller cap).
{ article: { tags: ['news','bar'] } }
{ user: { role_ids: [1,2,3]}}
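To make the two shapes above concrete, here is a small sketch that treats tags as embedded values and roles as references. The `roles` lookup table, the role names, and both helper functions are assumptions added for illustration; only the `tags` and `role_ids` fields come from the examples.

```python
# Hypothetical in-memory collections mirroring the shapes above.
articles = [
    {"id": 1, "title": "Launch", "tags": ["news", "bar"]},
    {"id": 2, "title": "Opinion", "tags": ["bar"]},
]
roles = {1: "admin", 2: "editor", 3: "author"}  # assumed role names
user = {"name": "sam", "role_ids": [1, 2, 3]}


def articles_tagged(tag):
    """Tags are embedded, so filtering needs only the article documents."""
    return [a["title"] for a in articles if tag in a["tags"]]


def role_names(user_doc):
    """Roles are referenced, so each id is resolved against the roles collection."""
    return [roles[rid] for rid in user_doc["role_ids"]]
```

Renaming a role here means editing one entry in `roles`, whereas renaming a tag would mean rewriting every article that carries it.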
This is a brief overview of modelling with a document store. Good luck.