What type of NoSQL database is best suited to store hierarchical data?
Say for example I want to store posts of a forum with a tree structure:
original post
 + re: original post
 + re: original post
   + re2: original post
     + re3: original post
   + re2: original post
MongoDB and CouchDB offer solutions, but not built-in functionality. See this SO question on representing hierarchy in a relational database, as most other NoSQL solutions I've seen are similar in this regard: you have to write your own algorithms for recalculating that information as nodes are added, deleted and moved. Generally speaking, you're choosing between fast read times (e.g. nested set) and fast write times (adjacency list). See the aforementioned SO question for more options along these lines; the flat table approach appears most aligned with your question.
One standard that does abstract away these considerations is the Java Content Repository (JCR); Apache Jackrabbit and JBoss eXo are both implementations. Note that behind the scenes both are still doing some sort of algorithmic calculation to maintain the hierarchy as described above. In addition, JCR also handles permissions, file storage, and several other aspects, so it may be overkill for your project.
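To make the read/write trade-off above concrete, here is a minimal sketch in plain Python (no database involved). The post ids, field names and helper functions are all made up for illustration. It stores the forum thread as an adjacency list, then derives nested-set bounds from it, so you can see why one is cheap to write and the other is cheap to read.

```python
from collections import defaultdict

# Adjacency list: each post stores only its parent's id (None for the root).
# Ids and field names are illustrative assumptions, not from the question.
posts = [
    {"id": 1, "parent_id": None, "title": "original post"},
    {"id": 2, "parent_id": 1, "title": "re: original post"},
    {"id": 3, "parent_id": 1, "title": "re: original post"},
    {"id": 4, "parent_id": 3, "title": "re2: original post"},
    {"id": 5, "parent_id": 4, "title": "re3: original post"},
    {"id": 6, "parent_id": 3, "title": "re2: original post"},
]


def children_index(rows):
    """Group post ids by parent id so child lookups are a dict access."""
    index = defaultdict(list)
    for row in rows:
        index[row["parent_id"]].append(row["id"])
    return index


def subtree_ids(rows, root_id):
    """Adjacency-list read: an explicit depth-first walk from root_id."""
    index = children_index(rows)
    result, stack = [], [root_id]
    while stack:
        node = stack.pop()
        result.append(node)
        # Reverse so children are visited in their stored order.
        stack.extend(reversed(index[node]))
    return result


def nested_set_bounds(rows, root_id):
    """Assign nested-set (left, right) numbers in one depth-first pass.

    With nested sets, these numbers have to be recomputed (at least for
    part of the tree) whenever a node is added, deleted or moved.
    """
    index = children_index(rows)
    bounds, counter = {}, 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter
        for child in index[node]:
            visit(child)
        counter += 1
        bounds[node] = (left, counter)

    visit(root_id)
    return bounds


def nested_set_subtree(bounds, root_id):
    """Nested-set read: a plain range comparison per row, no tree walk."""
    left, right = bounds[root_id]
    return sorted(n for n, (l, r) in bounds.items() if left <= l and r <= right)
```

With an adjacency list a new reply is a single appended row, but reading a subtree means walking it. With nested sets the subtree read is one range filter, but the bounds need renumbering on writes.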
What you possibly need is a document-oriented database like MongoDB or CouchDB.
See examples of different techniques which allow you to store hierarchical data in MongoDB:
http://www.mongodb.org/display/DOCS/Trees+in+MongoDB
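As a rough illustration of one of those techniques (storing an array of ancestors on every document), here is a sketch that uses plain Python dicts as stand-ins for documents. It does not talk to MongoDB at all. The `posts` collection name, the `_id`, `parent` and `ancestors` fields and the helper functions are assumptions chosen for this example.

```python
# Each dict stands in for one document in a hypothetical "posts" collection.
# Every document carries its parent and the full list of its ancestors.
posts = [
    {"_id": "p1", "parent": None, "ancestors": []},
    {"_id": "p2", "parent": "p1", "ancestors": ["p1"]},
    {"_id": "p3", "parent": "p1", "ancestors": ["p1"]},
    {"_id": "p4", "parent": "p3", "ancestors": ["p1", "p3"]},
    {"_id": "p5", "parent": "p4", "ancestors": ["p1", "p3", "p4"]},
    {"_id": "p6", "parent": "p3", "ancestors": ["p1", "p3"]},
]


def descendants(docs, post_id):
    """All posts below post_id: a membership test on the ancestors array.

    In a document store this would be a single filter on the array field
    rather than a Python loop.
    """
    return [d["_id"] for d in docs if post_id in d["ancestors"]]


def new_reply(docs, parent_id, reply_id):
    """Insert a reply by copying the parent's ancestors and appending the parent."""
    parent = next(d for d in docs if d["_id"] == parent_id)
    reply = {
        "_id": reply_id,
        "parent": parent_id,
        "ancestors": parent["ancestors"] + [parent_id],
    }
    docs.append(reply)
    return reply


def depth(doc):
    """Nesting level of a post, useful for indenting a rendered thread."""
    return len(doc["ancestors"])
```

Replies are cheap to insert because only the new document is written. Moving a whole subtree is the expensive case, since every descendant's `ancestors` array has to be rewritten.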
The most common one is IBM's IMS. There is also the Caché database.
See this question posted on the dba section of Stack Exchange.
LDAP, obviously. OpenLDAP would make short work of it.
Faced with the same issue, I decided to create my own (very simple) solution using Lua + Redis https://github.com/qbolec/Redis-Tree/
In mathematics, and more specifically in graph theory, a tree is an undirected graph in which any two vertices are connected by exactly one path. So any graph DB will do the job for sure. BTW, a simple graph like a tree can easily be mapped to any relational or non-relational DB. To store hierarchical data in a relational DB, take a look at this awesome presentation by Bill Karwin. There are also ORMs with facilities to store trees. For example, TypeORM supports the adjacency list and closure table patterns for storing hierarchical structures.
The king of non-relational DBs [IMHO] is MongoDB. Check out its documentation to find out how it stores trees. Trees are the most common kind of graph and they are used everywhere. Any well-established DB solution should have a way to deal with trees.
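For readers who haven't met the closure table pattern mentioned above, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. This is not how TypeORM implements it. The table names, column names and helper functions are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- One row per (ancestor, descendant) pair, including each post
    -- paired with itself at depth 0.
    CREATE TABLE post_closure (
        ancestor   INTEGER NOT NULL REFERENCES post(id),
        descendant INTEGER NOT NULL REFERENCES post(id),
        depth      INTEGER NOT NULL,
        PRIMARY KEY (ancestor, descendant)
    );
""")


def add_post(post_id, title, parent_id=None):
    """Insert a post and all closure rows that link it to its ancestors."""
    conn.execute("INSERT INTO post (id, title) VALUES (?, ?)", (post_id, title))
    # Self-reference row at depth 0.
    conn.execute("INSERT INTO post_closure VALUES (?, ?, 0)", (post_id, post_id))
    if parent_id is not None:
        # Copy every path that ends at the parent, extended by one step.
        conn.execute(
            "INSERT INTO post_closure (ancestor, descendant, depth) "
            "SELECT ancestor, ?, depth + 1 FROM post_closure WHERE descendant = ?",
            (post_id, parent_id),
        )


def descendants(post_id):
    """All replies below post_id, nearest first, without any recursion."""
    rows = conn.execute(
        "SELECT descendant FROM post_closure "
        "WHERE ancestor = ? AND depth > 0 ORDER BY depth, descendant",
        (post_id,),
    )
    return [row[0] for row in rows]


# The forum thread from the question, with made-up ids.
add_post(1, "original post")
add_post(2, "re: original post", parent_id=1)
add_post(3, "re: original post", parent_id=1)
add_post(4, "re2: original post", parent_id=3)
add_post(5, "re3: original post", parent_id=4)
add_post(6, "re2: original post", parent_id=3)
```

The cost is extra rows: a post at depth d adds d + 1 closure rows. In exchange, subtree and ancestor queries become simple indexed lookups.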
eXist-db implements a hierarchical data model for XML persistence.
Graph databases would probably also solve this problem. If Neo4j is not enough for you in terms of scaling, consider Titan, which is based on various storage back-ends including HBase and should scale very well. It is not as mature as Neo4j, but it is a very promising project.
Just spent the weekend at a training course using a MUMPS DB as the back-end for a full-stack JavaScript browser application development framework. Great stuff! I'd recommend the GT.M distro of MUMPS under GPL. Or try http://sourceforge.net/projects/mumps/?source=recommended for vanilla MUMPS. Check out http://robtweed.wordpress.com/ for the ewd.js JS framework and more info on MUMPS.
Here's a non-answer for you: SQL Server 2008! It's great for recursive queries. Or you can go the old-fashioned route and store hierarchy data in a separate table to avoid recursion.
I think relational databases lend themselves very well to tree data, both in query performance and ease of use. With one caveat: you will be inserting into an indexed table, and probably several other indexed tables, every time someone makes a post. Insert performance could be an issue on a Facebook-caliber forum.
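To show what a recursive query over a thread can look like, here is a sketch that swaps in SQLite (through Python's built-in sqlite3 module) for SQL Server so that it runs anywhere. SQLite spells the recursive common table expression `WITH RECURSIVE`, while SQL Server uses a plain `WITH`. The schema, ids and helper names are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES post(id),
        title     TEXT NOT NULL
    );
    -- The index that makes child lookups fast is also what every insert pays for.
    CREATE INDEX post_parent_idx ON post(parent_id);
    INSERT INTO post (id, parent_id, title) VALUES
        (1, NULL, 'original post'),
        (2, 1,    're: original post'),
        (3, 1,    're: original post'),
        (4, 3,    're2: original post'),
        (5, 4,    're3: original post'),
        (6, 3,    're2: original post');
""")

# Walk down from one post, tracking depth and a zero-padded path so the
# rows can be sorted into display order (ids above 9999 would need wider padding).
THREAD_QUERY = """
    WITH RECURSIVE thread(id, title, depth, sort_path) AS (
        SELECT id, title, 0, printf('%04d', id)
        FROM post
        WHERE id = ?
        UNION ALL
        SELECT p.id, p.title, t.depth + 1,
               t.sort_path || '/' || printf('%04d', p.id)
        FROM post AS p
        JOIN thread AS t ON p.parent_id = t.id
    )
    SELECT id, title, depth FROM thread ORDER BY sort_path
"""


def thread(root_id):
    """Return (id, title, depth) rows for root_id and everything below it."""
    return conn.execute(THREAD_QUERY, (root_id,)).fetchall()


def render(rows):
    """Indent each title by two spaces per level of nesting."""
    return "\n".join("  " * depth + title for _id, title, depth in rows)
```

Calling `print(render(thread(1)))` prints the whole thread with replies indented under the post they answer.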
Check out MarkLogic. You can download a demo copy from the website. It is a database for unstructured data and falls under the NoSQL classification of databases. I know unstructured data is a pretty loaded term, but just think of it as data that does not fit well in the rows and columns of an RDBMS (like hierarchical data).
A NoSQL storage service with native support for hierarchical data is Amazon Web Services' Simple Storage Service (AWS S3). The path-based keys are hierarchical by nature, and the blob values may be typed using attributes (MIME type, e.g. application/json, text/csv, etc.). Advantages of S3 include versioning and the ability to scale to extremely large overall capacity and nearly infinite concurrent writes. Disadvantages include no support for conditional writes (optimistic concurrency), consistent reads only for read-after-write, and no support for references/relationships. It is also purely usage-based, so wide variations in demand do not require complex scaling infrastructure or over-provisioned capacity.
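To illustrate what "path-based keys are hierarchical by nature" means in practice, here is a small in-memory sketch. It does not call S3 or any AWS SDK. A plain dict stands in for a bucket, and `list_level` imitates the idea of listing keys by prefix and delimiter. The key layout and function names are assumptions invented for this example.

```python
# A dict standing in for a bucket: key -> blob. The key layout
# (threads/<post id>/.../post.json) is a made-up convention.
store = {
    "threads/1/post.json": '{"title": "original post"}',
    "threads/1/2/post.json": '{"title": "re: original post"}',
    "threads/1/3/post.json": '{"title": "re: original post"}',
    "threads/1/3/4/post.json": '{"title": "re2: original post"}',
    "threads/1/3/4/5/post.json": '{"title": "re3: original post"}',
    "threads/1/3/6/post.json": '{"title": "re2: original post"}',
}


def list_level(objects, prefix, delimiter="/"):
    """List one level of the hierarchy under prefix.

    Returns (keys directly under prefix, child "folders" under prefix),
    mimicking a prefix-plus-delimiter listing.
    """
    keys, common_prefixes = [], set()
    for key in sorted(objects):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Collapse everything below the next delimiter into one entry.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            keys.append(key)
    return keys, sorted(common_prefixes)


def subtree(objects, prefix):
    """Every key at or below prefix: a pure prefix match, no tree walk."""
    return sorted(k for k in objects if k.startswith(prefix))
```

Here `list_level(store, "threads/1/3/")` returns the post itself plus the two reply "folders" beneath it. Note that moving a subtree means rewriting every key under the old prefix, since the hierarchy lives in the key names.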
ClickHouse has explicit support for hierarchical data.