How do I decide which NoSQL technology to use?

Posted on 2024-09-19 20:58:41


Comments (6)

野味少女 2024-09-26 20:58:41

MongoDB

Scalability: Highly available and consistent, but handles relations and many distributed writes poorly. Its primary benefit is storing and indexing schemaless documents. Document size was capped at 4 MB when this was written (current versions allow 16 MB), and indexing only makes sense for limited depth. See http://www.paperplanes.de/2010/2/25/notes_on_mongodb.html

Best suited for: Tree structures with limited depth

Use Cases: Diverse Type Hierarchies, Biological Systematics, Library Catalogs
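
To make "tree structures with limited depth" concrete, here is a minimal sketch in plain Python of a catalog entry held as one nested, JSON-like document. No MongoDB driver is involved, and the field names and the `depth` helper are hypothetical, chosen only for illustration.

```python
# A catalog entry modeled as one nested document (a plain dict, JSON-like).
# Field names are invented for illustration; no MongoDB API is used here.
catalog_entry = {
    "title": "Example title",
    "classification": {
        "kingdom": "Animalia",
        "class": {"name": "Mammalia", "order": {"name": "Primates"}},
    },
    "copies": [{"branch": "Main", "available": True}],
}


def depth(value):
    """Return the nesting depth of dicts/lists; scalars have depth 0."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0


print(depth(catalog_entry))
```

A shallow, bounded hierarchy like this fits in a single document; once the nesting grows without limit, a graph or relational model is probably the better fit.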

Neo4j

Scalability: Highly available but not distributed. Powerful traversal framework for high-speed traversals of the node space. Limited to graphs of around several billion nodes/relationships. See http://highscalability.com/neo4j-graph-database-kicks-buttox

Best suited for: Deep graphs with unlimited depth and cyclical, weighted connections

Use Cases: Social Networks, Topological analysis, Semantic Web Data, Inferencing
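
As a rough illustration of what traversing "cyclical, weighted connections" involves, the sketch below runs Dijkstra's algorithm over a small in-memory graph that contains a cycle. It is plain Python, not Neo4j's traversal framework, and the graph and the `cheapest_costs` name are assumptions made up for this example.

```python
import heapq

# Hypothetical weighted graph with a cycle (a -> b -> c -> a).
graph = {
    "a": {"b": 1.0, "d": 5.0},
    "b": {"c": 1.0},
    "c": {"a": 1.0, "d": 1.0},
    "d": {},
}


def cheapest_costs(graph, start):
    """Dijkstra's algorithm: cheapest cost from start to each reachable node."""
    costs = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > costs.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < costs.get(neighbor, float("inf")):
                costs[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return costs


print(cheapest_costs(graph, "a"))
```

The cycle does no harm because a node is only re-queued when a strictly cheaper path to it is found.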

HBase

Scalability: Reliable, consistent storage in the petabytes and beyond. Supports very large numbers of objects with a limited set of sparse attributes. Works in tandem with Hadoop for large data processing jobs. See http://www.ibm.com/developerworks/opensource/library/os-hbase/index.html

Best suited for: Directed acyclic graphs

Use Cases: Log analysis, Semantic Web Data, Machine Learning

意犹 2024-09-26 20:58:41

I know this might seem like an odd place to point to, but Heroku has recently gone nuts with their NoSQL offerings and has an OK overview of many of the current projects. It is in no way a Slideshare press, but it will help you start the comparison process:

http://blog.heroku.com/archives/2010/7/20/nosql/?utm_medium=email&utm_source=EmailBlast&utm_content=619506254&utm_campaign=HerokuSeptemberNewsletter-VersionB&utm_term=NoSQLHerokuandYou

心如狂蝶 2024-09-26 20:58:41

Check out this at-a-glance comparison of NoSQL databases:

http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

高冷爸爸 2024-09-26 20:58:41

MongoDB:

MongoDB is a document database, unlike a relational database. Its documents store semi-structured data as JSON-like objects (schema-free).

Key features:

  1. The schema can change as the application evolves
  2. Full indexing
  3. Load balancing and data sharding
  4. Data replication
  5. Consistency and partition tolerance (CP) under the CAP theorem (Consistency, Availability, Partition tolerance)

When to use:

  1. Real time analytics
  2. High speed logging
  3. Semi structured data management

When not to use:

  1. Highly transactional applications that need strong ACID properties (Atomicity, Consistency, Isolation, and Durability). An RDBMS is preferred in this use case.
  2. Operating on data sets involving relations (foreign keys, etc.)

HBase:

HBase is an open-source, non-relational, distributed column-family database.

Key features:

  1. It provides a fault-tolerant way of storing large quantities of sparse data (small amounts of information caught within a large collection of empty or unimportant data, such as finding the 50 largest items in a group of 2 billion records, or finding the non-zero items representing less than 0.1% of a huge collection)
  2. Supports variable schema where each row is different
  3. Can serve as the input and output for MapReduce jobs
  4. Compression, in-memory operation, and Bloom filters on a per-column basis (a Bloom filter is a data structure designed to tell you, rapidly and memory-efficiently, whether an element is present in a set)
  5. Achieves CP in CAP terms
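
The Bloom filter mentioned in the list above can be sketched in a few lines. This is a minimal, illustrative version in plain Python, not HBase's implementation; the class name, the default sizes, and the use of salted SHA-256 as the hash family are all assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one SHA-256 hash.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


bf = BloomFilter()
bf.add("row-42")
print(bf.might_contain("row-42"))  # True: added items are always reported
```

A "no" from `might_contain` is definitive, so a store can skip reading a file that cannot hold the key; a "yes" still needs a real lookup because it may be a false positive.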

When to use HBase:

  1. If you are loading data by key, searching data by key (or key range), serving data by key, or querying data by key
  2. Storing row-oriented data that doesn't conform well to a fixed schema (variable schema)
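
The access pattern described above (lookup by key, scan by key range, rows with differing columns) can be sketched with a toy in-memory store. It assumes, as HBase does, that rows are kept sorted by row key; the `SortedKeyStore` class and the key layout are hypothetical and this is not the HBase client API.

```python
import bisect


class SortedKeyStore:
    """Toy store that keeps row keys sorted, so range scans are cheap."""

    def __init__(self):
        self._keys = []  # row keys in sorted order
        self._rows = {}  # row key -> dict of columns (may differ per row)

    def put(self, key, row):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = row

    def get(self, key):
        return self._rows.get(key)

    def scan(self, start, stop):
        """Return (key, row) pairs with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


store = SortedKeyStore()
store.put("user#002", {"name": "b"})
store.put("user#001", {"name": "a", "email": "a@example.com"})
store.put("video#001", {"title": "t"})
print(store.scan("user#", "user#~"))
```

Because the keys are sorted, a prefix scan touches only the matching slice, which is why row-key design matters so much more than it does in a store queried by arbitrary columns.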

When not to use HBase:

  1. For relational analytics
  2. Full table scans
  3. Data that must be aggregated or analyzed by rows instead of columns

Neo4j:

Neo4j is a graph database that uses the property graph data model (data is stored as nodes and relationships, both of which can carry properties).
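
For readers new to the property graph model, here is a minimal sketch of the idea in plain Python: nodes and relationships are both first-class records that carry their own properties. The class names, fields, and sample data are hypothetical; this is not Neo4j's API or storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: int  # node_id of the source node
    end: int  # node_id of the target node
    rel_type: str
    properties: dict = field(default_factory=dict)


nodes = {
    1: Node(1, {"Person"}, {"name": "Ann"}),
    2: Node(2, {"Person"}, {"name": "Bob"}),
    3: Node(3, {"Person"}, {"name": "Cy"}),
}
relationships = [
    Relationship(1, 2, "KNOWS", {"since": 2015}),
    Relationship(2, 3, "KNOWS", {"since": 2020}),
]


def neighbors(node_id, rel_type):
    """Names of nodes reached by following outgoing rel_type relationships."""
    return [nodes[r.end].properties["name"] for r in relationships
            if r.start == node_id and r.rel_type == rel_type]


print(neighbors(1, "KNOWS"))  # ['Bob']
```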

Key features:

  1. Supports full ACID (Atomicity, Consistency, Isolation, and Durability) rules
  2. Supports indexes by using Apache Lucene
  3. Schema-free, bottom-up data model design
  4. Achieves high scalability through compact storage and in-memory caching of graphs

When to use:

  1. Master data management
  2. Network and IT Operations
  3. Real time recommendations
  4. Fraud detection
  5. Social networks (like Facebook)

When not to use:

  1. Bulk queries/Scans
  2. If your application requires Partitioning & Sharding of data

Have a look at the comparison of various NoSQL technologies in this article

Sources:

Wiki, SlideShare, Cloudera, Tutorials Point, Neo4j

清风挽心 2024-09-26 20:58:41

You could also evaluate a multi-model DBMS, the second generation of NoSQL products. With a multi-model database you avoid the compromises of committing to just one model, because you can use more than one.

The first multi-model NoSQL is OrientDB.

寄离 2024-09-26 20:58:41

Pretty decent article here on MongoDB and NoRM (.NET extensions for MongoDB):
http://lukencode.com/2010/07/09/getting-started-with-mongodb-and-norm/
