When to use NoSQL instead of an RDBMS/ORM

Posted on 2024-09-15 06:39:10

What kinds of projects benefit from using a NoSQL database instead of an RDBMS wrapped by an ORM?

Examples:

A Stack Overflow-like site?
Social communities?
Forums?

余生共白头 2024-09-22 06:39:11

Your question is very general. NoSQL describes a collection of database techniques that are very different from each other. Roughly, there are:

Key-value stores (Redis, Riak)
Triplestores (AllegroGraph)
Column-family stores (Bigtable, Cassandra)
Document-oriented stores (CouchDB, MongoDB)
Graph databases (Neo4j)

A project can benefit from a document database during its development phase, because you won't have to design complex entity-relationship diagrams or write complex join queries. I've detailed other uses of document databases in this answer.
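To make the "no joins" point a bit more concrete, here is a rough sketch that uses a plain Python dict as a stand-in for a document. The field names (title, tags, answers) are made up for illustration and do not come from any particular database's API.

```python
import json

# One question stored as a single self-contained document.
# In a relational schema this would typically be spread over
# questions, answers and tags tables and reassembled with joins.
question = {
    "_id": "q1",  # hypothetical identifier field
    "title": "When to use NoSQL instead of an RDBMS/ORM",
    "tags": ["nosql", "orm"],
    "answers": [
        {"author": "alice", "text": "It depends on the data model."},
        {"author": "bob", "text": "Consider how you need to scale."},
    ],
}

# The whole page can be rendered from one lookup, with no join.
answer_authors = [answer["author"] for answer in question["answers"]]
print(answer_authors)

# Documents are usually exchanged as JSON, so a round trip is shown here.
restored = json.loads(json.dumps(question))
print(restored == question)
```

The trade-off is that data shared between documents (for example a user's display name) has to be duplicated or looked up separately.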

If your application needs to handle very large amounts of data, the development phase will likely be longer when you use a specialized NoSQL solution such as Cassandra. However, when your application goes into production, it will greatly benefit from the performance and scalability of Cassandra.

Very generally speaking, if an application has the following requirements:

scale horizontally
work with data model X
perform Y operations

the application will benefit from using a NoSQL solution that is geared towards storing data model X and performing Y operations on the data. If you need more specific answers regarding a certain type of NoSQL database, you'll need to update your question.

Benefits during development (e.g. easier to use than SQL, no licensing costs)?
Benefits in terms of performance (e.g. runs like hell with a million concurrent users)?
What type of NoSQL database?

Update

Key-value stores can only be queried by key in most cases. They're useful to store simple data, such as user sessions, simple profile data or precomputed values and output. Although it is possible to store more complex data in key-value pairs, it burdens the application with the responsibility of maintaining 'manual' indexes in order to perform more advanced queries.
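As a minimal sketch of what maintaining a 'manual' index can look like, the example below uses a plain Python dict as a stand-in for a key-value store. The key naming scheme ("user:<id>", "index:city:<name>") and the helper functions are assumptions made for illustration, not the API of Redis or Riak.

```python
# In-memory stand-in for a key-value store (hypothetical; a real store
# would be reached through its client library).
store = {}

def save_user(user_id, profile):
    """Write the profile under its key and keep a city index up to date."""
    old = store.get(f"user:{user_id}")
    if old is not None:
        # Remove the stale index entry before writing the new one.
        store[f"index:city:{old['city']}"].discard(user_id)
    store[f"user:{user_id}"] = profile
    store.setdefault(f"index:city:{profile['city']}", set()).add(user_id)

def users_in_city(city):
    """Answer a non-key query by reading the hand-maintained index."""
    return sorted(store.get(f"index:city:{city}", set()))

save_user(1, {"name": "Ann", "city": "Oslo"})
save_user(2, {"name": "Bob", "city": "Oslo"})
save_user(2, {"name": "Bob", "city": "Rome"})  # Bob moves; the index must follow

print(users_in_city("Oslo"))
print(users_in_city("Rome"))
```

Every write path has to remember to update the index, which is exactly the burden on the application described above.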

Triplestores are for storing Resource Description Framework (RDF) metadata. I don't know anything about these stores, except for what Wikipedia tells me, so you'll have to do some research on that.

Column-family stores are built for storing and processing very large amounts of data. They are used by Google's search engine and Facebook's inbox search. The data is queried by MapReduce functions. Although MapReduce functions may be hard to grasp in the beginning, the concept is quite simple. Here's an analogy which (hopefully) explains the concept:

Imagine you have multiple shoe-boxes filled with receipts, and you want to calculate your total expenses. You invite some of your friends over and assign a person to each shoe-box. Each person writes down the total of each receipt in his shoe-box. This process of selecting the required data is the Map part.

When a person has written down the totals of (some of) his receipts, he can sum up these totals. This is the Reduce part and can be repeated multiple times until all receipts have been handled. In the end, all of your friends come together and sum up their total sums, giving you your total expenses. That's the final Reduce step.

The advantage of this approach is that you can have any number of shoe-boxes and you can assign any number of people to a shoe-box and still end up with the same result. Each shoe-box can be seen as a server in the database's network. Each friend can be seen as a thread on the server. With MapReduce you can have your data distributed across many servers and have each server handle part of the query, optimizing the performance of your database.
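The shoe-box analogy can be sketched in a few lines of Python. This is only an illustration of the map and reduce steps on one machine, not how any particular column-family store runs its queries; the receipt data and function names are made up, and amounts are kept in whole cents to avoid floating-point rounding.

```python
from functools import reduce

# Each inner list is one shoe-box of receipts (made-up data).
shoe_boxes = [
    [{"shop": "bakery", "total_cents": 1250}, {"shop": "kiosk", "total_cents": 320}],
    [{"shop": "garage", "total_cents": 4000}],
    [{"shop": "market", "total_cents": 780}, {"shop": "cinema", "total_cents": 1500}],
]

def map_box(box):
    """Map: select the one value we need (the total) from every receipt."""
    return [receipt["total_cents"] for receipt in box]

def reduce_amounts(amounts):
    """Reduce: fold a list of amounts into a single sum."""
    return reduce(lambda a, b: a + b, amounts, 0)

# Each friend handles one box independently and produces a partial sum.
partial_sums = [reduce_amounts(map_box(box)) for box in shoe_boxes]

# The final reduce step combines the partial sums.
total = reduce_amounts(partial_sums)
print(partial_sums, total)

# Regrouping the receipts into a single box gives the same total.
one_big_box = [receipt for box in shoe_boxes for receipt in box]
print(reduce_amounts(map_box(one_big_box)) == total)
```

Because the reduce step is a sum, the partial results can be combined in any grouping, which is what lets the work be spread across servers.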

Document-oriented stores are explained in this question, so I won't discuss them here.

Graph databases are for storing networks of highly connected objects, such as the users of a social network. These databases are optimized for graph operations, such as finding the shortest path between two nodes, or finding all nodes within three hops of the current node. Such operations are quite expensive on an RDBMS or on other NoSQL databases, but very cheap on graph databases.
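For a feel of the "within three hops" operation, here is a small breadth-first search over an adjacency list in plain Python. It is a sketch of the operation itself, not of how Neo4j or any other graph database implements or exposes it; the friends graph and the within_hops helper are invented for the example.

```python
from collections import deque

def within_hops(graph, start, max_hops):
    """Return all nodes reachable from start in at most max_hops edges."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return {node for node in distance if node != start}

# A made-up friendship chain: ann - bob - cat - dan - eve
friends = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat", "eve"],
    "eve": ["dan"],
}

print(sorted(within_hops(friends, "ann", 3)))
```

In a relational schema each hop would typically be another self-join on a friendships table, which is where the cost mentioned above comes from.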
