How do these database management systems actually behave during a network partition?

Posted 2025-01-26 08:12:43


I am looking into deploying a database management system, replicated across regions (various data centers across a country). I am currently looking into the following candidates:

  • MongoDB (NoSQL, CP system)
  • Cockroach (SQL, CP system)
  • Cassandra (NoSQL, AP system)

How do those three behave during a network partition between nodes? Let's assume we deploy all of them in 3-node clusters.

What happens if the 2 secondary/follower nodes become separated from their leader during a network failure?

Will MongoDB and Cockroach block reads during a network partition? If so, for the entire duration of the partition or only during leader election (Cockroach)?

Will Cassandra allow reads during a network partition?


Comments (1)

哽咽笑 2025-02-02 08:12:43


The answer for all three is, in theory, the same: it's up to the application making the read request. You can choose either availability (the read succeeds but could be out of date) or consistency (the read generally fails). The details vary among the three, as does the degree to which each database actually honors the guarantees it makes.

Cassandra

Cassandra in theory: Cassandra reads and writes specify how many nodes must acknowledge the request for it to be considered successful. This lets you tune consistency, availability, and throughput requirements per workload. For strong consistency in an N-node cluster, you can require a total of N+1 acks across reads and writes combined (R + W > N). In your 3-node example, you could require all 3 nodes to ack a write and only 1 to ack a read. In that case writes can't be accepted during any network partition, so reads can proceed without sacrificing consistency. Or you could require 3 nodes for a read and only 1 for a write, reversing which side stays available. More commonly, applications require a majority for both reads and writes: 2 nodes each in this case. That means both reads and writes can fail during a network partition, but it maximizes overall performance. It's also common to require just 1 ack for all queries and live with some inconsistency.
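Here's roughly what that tuning looks like with the DataStax Python driver; a minimal sketch, where the contact points, keyspace, and users table are placeholders rather than anything from the question:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    # Placeholder contact points for the 3-node cluster.
    cluster = Cluster(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    session = cluster.connect("app")  # hypothetical keyspace

    # Majority on both sides: 2-of-3 acks for the write and the read,
    # satisfying R + W > N (2 + 2 > 3).
    write = SimpleStatement(
        "INSERT INTO users (id, name) VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.QUORUM,
    )
    session.execute(write, (42, "alice"))

    read = SimpleStatement(
        "SELECT name FROM users WHERE id = %s",
        consistency_level=ConsistencyLevel.QUORUM,
    )
    row = session.execute(read, (42,)).one()

    # The other trade-offs from above: ConsistencyLevel.ALL on writes with
    # ConsistencyLevel.ONE on reads (3 + 1 > 3), or ONE everywhere if some
    # inconsistency is acceptable.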

Cassandra in practice: You're going to have to live with some inconsistency regardless. Cassandra generally doesn't pass the Jepsen test suite's checks for inconsistent writes; under heavy load and a network partition, you're likely to end up with some corrupted data even when you requested otherwise.

MongoDB

MongoDB in theory: MongoDB has a primary node and secondary nodes. If you enable secondary reads, you get data that could be out of date. If you don't, reads go only to the primary node, so if you're cut off from it, some reads will fail until MongoDB recovers.
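With PyMongo, for example, that choice is the read preference. A minimal sketch, assuming placeholder hosts, a hypothetical app database, and a users collection:

    from pymongo import MongoClient, ReadPreference

    # Placeholder hosts for the 3-member replica set.
    client = MongoClient("mongodb://10.0.0.1,10.0.0.2,10.0.0.3/?replicaSet=rs0")

    # Default behavior: reads go to the primary and fail while it is unreachable.
    consistent = client.get_database("app", read_preference=ReadPreference.PRIMARY)

    # Opt in to possibly-stale reads that a partitioned secondary can still serve.
    available = client.get_database(
        "app", read_preference=ReadPreference.SECONDARY_PREFERRED
    )
    doc = available.users.find_one({"_id": 42})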

MongoDB in practice: Historically, MongoDB has not done well when its consistency is tested: its earlier versions used a replication protocol that is considered fundamentally flawed, leading to stale and dirty reads even when requesting full consistency. As of 2017, they tentatively seemed to have fixed those issues with a new protocol. Of these three, Mongo is the one I haven't worked with directly, so I'll leave it at that.

CockroachDB

CockroachDB in theory: By default, CockroachDB chooses consistency. If you're lucky, some reads in the first 9 seconds of a network partition will hit the node that acquired a 9-second lease on all the data needed to serve the request. As long as the nodes can't establish a quorum, they can't create new leases, so eventually all reads start failing as no one node can be confident that the other two nodes aren't accepting new writes. However, Cockroach allows "bounded staleness reads" that can be served without a lease. Queries of the form SELECT code FROM promo_codes AS OF SYSTEM TIME with_max_staleness('10s') will continue to succeed for 10-19 seconds into a network partition.
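One way to sketch that fallback in application code, using psycopg2 with a placeholder connection string (the promo_codes query is the one above; note the consistent read may block until a timeout rather than fail fast, hence the explicit statement_timeout):

    import psycopg2

    # Placeholder DSN; CockroachDB speaks the PostgreSQL wire protocol.
    conn = psycopg2.connect("postgresql://root@10.0.0.1:26257/defaultdb")
    conn.autocommit = True  # bounded staleness reads need single-statement transactions

    def read_codes(cur):
        cur.execute("SET statement_timeout = '2s'")  # fail fast instead of hanging
        try:
            # Consistent read: requires a valid lease, so it eventually fails
            # on the minority side of a partition.
            cur.execute("SELECT code FROM promo_codes")
        except psycopg2.OperationalError:
            # Fallback: served locally without a lease, returning data up to
            # 10 seconds stale.
            cur.execute(
                "SELECT code FROM promo_codes"
                " AS OF SYSTEM TIME with_max_staleness('10s')"
            )
        return [row[0] for row in cur.fetchall()]

    with conn.cursor() as cur:
        print(read_codes(cur))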

CockroachDB in practice: CockroachDB brought in Aphyr, the researcher whose Jepsen analyses I linked above, early in its development process. It now runs nightly Jepsen tests simulating a network partition under load and verifying consistency, so it's unlikely to violate its consistency guarantee in that particular way.

Summary

All three databases make an effort to support choosing either consistency or availability. Reads in "consistent mode" will start failing during a network partition until a majority of nodes reestablish communication with each other. Reads in "availability mode" will be less likely to fail during a network partition, but there's a risk you're reading from one isolated node while the other two have reestablished communication with each other and started accepting new writes. Of the three databases, Cassandra has the most flexibility for specifying this behavior per-query, while CockroachDB has the most reliable guarantee of consistency.
