What exactly is the ZooKeeper quorum setting in hbase-site.xml?

Posted 2024-10-07 08:40:20

What exactly is the zookeeper quorum setting in hbase-site.xml?

Comments (2)

波浪屿的海角声 2024-10-14 08:40:20

As described in hbase-default.xml, here's the setting:

Comma separated list of servers in the ZooKeeper Quorum. For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com". By default this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh this is the list of servers which we will start/stop ZooKeeper on.
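
A minimal sketch of how this might look in hbase-site.xml for a fully-distributed cluster. The property name is hbase.zookeeper.quorum; the three host names are the placeholders from the description above, not real servers.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Comma-separated list of the ZooKeeper quorum servers (placeholder host names). -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>host1.mydomain.com,host2.mydomain.com,host3.mydomain.com</value>
  </property>
</configuration>
```

If HBASE_MANAGES_ZK is set in hbase-env.sh, these are also the hosts on which HBase would start and stop ZooKeeper.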

Edward J. Yoon has explained what this setting actually does. His explanation follows, lightly edited by me for clarity:

Apache ZooKeeper is a coordination service for distributed applications, like Google's Chubby. Many projects use ZooKeeper, and we (Apache Hama) also use it for barrier synchronization in our Bulk Synchronous Parallel computing framework.

Today, I read up on the Paxos and dynamic quorum features of the ZooKeeper project, to better name the class org.apache.hama.zookeeper.QuorumPeer. Because the documentation is thin ( http://hadoop.apache.org/zookeeper/docs/r3.0.0/api/index.html ), I didn't understand the meaning of "quorum", as this term was somewhat odd to me. But "org.apache.hama.zookeeper.QuorumPeer" is the proper name!! xD

So, what is a quorum, and why do we need one?

According to Wikipedia, Quorum is the minimum number of members of a deliberative body necessary to conduct the business of that group. Ordinarily, this is a majority of the people expected to be there, although many bodies may have a lower or higher quorum.

As you know, fault tolerance is one of the important functions of a distributed system. The quorum algorithm is used to prevent a split-brain condition. When a split-brain condition occurs, ZooKeeper uses the quorum algorithm to determine the "Primary Partition" and the "Secondary Partition". The servers in the primary group then receive and process user requests, and the servers in the secondary group become read-only.

When does the system recover from a split-brain condition? When the partitions are merged into one again. Internally, ZooKeeper uses an atomic broadcast protocol instead of Paxos.

You should also read the original version, in case I mistranslated the concepts he was trying to present.

My understanding of the quorum mechanism in Apache ZooKeeper is that it explicitly defines a replication quorum across several predefined hosts. If this quorum is not met, the partitions that disagree are split off to a secondary partition until ZooKeeper can reintegrate them with the primary partition.
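
As a rough illustration of the arithmetic behind this, assuming ZooKeeper's default majority quorum (the symbols N, q and f are introduced here for illustration and do not come from the quoted text): for N configured servers, the quorum size q and the number of server failures f that can be tolerated are

$$
q = \left\lfloor \frac{N}{2} \right\rfloor + 1, \qquad f = N - q
$$

With the three hosts from the example setting, N = 3 gives q = 2 and f = 1, so the service keeps working with one server down. N = 4 gives q = 3 and still only f = 1, which is why an odd number of servers is usually preferred.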

This adds more granularity to Hadoop's eventual consistency model. HBase, meanwhile, is currently in the process of further integrating Zookeeper with its code.

浪荡不羁 2024-10-14 08:40:20

From the hbase-default.xml file:

Comma separated list of servers in the ZooKeeper Quorum.
For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
By default this is set to localhost for local and pseudo-distributed modes
of operation. For a fully-distributed setup, this should be set to a full
list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop ZooKeeper on.

And from the Requirements section of Getting Started:

HBase depends on ZooKeeper as of release 0.20.0. HBase keeps the location of its root table, who the current master is, and what regions are currently participating in the cluster in ZooKeeper. Clients and Servers now must know their ZooKeeper Quorum locations before they can do anything else (Usually they pick up this information from configuration supplied on their CLASSPATH). By default, HBase will manage a single ZooKeeper instance for you. In standalone and pseudo-distributed modes this is usually enough, but for fully-distributed mode you should configure a ZooKeeper quorum (more info below).

Hope that helps.
