关于Paxos实现的问题
我正在集群模拟器应用程序中实现 Paxos,使用 Wikipedia 中提供的文档。不幸的是,它留下了一些解释的空间,并且没有提供关于关键实施问题的太多信息。这是不清楚和不完整的。
- 假设集群分为 3 个区域,每个区域包含 3 个节点(总共 = 9 个节点)。如果区域之间的通信中断会发生什么?任何领导者都不可能达到法定人数(即 5)。
Paxos不是要进入死循环了吗?我想如果一个人无法与至少法定数量的节点进行通信,那么就不应该启动 Paxos。
- 在第 1b 阶段:“如果提案编号 N 大于任何先前的提案,则每个接受者承诺不接受小于 N 的提案,并发送它最后接受的值将此实例发送给提议者'。
什么是“它接受的最后一个值”?是提案人之前的提案编号吗? 在这种情况下,“实例”到底指的是什么?
在第 1a 阶段:是否包含与“准备”消息一致的值,或者是否推迟到“接受”消息!信息?或者它确实重要吗?
在第 2a 阶段:“如果任何接受者已经接受了某个值,则领导者必须选择一个具有最大提案编号 N 的值”。
这里的价值是什么?是提案编号吗?我相信不是,但这句话不清楚。
在阶段 2a:“否则,提案者可以自由选择任何值”。这意味着什么?有什么价值?对于提案编号?
Paxos 似乎依赖 N(提案数)不断增加的值来工作?这是正确的吗?
维基百科条目没有讨论节点在开始参与 Paxos 之前应设置的初始值。这些是什么?
PS:我没有足够的声誉来创建“Paxos”标签(有志愿者吗?)
I am implementing Paxos in a cluster simulator application, using the documentation available in Wikipedia. Unfortunately, it leaves several doors open to interpretation and does not provide much information on key implementation issues. It is unclear and incomplete.
- Assuming a cluster divided in 3 regions, each containing 3 nodes (total = 9 nodes). What happens if communication is broken between regions? There is no way any leader can reach quorum (which is 5).
Isn't Paxos going to enter an infinite loop? I guess one should not initiate Paxos if one cannot communicate with at least a quorum of nodes.
- In Phase 1b: 'If the proposal number N is larger than any previous proposal, then each Acceptor promises not to accept proposals less than N, and sends the value it last accepted for this instance to the Proposer'.
What is 'the last value it accepted'? Is it any previous proposal number from the proposer?
What does 'instance' refer to exactly in this case?
In Phase 1a: Does one include the value to agree on with the Prepare message or is this deferred to the Accept! message? Or it does matter?
In Phase 2a: 'If any of the Acceptors have already accepted a value, the leader must Choose a value with the maximum proposal number N'.
What is value here? Is it the proposal number? I believe not, but this phrase is unclear.
In Phase 2a: 'Otherwise, the Proposer is free to choose any value'. What does this mean? A value for what? For the proposal number?
Paxos seems to rely on an increasing value of N (proposal number) to work? Is this correct?
The wikipedia entry does not discuss the initial values a node should set before starting to participate in Paxos. What are these?
P.S.: I don't have enough reputation to create a 'Paxos' tag (any volunteer?)
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
Paxos 中的命名法有点不直观。
Paxos 要求您至少可以获得法定人数(在您的情况下为 5 个节点)。选择您的三个区域的解决方案;三个区域之间有两个网络分区是一个非常坏的消息。我还使用 Paxos 的一个版本,它可以将节点成员资格从一个实例更改为下一个实例。这对于分区和节点故障很有用。
Paxos 的简单实现并不能保证终止,因为多个节点可以跳过准备阶段。有两种方法可以解决这个问题。一是在开始新的准备阶段之前进行随机退避。第二种是将所有请求路由到指定的领导者,该领导者充当提议者(领导者由 Paxos 实例选择。另请参阅 Multi-paxos)
当节点从提议者接收到 Accept!(roundId, value) 消息并且它没有承诺不接受该值(由于准备!(higherRoundId) 消息)时,它会存储该值和 roundId(我将它们称为
acceptedValue
和acceptedRoundId
)。由于后续的 Accept!(...) 消息,它可能会覆盖这些内容。当节点从提议者接收到Prepare!(roundId)消息时,它将roundId存储为promiseRoundId = max(roundId,promiseRoundId)。然后,它将一个
Promise!(acceptedRoundId,acceptedValue)
发送回提议者。注意:如果节点没有收到 Accept!(...) 消息,它将回复Promise!(null, null)
。无需发送。我不知道。
该值是算法正在达成共识的实际数据。我将其改写为
要开始接受阶段,提案者必须根据准备阶段的结果选择要接受的值。如果任何 Acceptor 回复 Promise(roundId, value),则 Proposer 必须使用与最高 roundId 关联的值。否则,Proposer 只收到 Promise(null, null),并且可以选择任何值发送给接受者。
注意:这里的 Proposal 编号与 roundId 相同。
这是你想要达成共识的价值观。这通常是分布式系统中的状态更改,可能由客户端请求触发。
回合 ID(又名提案编号)应该不断增加,并且必须在所有节点上的每个实例都是唯一的。 Paxos 论文假设您可以做到这一点,因为实现起来很简单。这是一种在所有节点上产生相同结果的方案:
或者用伪代码(显然缺乏一些主要的优化):
The nomenclature in Paxos is a little unintuitive.
Paxos requires you can get at least a quorum (5 nodes in your case). Go with your solution of three regions; having two network partitions between the three regions is very bad news. I also use a version of Paxos which can change node membership from one instance to the next. This is useful for partitions and node failure.
A naive implementation of Paxos is not guaranteed to terminate because multiple nodes can leap-frog Prepare phases. There are two ways of getting around this. One is to have a random backoff before starting new Prepare phases. The second is to route all requests to a designated leader, that acts as proposer (The leader is chosen by a Paxos instance. See also Multi-paxos)
When a node receives an Accept!(roundId, value) message from a Proposer and it hasn't promised to not accept the value (due to a Prepare!(higherRoundId) message), it stores the value and the roundId (I'll call them
acceptedValue
andacceptedRoundId
). It may write over these due to subsequent Accept!(...) messages.When a node receives a Prepare!(roundId) message from a Proposer, it stores roundId as
promiseRoundId = max(roundId, promiseRoundId)
. It then sends aPromise!(acceptedRoundId, acceptedValue)
back to the Proposer. NB: if a node hasn't received an Accept!(...) message, it replies withPromise!(null, null)
.There is no need to send it. I don't.
The value is the actual data the algorithm is reaching consensus on. I'll rephrase this to
To start the Accept Phase, The Proposer must choose a value to be accepted depending on the results of the Prepare phase. If any Acceptor replied with Promise(roundId, value), the Proposer must use the value associated with the highest roundId. Otherwise, the Proposer received only Promise(null, null), and may choose any value to send to the acceptors.
NB: Proposal number here is the same thing as roundId.
This is the value you want to have consensus on. This is typically a state change across the distributed system, perhaps triggered by a client request.
Round ids (aka proposal numbers) should be increasing and must be unique per instance across all nodes. The Paxos paper assumes you can do this because it is trivial to achieve. Here's one scheme that produces the same results on all nodes:
roundId = i*M + index[node]
where i is the ith round this node is starting (that is i is unique per node per paxos instance, and is monotonically increasing).Or in pseudo-code (which is clearly lacking a few major optimizations):
我找到了以下文档解释Paxos 的更多细节。我已经相应地更新了维基百科条目。
我能找到的问题的答案是:
Paxos 仅在至少有法定数量的节点可以相互通信时才起作用(在我们的例子中为 5)。因此,如果一个节点无法与至少法定数量的节点通信,则它不应该尝试 Paxos。
它是最后接受的命题编号和对应的值。
它指的是接受者。
该值不包含在Prepare消息中,而是留给Accept Request消息。
如果接受者已经接受了提案者的提案,他们可以返回相应的提案编号和值,否则什么也不返回。
第二个问题落下,因为维基百科条目具有误导性。人们可以为给定的提案选择任意值,或者从与先前接受的提案相对应的值中导出它。
是的。提案者 p 需要不断增加其提案的数量。
节点应保留其最后接受的提案编号,并最终保留相应的值。他们应该坚持下去。首次连接时,给定提议者的初始提议编号应为空(或任何等效值)。
I have found the following document explaining Paxos in more details. I have updated the wikipedia entry accordingly.
The answers to my question I could find are:
Paxos only works if at least a quorum of nodes can communicate with each other (in our case 5). Hence, if a node cannot communicate with at least a quorum of nodes, it should not try Paxos.
It is the last accepted proposition number and corresponding value.
It refers to the acceptor.
The value is not included in the Prepare message, it is left to the Accept Request message.
If acceptors have already accepted a proposal from the proposer, they can return the corresponding proposal number and value, else nothing.
The second question falls, since the Wikipedia entry was misleading. One can choose an arbitrary value for a given proposal or derive it from values corresponding to proposals accepted earlier.
Yes. A proposer p needs to number its proposals increasingly.
Nodes should keep their last accepted proposal number and eventually, the corresponding value too. They should persist it. When connecting for the first time, the initial proposal number for a given proposer should be null (or any equivalent).
每个提议者都有稳定的存储。每个提案者都会记住(在稳定存储中)其尝试发布的编号最高的提案,并以比其已使用的提案编号更高的提案编号开始第一阶段。
Each proposer has stable storage. Each proposer remembers (in stable storage) the highest-numbered proposal it has tried to issue, and begins phase 1 with a higher proposal number than any it has already used.