在这种情况下,Paxos 代理的正确行为是什么?
我正在研究 Paxos,但我对算法在这个人为示例中的行为方式感到困惑。我希望下图能够解释这个场景。
几点:
- 每个代理充当提议者/接受者/学习者
- 准备消息 提案消息的格式为
(instance,proposal_num)
- 提案消息的格式为
(instance,proposal_num,proposal_val)
- 同时启动提案流程
- Server1 和 Server2 都决定在开始消息中 M1、M2 和 M3 同时发生
在这里,虽然协议是“正确的”,即只选择了一个值 S2
,但 Server1 和 Server2 认为选择它是因为提案编号不同。
Paxos 算法是否仅在 Decide(...)
消息发送给学习者时终止?我一定是误解了 Paxos Made Simple,但我认为这个选择是在提案者达到其 Propose(...)
消息的法定人数时做出的。
如果仅在向代理发送 Decide(...)
消息后才做出选择,则 Server2 是否应终止发送 Decide(1, 5, S2)
当它恢复时,因为它看到了稍后的 Prepare(1, 7)
?
I'm looking into Paxos and I'm confused about how the algorithm should behave in this contrived example. I hope the diagram below explains the scenario.
A few points:
- Each agent acts as a proposer/acceptor/learner
- Prepare messages have form
(instance, proposal_num)
- Propose messages have form
(instance, proposal_num, proposal_val)
- Server1 and Server2 both decide to start the proposal process at the same time
- In the beginning messages M1, M2 and M3 occur simultaneously
It seems here that although the protocol is 'correct', i.e. only one value S2
is chosen, Server1 and Server2 believe that it was chosen because of different proposal numbers.
Does the Paxos algorithm only terminate when the Decide(...)
message is sent out to the learners? I must be misunderstanding Paxos Made Simple but I thought the choice was made the moment the proposers achieved quorum for their Propose(...)
messages.
If the choice is only made after the Decide(...)
message is sent out to the agents, should Server2 terminate its sending of Decide(1, 5, S2)
when it recovers because it saw a later Prepare(1, 7)
?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
只是要重新定义术语(让我们也扔掉 1,因为我们只检查一次 Paxos 迭代):
1) Propose(n) == suggest(n),来自当前身份为 n 的提议者的消息
2) AcceptPrepare(n, v) == ack(n,v),消息发送给提议者 n。如果该节点尚未接受任何值,则 v 为空,ow v 等于它已接受的值
3) CreateDecide(n,v) == Accept!(x,v),要求节点接受来自提议者的该值身份x。如果节点已确认准备(n)消息(其中 n > ),则节点将拒绝该消息。 x
一旦prepare(n)达到法定人数——也就是说,大多数人都确认了消息——那么身份为n的提议者就会发出一条命令accept!(n,v)。如果aprepare(n+x),则x>n。 0,由身份为 n+x 的提议者发出——并且被多数人确认——在 ack(n,v) 消息和accept!(n,v)之间,然后多数人承诺不会接受带有时间戳 < 的建议值n+x,x>n 0(又名节点将拒绝接受!(n,v))
一旦大多数节点收到他们未承诺忽略的接受!(n,v)消息,就会做出选择。
因此,当 server2 重新上线并发送 Accept!(5,S2) 时,它将被忽略,因为 5 < 7.
Just gonna redefine the terms (let's also throw out the 1 because we're only examining one Paxos iteration):
1) Propose(n) == propose(n), message from a proposer with current identity n
2) AcceptPrepare(n,v) == ack(n,v), message sent to proposer n. v is blank if this node hasn't accepted any value yet, o.w. v equals the value it has accepted
3) CreateDecide(n,v) == accept!(x,v), asking the nodes to accept this value from proposer with identity x. nodes will reject the message if they've ack'ed a prepare(n) message where n > x
Once quorum is achieved for prepare(n) -- that is, a majority have ack'ed the message -- then the proposer with identity n sends out a command accept!(n,v). If a prepare(n+x), x > 0, was sent out by a proposer with identity n+x -- and it's ack'ed by a majority -- between the ack(n,v) messages and the accept!(n,v), then a majority has promised not to accept values proposed with a timestamp < n+x, x > 0 (AKA the nodes will reject the accept!(n,v))
The choice is made as soon as a majority receives an accept!(n,v) message which they have not promised to ignore.
Thus, when server2 comes back online and it sends accept!(5,S2), it will be ignored because 5 < 7.
为了与已接受的答案进行对比,算法本身永远不需要在任何非常明确的意义上终止。谈论每个节点单独终止其对算法的参与更有意义,这是实现定义的。那么也许您可以说,当所有参与节点都退出时,算法本身就终止了,如果这是一个有用的信息的话。
一旦大多数接受者发送了针对同一张选票的
AcceptPropose
消息,该算法就有效地收敛(从某种意义上说,一旦发生这种情况,就无法选择什么)值可能最终确定),但这不是在实践中可以观察到的情况:例如,如果网络在发送这组AcceptPropose
消息之前开始丢弃消息,则没有节点将能够了解到算法已经收敛,直到连接恢复。然而,一旦一个节点知道算法已经收敛(通过收到来自多数的
AcceptPropose
消息),它就可以安全地通过传统方式共享所选值,例如通过广播或八卦发送决策消息,并让其他节点转发该消息,直到每个节点都知道已实现收敛。一旦知道算法收敛到的值,每个节点就可以终止其对算法的参与,尽管它可能更愿意继续参与更长时间,具体取决于您的实现限制。
你必须考虑一下容错性,才能让自己相信终止决策可以保持活跃性:如果所有知道决定值的节点在共享它之前就死掉了,那么进展仍然可能吗?幸运的是,答案是肯定的:只要大多数节点仍然活着,如果其中任何一个节点知道决定的值,那么它可以与其他节点共享它,如果不知道,那么大多数参与节点只需要选择更高的选票数并进行另一轮。
在接受的答案中需要注意的一件事是这句话:
首先,协议中没有关于承诺忽略
AcceptPropose
消息的内容。 Promise 与应忽略/拒绝的Propose
消息有关。大多数AcceptPropose
消息始终可用于了解所选值,无论选票号如何。其次,一旦大多数发送
AcceptPropose
消息,就会有效地做出选择。您无法直接观察到这一点,因此必须等到至少一个节点收到来自多数节点的AcceptPropose
消息后才能知道已做出选择。一旦发生这种情况,您可以通过更多AcceptPropose
消息或通过Decided
消息的广播/八卦来共享所选值,具体取决于哪种更适合您的实现约束。To give a counterpoint to the accepted answer, the algorithm itself never really needs to terminate in any terribly well-defined sense. It makes more sense to talk about each node individually terminating its participation in the algorithm, which is implementation-defined. Then perhaps you could say that the algorithm itself has terminated when all the participating nodes have dropped out, if that were a useful thing to want to know.
The algorithm has effectively converged as soon as a majority of acceptors have sent their
AcceptPropose
messages for the same ballot (in the sense that once that has happened there is no option about what value may ultimately be decided) but this is not a state of affairs that can be observed in practice: for instance if the network were to start dropping messages just before this set ofAcceptPropose
messages were sent, no node would be able to learn that the algorithm had converged until connectivity were restored.However, once one node knows that the algorithm has converged (by having received
AcceptPropose
messages from a majority) it is safe for it to share the chosen value via conventional means, e.g. by sending aDecide
message by broadcast or gossip, and for other nodes to forward it on until every node knows that convergence was achieved.Each node can terminate its participation in the algorithm once it knows the value to which the algorithm has converged, although it may prefer to keep participating for longer, depending on your implementation constraints.
You have to think a bit about failure-tolerance to convince yourself that terminate-on-decided preserves liveness: if all the nodes that know what value was decided were to die before they could share it, would progress still be possible? The answer is, fortunately, yes: as long as a majority of nodes remains alive, if any of them knows the decided value then it can share it with the others, and if not then there's a majority of participating nodes that just need to pick a higher ballot number and run another round.
One thing to be careful of in the accepted answer is this phrase:
Firstly, there's nothing in the protocol about promising to ignore
AcceptPropose
messages. Promises pertain to whichPropose
messsages should be ignored/rejected. A majority ofAcceptPropose
messages can always be used to learn the chosen value, regardless of the ballot number.Secondly, the choice is effectively made as soon as a majority sends
AcceptPropose
messages. You can't observe this directly so you have to wait until at least one node has receivedAcceptPropose
messages from a majority before knowing that the choice has been made. Once that's happened, you can share the chosen value via moreAcceptPropose
messages or via a broadcast/gossip ofDecided
messages, depending on which suits your implementation constraints better.