Two-phase commit

Posted on 2024-12-04 02:09:10

I believe most people know what 2PC (the two-phase commit protocol) is and how to use it in Java or most modern languages. Basically, it is used to keep transactions in sync when you have two or more databases.

Assume I have two DBs (A and B) in two different locations using 2PC. Before A and B commit a transaction, both DBs report back to the transaction manager saying they are ready to commit. Once the transaction manager has received both acknowledgements, it sends a signal back to A and B telling them to go ahead.

Here is my question: let's say A received the signal and committed the transaction. B is about to do the same, but someone unplugs the power cable and the whole server shuts down. When B is back online, what will B do? And how does B do it?

Remember, A has committed but B has not, and we are using 2PC (so the design of 2PC stops working, doesn't it?).


Comments (3)

杀手六號 2024-12-11 02:09:10

On Two-Phase Commit

Two-phase commit does not guarantee that a distributed transaction can't fail, but it does guarantee that it can't fail silently without the TM being aware of it.

In order for B to report the transaction as being ready to commit, B must have the transaction in persistent storage (i.e. B must be able to guarantee that the transaction can commit in all circumstances). In this situation, B has persisted the transaction but the transaction manager has not yet received a message from B confirming that B has completed the commit.

The transaction manager will poll B again when B comes back online and ask it to commit the transaction. If B has already committed the transaction, it will report it as committed. If B has not yet committed, it will commit now: it has already persisted the transaction and is therefore still in a position to commit it.
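
To make the recovery step concrete, here is a minimal Python sketch of a participant like B. The `Participant` class, its JSON log file and the `tx-1` id are all made up for illustration; a real resource manager would rely on its own write-ahead log. The sketch only shows the idea that the prepared state is made durable before voting yes, so a commit request that arrives after a restart can still be honoured, and a repeated request is harmless.

```python
import json
import os
import tempfile


class Participant:
    """Hypothetical resource manager that logs its state durably before answering."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.state = {}  # txid -> "prepared" | "committed"
        if os.path.exists(log_path):
            with open(log_path) as f:
                self.state = json.load(f)  # recovery: reload state after a crash

    def _persist(self):
        # Write to a temp file, fsync, then rename so the log is replaced atomically.
        tmp = self.log_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.log_path)

    def prepare(self, txid):
        self.state[txid] = "prepared"
        self._persist()  # must be durable BEFORE voting yes
        return "yes"

    def commit(self, txid):
        status = self.state.get(txid)
        if status is None:
            raise KeyError(f"Unknown TxID {txid}")  # signals data loss to the TM
        if status == "prepared":
            self.state[txid] = "committed"
            self._persist()
        return "committed"  # idempotent: same answer if already committed


log_path = os.path.join(tempfile.mkdtemp(), "b.log")
b = Participant(log_path)
vote = b.prepare("tx-1")
del b                      # simulate the power cut after voting, before commit
b = Participant(log_path)  # B restarts and reloads its log
first = b.commit("tx-1")   # the TM re-sends the commit
second = b.commit("tx-1")  # a duplicate commit request is harmless
```

Asking the restarted participant to commit a transaction id it has never logged raises the "Unknown TxID" error, which is how the TM would learn about data loss in this sketch.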

In order for B to fail in this situation, it would have to undergo a catastrophic failure that lost data or log entries. The transaction manager would still be aware that B had not reported a successful commit (see note 1).

In practice, if B can no longer commit the transaction, it would imply that the disaster that took B out had caused data loss, and B would report an error when the TM asked it to commit a TxID that it wasn't aware of or didn't think was in a committable state.

Thus, two-phase commit does not prevent a catastrophic failure from occurring, but it does prevent the failure from going unnoticed. In this scenario the transaction manager will report an error back to the application if B cannot commit.

The application still has to be able to recover from the error, but the transaction cannot fail silently without the application being made aware of the inconsistent state.

Semantics

  • If a resource manager or network goes down in phase 1, the
    transaction manager will detect a fatal error (can't connect to
    resource manager) and mark the sub-transaction as failed. When the
    network comes back up it will abort the transaction on all of the
    participating resource managers.

  • If a resource manager or network goes down in phase 2, the
    transaction manager will continue to poll the resource manager until
    it comes back up. When it reconnects to the resource manager
    it will tell the RM to commit the transaction. If the RM returns an
    error along the lines of 'Unknown TxID', the TM will be aware that
    there is a data loss issue in the RM.

  • If the TM goes down in phase 1 then the client will block until the
    TM comes back up, unless it times out or receives an error due to the
    broken network connection. In this case the client is made aware of
    the error and can either re-try or initiate the abort itself.

  • If the TM goes down in phase 2, the client will block until the TM
    comes back up. The TM has already reported the transaction as
    committable, so no fatal error should be presented to the client.
    The TM will still have the transaction recorded in an uncommitted
    state and will poll the RMs to commit when it comes back up.
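
The RM-failure cases above can be sketched as a small coordinator loop. This is only an illustration under stated assumptions: `FlakyRM`, `run_two_phase_commit` and `max_polls` are hypothetical names, the "network" is an in-process method call, and a real TM would also write its decision to a durable log and wait between polls.

```python
class FlakyRM:
    """Hypothetical resource manager that is unreachable for a few commit calls."""

    def __init__(self, name, down_for=0, vote="yes"):
        self.name = name
        self.down_for = down_for  # number of commit calls that fail first
        self.vote = vote
        self.status = "idle"

    def prepare(self, txid):
        self.status = "prepared" if self.vote == "yes" else "aborted"
        return self.vote

    def commit(self, txid):
        if self.down_for > 0:
            self.down_for -= 1
            raise ConnectionError(f"{self.name} unreachable")
        self.status = "committed"

    def abort(self, txid):
        self.status = "aborted"


def run_two_phase_commit(txid, rms, max_polls=10):
    # Phase 1: any "no" vote or connection failure aborts everywhere.
    try:
        votes = [rm.prepare(txid) for rm in rms]
    except ConnectionError:
        votes = ["no"]
    if any(v != "yes" for v in votes):
        for rm in rms:
            rm.abort(txid)
        return "aborted"

    # Phase 2: the decision is commit; keep polling each RM until it acknowledges.
    pending = list(rms)
    for _ in range(max_polls):
        still_pending = []
        for rm in pending:
            try:
                rm.commit(txid)
            except ConnectionError:
                still_pending.append(rm)
        pending = still_pending
        if not pending:
            return "committed"
    return "in-doubt"  # not silent: the caller learns an RM never acknowledged


a, b = FlakyRM("A"), FlakyRM("B", down_for=3)
outcome = run_two_phase_commit("tx-1", [a, b])
```

Here B is unreachable for its first three commit calls, which stands in for the pulled power cable; the loop keeps retrying until B acknowledges, and reports "in-doubt" rather than success if it never does.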

Post-commit data loss events in the resource managers are not handled by the transaction manager and are a function of the resilience of the RMs.

Two-phase commit does not guarantee fault tolerance - see Paxos for an example of a protocol that does address fault tolerance - but it does guarantee that partial failure of a distributed transaction cannot go unnoticed.

  1. Note that this sort of failure could also lose data from previously committed transactions. Two-phase commit does not guarantee that the resource managers can't lose or corrupt data, or that disaster recovery (DR) procedures don't screw up.

情话已封尘 2024-12-11 02:09:10

I believe three-phase commit is a much better approach. Unfortunately, I haven't found anyone who implements it.

http://the-paper-trail.org/blog/consensus-protocols-three-phase-commit/

Here are the essential parts of the above article:

The fundamental difficulty with 2PC is that, once the decision to commit has been made by the co-ordinator and communicated to some replicas, the replicas go right ahead and act upon the commit statement without checking to see if every other replica got the message. Then, if a replica that committed crashes along with the co-ordinator, the system has no way of telling what the result of the transaction was (since only the co-ordinator and the replica that got the message know for sure). Since the transaction might already have been committed at the crashed replica, the protocol cannot pessimistically abort – as the transaction might have had side-effects that are impossible to undo. Similarly, the protocol cannot optimistically force the transaction to commit, as the original vote might have been to abort.

This problem is – mostly – circumvented by the addition of an extra phase to 2PC, unsurprisingly giving us a three-phase commit protocol. The idea is very simple. We break the second phase of 2PC – ‘commit’ – into two sub-phases. The first is the ‘prepare to commit’ phase. The co-ordinator sends this message to all replicas when it has received unanimous ‘yes’ votes in the first phase. On receipt of this message, replicas get into a state where they are able to commit the transaction – by taking necessary locks and so forth – but crucially do not do any work that they cannot later undo. They then reply to the co-ordinator telling it that the ‘prepare to commit’ message was received.

The purpose of this phase is to communicate the result of the vote to every replica so that the state of the protocol can be recovered no matter which replica dies.

The last phase of the protocol does almost exactly the same thing as the original ‘commit or abort’ phase in 2PC. If the co-ordinator receives confirmation of the delivery of the ‘prepare to commit’ message from all replicas, it is then safe to go ahead with committing the transaction. However, if delivery is not confirmed, the co-ordinator cannot guarantee that the protocol state will be recovered should it crash (if you are tolerating a fixed number f of failures, the co-ordinator can go ahead once it has received f+1 confirmations). In this case, the co-ordinator will abort the transaction.

If the co-ordinator should crash at any point, a recovery node can take over the transaction and query the state from any remaining replicas. If a replica that has committed the transaction has crashed, we know that every other replica has received a ‘prepare to commit’ message (otherwise the co-ordinator wouldn’t have moved to the commit phase), and therefore the recovery node will be able to determine that the transaction was able to be committed, and safely shepherd the protocol to its conclusion. If any replica reports to the recovery node that it has not received ‘prepare to commit’, the recovery node will know that the transaction has not been committed at any replica, and will therefore be able either to pessimistically abort or re-run the protocol from the beginning.

So does 3PC fix all our problems? Not quite, but it comes close. In the case of a network partition, the wheels rather come off – imagine that all the replicas that received ‘prepare to commit’ are on one side of the partition, and those that did not are on the other. Then both partitions will continue with recovery nodes that respectively commit or abort the transaction, and when the network merges the system will have an inconsistent state. So 3PC has potentially unsafe runs, as does 2PC, but will always make progress and therefore satisfies its liveness properties. The fact that 3PC will not block on single node failures makes it much more appealing for services where high availability is more important than low latencies.
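
The recovery rule described in the quoted article can be written down as a small decision function. This is a rough sketch, not code from the article: `ReplicaState` and `recovery_decision` are hypothetical names, and the function only looks at the replicas the recovery node can still reach.

```python
from enum import Enum


class ReplicaState(Enum):
    VOTED_YES = "voted-yes"          # phase 1 done, no 'prepare to commit' yet
    PRE_COMMITTED = "pre-committed"  # acknowledged 'prepare to commit'
    COMMITTED = "committed"
    ABORTED = "aborted"


def recovery_decision(reachable_states):
    """Decide the outcome from the states of the replicas that are still reachable."""
    states = set(reachable_states)
    if not states:
        raise ValueError("no reachable replicas to query")
    if ReplicaState.ABORTED in states:
        return "abort"
    if ReplicaState.VOTED_YES in states:
        # A replica never saw 'prepare to commit', so nobody can have committed.
        return "abort"
    # Every reachable replica has pre-committed or committed: safe to finish.
    return "commit"


# A crashed co-ordinator with all survivors pre-committed: finish the commit.
no_partition = recovery_decision(
    [ReplicaState.PRE_COMMITTED, ReplicaState.PRE_COMMITTED]
)

# The partition problem: each side runs recovery on what it can see.
side_one = recovery_decision([ReplicaState.PRE_COMMITTED, ReplicaState.COMMITTED])
side_two = recovery_decision([ReplicaState.VOTED_YES])
```

The last two calls illustrate the unsafe run from the article: the side that only sees pre-committed replicas decides to commit, while the side that sees a replica without 'prepare to commit' decides to abort.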

南巷近海 2024-12-11 02:09:10

Your scenario is not the only one where things can ultimately go wrong despite all efforts. Suppose A and B have both reported "ready to commit" to the TM, and then someone unplugs the line between the TM and, say, B. B is waiting for the go-ahead (or no-go) from the TM, but it certainly won't wait forever for the TM to reconnect, because the resources involved in the transaction must stay locked and inaccessible for the entire wait. So when B has been kept waiting too long for its own taste, it will make what is called a "heuristic decision". That is, it will decide to commit or roll back independently of the TM, based on, well, I don't really know what, but that doesn't really matter. It should be obvious that any such heuristic decision can deviate from the actual commit decision taken by the TM.
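
As a rough illustration of how a heuristic decision can diverge from the TM's decision, here is a small Python sketch. Everything in it is an assumption for the example: the `TimedParticipant` class, the fixed timeout, and the policy of rolling back when the timeout expires. Time is passed in explicitly so the example does not depend on a real clock. (In Java, JTA reports this kind of divergence through exceptions such as `HeuristicMixedException`.)

```python
class TimedParticipant:
    """Hypothetical participant that gives up waiting after `timeout` seconds."""

    def __init__(self, name, timeout, heuristic="rollback"):
        self.name = name
        self.timeout = timeout
        self.heuristic = heuristic  # unilateral choice: "commit" or "rollback"
        self.prepared_at = None
        self.outcome = None
        self.heuristic_taken = False

    def prepare(self, now):
        self.prepared_at = now  # locks are held from this point on

    def tick(self, now):
        # Called periodically while in doubt; release locks by deciding alone.
        waiting = self.prepared_at is not None and self.outcome is None
        if waiting and now - self.prepared_at > self.timeout:
            self.outcome = self.heuristic
            self.heuristic_taken = True

    def deliver_decision(self, decision):
        # The TM's decision finally arrives; report whether it matches.
        if self.outcome is None:
            self.outcome = decision
            return "ok"
        return "ok" if self.outcome == decision else "heuristic-mismatch"


b = TimedParticipant("B", timeout=30)
b.prepare(now=0)
b.tick(now=10)   # still within the timeout: keeps waiting
b.tick(now=45)   # waited too long: rolls back on its own
result = b.deliver_decision("commit")  # the TM reconnects with the real decision
```

In this run B has already rolled back by the time the TM's commit arrives, so the mismatch is at least reported rather than hidden, but the data on A and B no longer agree.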
