ACID 中的“D”表示“持久性”,维基百科将其定义为:“每个提交的事务都受到保护,不会发生断电/崩溃/错误,并且不会被系统丢失,因此可以保证完成”。
然而,这意味着每个事务在报告为成功之前都必须同步到磁盘,而不仅仅是刷新。 ('flush'=发送到操作系统级别,'sync'=发送到物理磁盘盘片)。这将导致无法实现高事务率的 RDBMS。
流行的 RDBMS 真的会同步每笔交易吗?
The 'D' in ACID means "Durability" which is defined by Wikipedia as: "every committed transaction is protected against power loss/crash/errors and cannot be lost by the system and can thus be guaranteed to be completed".
However, this would mean that every transaction must be sync'd to disk before being reported as successful, not just flushed. ('flush'=sent to the operating system level, 'sync'=sent to the physical disk platter). This would make it impossible to implement a high transaction rate RDBMS.
Do popular RDBMS's really sync every transaction?
发布评论
评论(3)
使用磁盘进行持久化的数据库确实必须写入磁盘以使事务持久化。此外,它们还必须同步到磁盘以避免写回缓存造成任何损失。
为了实现高性能,数据库将使用组提交,其中提交周期中的多个事务将使用相同的写入/同步操作以使所有事务持久化。如果它们都附加到同一事务日志,则这是可能的。
这可能意味着单个提交的响应可能会延迟(同时等待其他人加入提交周期),但整个数据库的总体吞吐量要大得多,因为写入/同步的成本在多个事务中分摊。例如,每个单独的事务可能需要 10 毫秒,但数千个事务都能够在同一周期内提交。
数据库通常会测量有多少事务处于活动状态,以判断是否值得让任何单个事务等待其他事务加入提交周期,以便在负载非常轻的系统上,事务不需要等待其他事务。
并非所有数据库都使用磁盘来保证持久性。例如,VoltDB 依赖于多个服务器内存中保存的事务副本。如果其中一台服务器挂掉,交易在其他地方仍然可用。因此,交易只需要确保交易已被传输到足够数量的服务器以确保持久性。
这也提出了一个问题:什么是耐用?单盘耐用吗?如果磁盘出现故障则不会。 RAID 阵列耐用吗?如果发生灾难性的 RAID 损坏,则不会。持久性的唯一保证是跨多个远程数据库实例复制事务 - 但并非每个人都需要这种级别的保证。耐久性不应被视为二元选择,而应被视为耐久性水平的选择。
Databases that use disk for persistence do indeed have to write to the disk to make the transaction durable. Moreover, they also have to sync to the disk to avoid any loss from a write back cache.
To achieve high performance, databases will use group commits whereby multiple transactions in a commit cycle will use the same write/sync operation to make all of the transactions durable. This is possible where they are all appending to the same transaction log.
This may mean that the response of an individual commit may be delayed (while waiting for others to join the commit cycle) but the overall throughput is much greater across the whole database because the cost of the write/sync is amortized across multiple transactions. For example, each individual transaction may take 10mS, but thousands of transactions are all able to commit in the same cycle.
It is normal for a database to measure how many transactions are active to judge whether it is worth making any single transaction wait for others to join the commit cycle such that on a very lightly loaded system, a transaction need not wait for others.
Not all databases use disk to guarantee durability. For example, VoltDB relies on copies of the transaction held in memory across multiple servers. If one of the servers dies, the transaction is still available elsewhere. Hence a transaction only needs to ensure that transaction has been transmitted to a sufficient population of servers to ensure durability.
This also raises the question of what is durable? Is a single disk durable? Not if the disk fails. Is a RAID array durable? Not if there is a catastrophic RAID corruption. The only guarantee of durability is where transactions are replicated across multiple remote database instances - but not not everybody needs that level of guarantee. Durability should not be considered a binary option but rather as a choice of level of durability.
“磁盘”不仅仅是一个文件。提交将写入事务日志,然后用于更新数据库。如果系统在更新之前崩溃,可以从日志中重建事务。
The 'disk' is more than just one file. A commit would write to the transaction log, which would then be used to update the database. If the system crashes before the update, the transactions can be reconstructed from the log.
是的 - 这称为回滚日志。
为什么你认为这是不可能的?
而且,如果您说同步每笔交易都没有发生,那么您建议如何解决该问题?
Yes - it's called a rollback log.
Why do you think it's impossible?
And, if you say that synching every transaction isn't happening, what do you propose as a solution to the problem?