How do you guarantee file integrity when a file write fails?

Posted on 2024-07-12 08:02:25

Follow up to: How to safely update a file that has many readers and one writer?

In my previous questions, I figured out that you can use FileChannel's lock to ensure an ordering on reads and writes.

But how do you handle the case where the writer fails mid-write (say the JVM crashes)? The basic algorithm would look like this:

WRITER:
  lock file
  write file
  release file

READER:
  lock file
  read file
  release file

If the JVM crashes during write file, sure, the lock would be released, but now I have an incomplete file. I want something complete to always be readable: either the old content or the new content, and nothing in between.

My first strategy was to write to a temporary file, and then copy the contents into the "live" file (while ensuring proper locking). The algorithm for this is:

WRITER:
  lock temp file
  write temp file
  lock file
  copy temp to file
  release file
  release temp
  delete temp

READER:
  lock file
  read file
  release file

One nice thing is that delete temp won't delete the temp file if it has already been locked by another writer.

But that algorithm doesn't handle the case where the JVM crashes during copy temp to file. So then I added a copying flag:

WRITER:
  lock temp file
  write temp file
  lock file
  create copying flag
  copy temp to file
  delete copying flag
  release file
  release temp
  delete temp

READER:
  lock file
  if copying flag exists
    copy temp to file
    delete copying flag
    delete temp 
  end
  read file
  release file

There won't ever be two things accessing the copying flag at once, as it is guarded by the file lock.

Now, is this the way to do it? It seems very complicated to ensure something very simple. Is there some Java library that handles this for me?

EDIT

Well, I managed to make a mistake in my third attempt. The reader doesn't hold the lock on temp when it does copy temp to file. Also, it's not a simple fix to just lock the temp file! That would cause the writer and reader to acquire locks in different orders, which can lead to deadlock. This is getting more complicated all the time. Here's my fourth attempt:

WRITER:
  lock file
  write temp file
  create copying flag
  copy temp to file
  delete copying flag
  delete temp
  release file

READER:
  lock file
  if copying flag exists
    copy temp to file
    delete copying flag
    delete temp 
  end
  read file
  release file

This time the temp file is guarded by the main lock, so it doesn't even need its own lock.

EDIT 2

When I say JVM crash, I actually mean something like the power going out when you don't have a UPS.

EDIT 3

I still managed to make another mistake. You shouldn't lock on the file you are writing to or reading from. This will cause problems, since you can't get both the read and the write lock unless you use RandomAccessFile in Java, which does not extend InputStream or OutputStream.

What you want to do instead is just lock on a lock file that guards the file you are reading or writing. Here's the updated algorithm:

WRITER:
  lock
  write temp file
  create copying flag
  copy temp to file
  delete copying flag
  delete temp
  release

READER:
  lock
  if copying flag exists
    copy temp to file
    delete copying flag
    delete temp 
  end
  read file
  release

lock and release guard the file, the temp file and the copying flag. The only problem now is that the reader lock can't be shared, but it never really could be. The reader always had a chance to modify the file, so it would have been wrong to make the lock shareable in the first place.
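
As a rough illustration, the final algorithm above could be sketched in Java as follows. This is a minimal sketch, not a vetted implementation: the class name GuardedFile and the ".lck", ".tmp" and ".copying" suffixes are made up for the example. It also adds one step the pseudocode does not have: the writer runs the same recovery as the reader before it touches the temp file, because otherwise a copying flag left over from an earlier crash would point at a half-written temp file. Nothing is fsynced here, so the sketch covers a process crash rather than a power failure.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: all names and file suffixes are assumptions for illustration.
final class GuardedFile {
    private final Path file, temp, flag, lockFile;

    GuardedFile(Path file) {
        String name = file.getFileName().toString();
        this.file = file;
        this.temp = file.resolveSibling(name + ".tmp");
        this.flag = file.resolveSibling(name + ".copying");
        this.lockFile = file.resolveSibling(name + ".lck");
    }

    void write(byte[] content) throws IOException {
        try (FileChannel ch = openLockChannel(); FileLock lock = ch.lock()) {
            recoverIfNeeded();            // added step: clean up after an earlier crash
            Files.write(temp, content);   // write temp file
            Files.createFile(flag);       // create copying flag
            finishCopy();                 // copy temp to file, delete flag, delete temp
        }
    }

    byte[] read() throws IOException {
        try (FileChannel ch = openLockChannel(); FileLock lock = ch.lock()) {
            recoverIfNeeded();            // if copying flag exists, redo the copy
            return Files.readAllBytes(file);
        }
    }

    private FileChannel openLockChannel() throws IOException {
        return FileChannel.open(lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
    }

    private void recoverIfNeeded() throws IOException {
        if (Files.exists(flag)) {
            finishCopy();                 // temp was complete when the flag was created
        } else {
            Files.deleteIfExists(temp);   // temp without a flag may be incomplete
        }
    }

    private void finishCopy() throws IOException {
        Files.copy(temp, file, StandardCopyOption.REPLACE_EXISTING);
        Files.delete(flag);
        Files.deleteIfExists(temp);
    }
}
```

Note that FileChannel locks are held on behalf of the whole JVM, so this only coordinates separate processes; threads inside one JVM would need their own synchronization.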

她如夕阳 2024-07-19 08:02:25

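Despite the fact that there is no bulletproof, cross-OS, cross-filesystem solution, the "write to a unique temp file and rename" strategy is still your best option. Most platforms and filesystems try to make file renames (effectively) atomic. Note that you want to use a separate lock file for locking.

So, assuming you want to update "myfile.txt":

Lock "myfile.txt.lck" (this is a separate, empty file).
Write the changes to a unique temp file, e.g. "myfile.txt.13424.tmp" (use File.createTempFile()).
For extra protection, at the possible cost of speed, fsync the temp file before proceeding (FileChannel.force(true)).
Rename "myfile.txt.13424.tmp" to "myfile.txt".
Unlock "myfile.txt.lck".

On certain platforms (Windows), you need to do a bit more of a dance because of restrictions on file operations: you can move "myfile.txt" to "myfile.txt.old" before the rename, and use the ".old" file to recover if needed when reading.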
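
The steps above might look roughly like this in Java. It is a sketch under assumptions: the class and method names are invented, and it uses Files.move with StandardCopyOption.ATOMIC_MOVE from java.nio.file in place of a plain rename. The Javadoc leaves it implementation-specific whether an atomic move replaces an existing target, and the call throws AtomicMoveNotSupportedException where the file system cannot do it, so the Windows ".old" dance described above is not covered.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper illustrating "write to a unique temp file and rename".
final class AtomicReplace {
    static void replace(Path target, byte[] content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        String name = target.getFileName().toString();
        Path lockFile = dir.resolve(name + ".lck");   // separate, empty lock file

        try (FileChannel lockChannel = FileChannel.open(
                 lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = lockChannel.lock()) {
            // Unique temp file in the same directory, so the rename stays on one file system.
            Path tmp = Files.createTempFile(dir, name + ".", ".tmp");
            try {
                try (FileChannel out = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
                    ByteBuffer buf = ByteBuffer.wrap(content);
                    while (buf.hasRemaining()) {
                        out.write(buf);
                    }
                    out.force(true);   // fsync the temp file before renaming it
                }
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            } finally {
                Files.deleteIfExists(tmp);   // no-op after a successful move
            }
        }
    }
}
```

Readers that take the same lock on "myfile.txt.lck" never observe the rename in progress; readers that skip the lock rely entirely on the rename being atomic.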

待＂谢繁草 2024-07-19 08:02:25

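I don't think there is a perfect answer. I don't know exactly what you need to do, but could you write to a new file and then, on success, rename it instead of copying? A rename is quick, so it should be less exposed to a crash. This still won't help if it fails at the rename stage, but you will have minimized the window of risk.

Again, I'm not sure whether this applies or is relevant to your needs, but could you write some block at the end of the file to show that all the data has been written?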
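
One way to read the end-of-file marker idea is sketched below. The answer does not say what the block should contain; using an 8-byte CRC32 trailer is an assumption made for this example, as are the class and method names. A reader that finds a missing or mismatching trailer treats the file as incomplete.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Optional;
import java.util.zip.CRC32;

// Hypothetical helper: payload followed by an 8-byte CRC32 trailer.
final class TrailerFile {
    static void write(Path file, byte[] payload) throws IOException {
        CRC32 crc = new CRC32();
        crc.update(payload);
        ByteBuffer buf = ByteBuffer.allocate(payload.length + Long.BYTES);
        buf.put(payload).putLong(crc.getValue());   // trailer goes last
        Files.write(file, buf.array());
    }

    // Returns the payload, or empty if the file is truncated or corrupt.
    static Optional<byte[]> read(Path file) throws IOException {
        byte[] all = Files.readAllBytes(file);
        if (all.length < Long.BYTES) {
            return Optional.empty();
        }
        int payloadLength = all.length - Long.BYTES;
        byte[] payload = Arrays.copyOf(all, payloadLength);
        long stored = ByteBuffer.wrap(all, payloadLength, Long.BYTES).getLong();
        CRC32 crc = new CRC32();
        crc.update(payload);
        return crc.getValue() == stored ? Optional.of(payload) : Optional.empty();
    }
}
```

This only detects an incomplete write; it does not bring the old content back, so it works best combined with keeping the previous file around.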

流星番茄 2024-07-19 08:02:25

Without some operating-system support, it's not going to be possible to have arbitrarily-complex file operations be atomic without requiring programs which open a file to either complete or roll back operations that may have been in progress. If that caveat is acceptable, one could do something like the following:

Near the start of a file, not straddling a 512-byte boundary, include two 4- or 8-byte numbers (depending upon maximum file size), indicating the logical length of the file and the location of an update record (if any). Most of the time the update-record value will be zero; the act of writing a non-zero value will commit an update sequence; it will be rewritten with zero when the update sequence is complete.
Before an update sequence is begun, determine an upper bound for the logical length of the file when the update sequence is complete (there are ways of getting around this limitation, but they add complexity).
To start an update sequence, seek into the file a distance which is the larger of its present logical length or the (upper bound of the) future logical length, and then write update records, each consisting of a file offset, a number of bytes to write, and the data to be written. Write as many such records as are required to perform all updates, and end with a record that has 0 offset and 0 length.
To commit an update sequence, flush all pending writes and wait for their completion, and then write the new file length and the location of the first update record.
Finally, update the file by processing all the update records in sequence, flushing all writes and awaiting their completion, and setting the update-record location to zero, and (optionally) truncating the file to its logical length.
If an attempt is made to open a file where the update-sequence location is non-zero, finish performing any pending writes (using the last step described above) before doing anything else with it.

If the original operation that writes the file fails before the update-record location is written, all of the write operations will be effectively ignored. If it fails after the update-record location is written but before it is cleared, the next time the file is opened all of the write operations will be committed (some of them may have already been performed, but performing them again should be harmless). If it fails after the update-record location is cleared, the file update will be complete and the failure won't affect it at all.

Some other approaches use a separate file to hold pending writes. In some cases, that may be a good idea, but it has the disadvantage of splitting a file into two parts, both of which must be kept together. Copying just one file, or accidentally pairing copies of the two files which were made at different times, could result in data loss or corruption.
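
A compact sketch of the in-file update-record scheme described in this answer follows. Several details are assumptions for the example: the class name JournaledFile, a 16-byte header at offset 0 (logical length, then update-record location), records laid out as an 8-byte offset, a 4-byte length and the data, and the new logical length being passed in exactly by the caller. It also assumes, as the answer does, that the small header write reaches the disk as a unit, and it does no locking.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical layout: bytes 0-7 = logical length, bytes 8-15 = update-record location.
final class JournaledFile implements AutoCloseable {
    static final int HEADER = 16;
    private final RandomAccessFile raf;

    JournaledFile(Path path) throws IOException {
        raf = new RandomAccessFile(path.toFile(), "rw");
        if (raf.length() < HEADER) {
            writeHeader(HEADER, 0);            // new file: no data, no pending update
        } else {
            raf.seek(8);
            if (raf.readLong() != 0) apply();  // finish an update that was committed but not applied
        }
    }

    // Offsets are absolute file offsets and must lie inside [HEADER, newLength).
    void update(long newLength, Map<Long, byte[]> writes) throws IOException {
        raf.seek(0);
        long journal = Math.max(raf.readLong(), newLength);  // records go past both lengths
        raf.seek(journal);
        for (Map.Entry<Long, byte[]> e : writes.entrySet()) {
            long off = e.getKey();
            byte[] data = e.getValue();
            if (off < HEADER || off + data.length > newLength)
                throw new IllegalArgumentException("write outside logical file");
            raf.writeLong(off);
            raf.writeInt(data.length);
            raf.write(data);
        }
        raf.writeLong(0);                      // terminator record: offset 0, length 0
        raf.writeInt(0);
        sync();
        writeHeader(newLength, journal);       // commit point
        apply();
    }

    byte[] read() throws IOException {
        raf.seek(0);
        byte[] out = new byte[(int) (raf.readLong() - HEADER)];
        raf.seek(HEADER);
        raf.readFully(out);
        return out;
    }

    private void apply() throws IOException {
        raf.seek(0);
        long length = raf.readLong();
        long pos = raf.readLong();
        while (true) {
            raf.seek(pos);
            long off = raf.readLong();
            int len = raf.readInt();
            if (off == 0 && len == 0) break;
            byte[] data = new byte[len];
            raf.readFully(data);
            pos = raf.getFilePointer();        // start of the next record
            raf.seek(off);
            raf.write(data);
        }
        sync();
        writeHeader(length, 0);                // clear the update-record location
        raf.setLength(length);                 // optional truncation to the logical length
    }

    private void writeHeader(long length, long recordPos) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(HEADER).putLong(length).putLong(recordPos);
        buf.flip();
        while (buf.hasRemaining()) raf.getChannel().write(buf, buf.position());
        sync();
    }

    private void sync() throws IOException { raf.getChannel().force(true); }

    @Override public void close() throws IOException { raf.close(); }
}
```

Reapplying records after a crash is harmless here because each record overwrites a fixed range with fixed bytes.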

殊姿 2024-07-19 08:02:25

I assume you have a large file which you are continuously appending to.
Crashes of the VM do not happen very often. But if they occur, you need a way to roll back the failed changes. You just need a way to know how far to roll back. For example by writing the last file length to a new file:

WRITER:
  lock file
  write file position to pos-file
  write file
  remove pos-file
  unlock file

If the writer crashes, one of your readers will get the read lock. They have to check for the pos-file. If they find one, a crash occurred. If they look inside the file, they know how far to roll back the changes to get a consistent file again. Of course, the rollback procedure has to happen in a similar way to the write procedure.
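
The append case could be sketched like this. It assumes the caller already holds the lock from the pseudocode, and the class name, the ".pos" suffix and the fixed 8-byte encoding of the position are choices made for this example. The fixed width lets recovery tell a complete pos-file from one that was itself cut short, in which case the append never started and nothing needs rolling back. As with the pseudocode, nothing is fsynced.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; callers are assumed to hold the file lock around every call.
final class AppendWithRollback {
    static Path posFile(Path file) {
        return file.resolveSibling(file.getFileName() + ".pos");
    }

    static void append(Path file, byte[] data) throws IOException {
        recover(file);                                          // roll back any earlier crash first
        long size = Files.exists(file) ? Files.size(file) : 0L;
        Files.write(posFile(file),                              // write file position to pos-file
                ByteBuffer.allocate(Long.BYTES).putLong(size).array());
        Files.write(file, data, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        Files.delete(posFile(file));                            // remove pos-file
    }

    // Readers call this after taking the lock and before reading.
    static void recover(Path file) throws IOException {
        Path pos = posFile(file);
        if (!Files.exists(pos)) {
            return;                                             // no crash happened
        }
        byte[] raw = Files.readAllBytes(pos);
        if (raw.length == Long.BYTES) {                         // complete pos-file: roll back
            long size = ByteBuffer.wrap(raw).getLong();
            try (FileChannel ch = FileChannel.open(
                    file, StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
                ch.truncate(size);                              // drop the partial append
            }
        }
        Files.delete(pos);
    }
}
```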

When you are not appending but replacing the file, you can use the same method:

WRITER:
  lock file
  write writing-in-progress-file
  write file
  remove writing-in-progress-file
  unlock file

The same rules as before apply for the reader. When the writing-in-progress-file exists but the reader has already acquired the read lock, the written file is in an inconsistent state.
