How do I durably rename a file in POSIX?

Posted on 2024-09-24 09:02:21

What's the correct way to durably rename a file in a POSIX file system? Specifically, I'm wondering about fsync on the directories. (If this depends on the OS/FS, I'm asking about Linux and ext3/ext4.)

Note: there are other questions on StackOverflow about durable renames, but AFAICT they don't address fsync-ing the directories (which is what matters to me - I'm not even modifying file data).

I currently have (in Python):

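import os

# Open the destination directory so it can be fsync'ed after the rename.
dstdirfd = os.open(dstdirpath, os.O_DIRECTORY | os.O_RDONLY)
os.rename(os.path.join(srcdirpath, filename), os.path.join(dstdirpath, filename))
os.fsync(dstdirfd)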

Specific questions:

  • Does this also implicitly fsync the source directory? Or might I end up with the file showing up in both directories after a power cycle (meaning I'd have to check the hard link count and manually perform recovery), i.e. it's impossible to guarantee a durably atomic move operation?
  • If I fsync the source directory instead of the destination directory, will that also implicitly fsync the destination directory?
  • Are there any useful related testing/debugging/learning tools (fault injectors, introspection tools, mock filesystems, etc.)?

Thanks in advance.

Comments (4)

挖个坑埋了你 2024-10-01 09:02:21

Unfortunately Dave’s answer is wrong.

Not every POSIX system even has durable storage. And where one does, it is still “allowed” to be hosed after a system crash. For those systems a no-op fsync() makes sense, and such an fsync() is explicitly allowed under POSIX. It is also legal for the file to be recoverable in the old directory, the new directory, both, or any other location. POSIX makes no guarantees for system crashes or file system recoveries.

The real question should be:

How to do a durable rename on systems which support that through the POSIX API?

You need to fsync() both the source and the destination directory, because the minimum each of those fsync()s is supposed to do is persist what its own directory should look like.
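As a minimal sketch of that advice (the helper name `durable_rename` and its parameters are made up for illustration, and it assumes Linux, where a directory can be opened with `O_DIRECTORY` and passed to `fsync()`):

```python
import os


def durable_rename(src_dir: str, dst_dir: str, filename: str) -> None:
    """Move filename from src_dir to dst_dir, then fsync both directories.

    Hypothetical helper: whether both fsync() calls are strictly needed
    depends on the filesystem, as discussed in this answer.
    """
    flags = os.O_RDONLY | os.O_DIRECTORY
    src_fd = os.open(src_dir, flags)
    try:
        dst_fd = os.open(dst_dir, flags)
        try:
            os.rename(os.path.join(src_dir, filename),
                      os.path.join(dst_dir, filename))
            # Ask for the new entry to be persisted, then the removal
            # of the old entry.
            os.fsync(dst_fd)
            os.fsync(src_fd)
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

Opening both directories before the rename means a bad path fails early, before anything has been moved.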

Does a fsync(destdirfd) also implicitly fsync the source directory?

  • POSIX in general: no, nothing implies that
  • ext3/4: I’m not sure whether the changes to the source and destination directories end up in the same transaction in the journal. If they do, they get committed together.

Or might I end up with the file showing up in both directories after a power cycle (“crash”), i.e. it's impossible to guarantee a durably atomic move operation?

  • POSIX in general: no guarantees, but you’re supposed to fsync() both directories, which might not be atomic-durable
  • ext3/4: how many fsync() calls you minimally need depends on the mount options. E.g. if mounted with “dirsync” you don’t need either of those two fsync()s. At most you need both fsync()s, but I’m almost sure one is enough (atomic-durable then).
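If you do want to cover the "file shows up in both directories" case after a crash, the recovery the question mentions could look roughly like this. It is only a sketch: `finish_interrupted_move` and its return strings are invented for illustration, and it assumes that a half-finished move shows up as two directory entries pointing at the same inode.

```python
import os


def _lstat_or_none(path: str):
    """Return os.lstat(path), or None if the entry does not exist."""
    try:
        return os.lstat(path)
    except FileNotFoundError:
        return None


def finish_interrupted_move(src_path: str, dst_path: str) -> str:
    """Inspect both paths and drop the stale source entry if needed."""
    src = _lstat_or_none(src_path)
    dst = _lstat_or_none(dst_path)
    if src is not None and dst is not None:
        same_inode = (src.st_dev, src.st_ino) == (dst.st_dev, dst.st_ino)
        if same_inode:
            # Same file reachable from both directories: finish the move.
            os.unlink(src_path)
            return "completed"
        return "conflict"  # two different files, leave them alone
    if dst is not None:
        return "moved"
    if src is not None:
        return "not-moved"
    return "missing"
```

After a "completed" result you would still want to fsync() the source directory so that the cleanup itself is durable.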

If I fsync the source directory instead of the destination directory, will that also implicitly fsync the destination directory?

  • POSIX: no
  • ext3/4: I really believe both end up in the same transaction, so it doesn’t matter which of them you fsync()
  • older kernels, ext3: (if the changes aren’t in the same transaction) a not-so-optimal implementation did far too much syncing on fsync(); I bet it committed every transaction that came before. And yes, a normal implementation would first link the file into the destination and then remove it from the source. So fsync(srcdirfd) would trigger the fsync() of the destination as well.
  • ext4/latest ext3: if they aren’t in the same transaction, you might be able to completely sync them independently (so do both)

Are there any useful related testing/debugging/learning tools (fault injectors, introspection tools, mock filesystems, etc.)?

For a real crash, no. By the way, a real crash goes beyond the viewpoint of the kernel. The hardware might reorder writes (and fail to write everything), corrupting the filesystem. Ext4 is better prepared against this, because it enables write barriers (a mount option) by default (ext3 does not) and can detect corruption with journal checksums (also a mount option).
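Since so much depends on mount options, it can help to look at what a filesystem is actually mounted with. Here is a small sketch that parses text in the `/proc/self/mounts` format; the function name `mount_options` is made up for illustration, and options left at their defaults may simply not be listed there.

```python
def mount_options(mounts_text: str, mount_point: str) -> set[str]:
    """Return the option set for mount_point from /proc/self/mounts-style text.

    Each line has the form: device mount_point fstype options dump pass.
    Mount points containing spaces are escaped in that file (e.g. \\040),
    which this sketch does not decode.
    """
    options: set[str] = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later line for the same mount point overrides earlier ones.
            options = set(fields[3].split(","))
    return options
```

On Linux you could feed it the contents of `/proc/self/mounts` and check, for example, whether `"dirsync"` is in the returned set for your mount point.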

And for learning: find out if both changes are somehow linked in the journal! :-P

勿挽旧人 2024-10-01 09:02:21

POSIX defines that the rename function must be atomic.

So if you rename(A, B), under no circumstances should you ever see a state with the file in both directories or neither directory. There will always be exactly one, no matter what you do with fsync() or whether the system crashes.

But that doesn't solve the problem of making sure the rename() operation is durable. POSIX answers this question:

If _POSIX_SYNCHRONIZED_IO is defined, the fsync() function shall force all currently queued I/O operations associated with the file indicated by file descriptor fildes to the synchronized I/O completion state. All I/O operations shall be completed as defined for synchronized I/O file integrity completion.

So if you fsync() a directory, pending rename operations must be transferred to disk by the time this returns. fsync() of either directory should be sufficient because atomicity of the rename() operation would require that both directories' changes be synced atomically.
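To see whether a system advertises the synchronized I/O option that the quoted text depends on, you can ask `sysconf`. This is a rough sketch (the function name is invented, and it assumes your Python build exposes the `SC_SYNCHRONIZED_IO` name); it only reports what the system claims, not how any particular filesystem behaves.

```python
import os


def synchronized_io_version():
    """Return the advertised _POSIX_SYNCHRONIZED_IO value, or None.

    None means the name is unknown to this Python build, the query
    failed, or the system reports the option as unsupported (-1).
    """
    name = "SC_SYNCHRONIZED_IO"
    if name not in os.sysconf_names:
        return None
    try:
        value = os.sysconf(name)
    except (OSError, ValueError):
        return None
    return value if value != -1 else None
```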

Finally, in contrast to the claim in the blog post mentioned in another answer, the rationale for this explains the following:

The fsync() function is intended to force a physical write of data from the buffer cache, and to assure that after a system crash or other failure that all data up to the time of the fsync() call is recorded on the disk. Since the concepts of "buffer cache", "system crash", "physical write", and "non-volatile storage" are not defined here, the wording has to be more abstract.

A system that claimed to be POSIX compliant and that considered it correct behavior (i.e. not a bug or hardware failure) to complete an fsync() and not persist those changes across a system crash would have to be deliberately misrepresenting itself with respect to the spec.

(updated with additional info re: Linux-specific vs. portable behavior)

挽梦忆笙歌 2024-10-01 09:02:21

The answer to your question is going to depend a lot on the specific OS being used, the type of filesystem being used and whether the source and dest are on the same device or not.

I'd start by reading the rename(2) man page on the platform you're using.

哭了丶谁疼 2024-10-01 09:02:21

It sounds to me like you're trying to do the job of the filesystem. If you move a file the kernel and file-system are responsible for atomic operation and fault-recovery, not your code.

Anyway, this article seems to address your questions regarding fsync:
http://blogs.gnome.org/alexl/2009/03/16/ext4-vs-fsync-my-take/
