Can I use O_DIRECT for write requests to avoid data loss during power failure?

Posted on 2024-11-02 17:18:17

We want to do our best to avoid data loss during a power failure, so I decided to open the file with the O_DIRECT flag when writing data to disk. Does O_DIRECT mean that the data bypasses the OS cache completely? If a request returns success to the application, does that mean the data has definitely been flushed to disk? If I open a regular file in a file system, what about the FS metadata? Is it also flushed immediately, or is it cached?

By the way, can O_DIRECT be used on Windows? Or is there a corresponding method on Windows?

Comments (4)

我要还你自由 2024-11-09 17:18:17

O_DIRECT will probably do what you want, but it will slow your I/O down considerably.
I think calling fsync() should be enough. If you write through a FILE *, call fflush() first, since fflush() only moves data from the stdio buffer to the kernel, and then fsync() the underlying descriptor.
As for the metadata question, it depends on the underlying file system and, if you want to be extra paranoid, on the hardware too. A hard drive (and especially an SSD) may report that an operation has finished while still taking some time to actually write the data.
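
For the FILE * case, the order of the two flushes matters. Here is a minimal sketch, assuming a POSIX system; the helper name append_record_durably is made up for illustration and is not part of any library:

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() and fsync() declarations */
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: append `len` bytes to `path` through stdio and push
 * them toward stable storage. Returns 0 on success, -1 on any failure. */
int append_record_durably(const char *path, const char *data, size_t len)
{
    FILE *fp = fopen(path, "ab");
    if (fp == NULL)
        return -1;

    int ok = fwrite(data, 1, len, fp) == len  /* copy into the stdio buffer   */
          && fflush(fp) == 0                  /* stdio buffer -> kernel cache */
          && fsync(fileno(fp)) == 0;          /* kernel cache -> storage      */

    if (fclose(fp) != 0)                      /* close can report errors too  */
        ok = 0;
    return ok ? 0 : -1;
}
```

A caller would treat a -1 return as "this record may not survive a power cut" and retry or report the failure. Even a 0 return still depends on the drive honoring the flush request, as noted above.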

幸福丶如此 2024-11-09 17:18:17

You can use O_DIRECT but for many applications, calling fdatasync() is more convenient. O_DIRECT imposes a lot of restrictions because the IOs completely bypass the OS cache. It bypasses read cache as well as write cache.

For filesystem metadata, all you can do is fsync() your file after writing it. fsync flushes the file metadata, so you can be sure that the file won't disappear (or change its attributes etc) if the power is lost immediately afterwards.

All of these mechanisms depend on your I/O subsystem not lying to the OS about having persisted data to storage and, in many cases, on other hardware-dependent things (such as the RAID controller battery not running out before the power returns).
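
To make the "fsync() your file after writing it" advice concrete, here is a rough sketch, assuming Linux or another POSIX system. The helper names are invented for illustration. The extra fsync() on the parent directory is an addition not mentioned in the answer above: the Linux fsync(2) manual notes that syncing a file does not necessarily sync its directory entry, so a newly created file needs both.

```c
#define _POSIX_C_SOURCE 200809L   /* for openat() and O_DIRECTORY */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *p, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: create dir/name, write the data, then sync the file
 * and its parent directory. Returns 0 on success, -1 on failure. */
int create_file_durably(const char *dir, const char *name,
                        const char *data, size_t len)
{
    int dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd < 0)
        return -1;

    int rc = -1;
    int fd = openat(dfd, name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd >= 0) {
        if (write_all(fd, data, len) == 0
            && fsync(fd) == 0      /* file data and file metadata */
            && fsync(dfd) == 0)    /* directory entry for the new file */
            rc = 0;
        if (close(fd) != 0)
            rc = -1;
    }
    close(dfd);
    return rc;
}
```

When only the contents of an existing file change, fdatasync(fd) in place of fsync(fd) is usually enough and skips flushing metadata that is not needed to read the data back.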

甜点 2024-11-09 17:18:17

CreateFile can do this.

HANDLE WINAPI CreateFile(
  __in      LPCTSTR lpFileName,
  __in      DWORD dwDesiredAccess,
  __in      DWORD dwShareMode,
  __in_opt  LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  __in      DWORD dwCreationDisposition,
  __in      DWORD dwFlagsAndAttributes,
  __in_opt  HANDLE hTemplateFile
);

For dwFlagsAndAttributes you can specify FILE_FLAG_WRITE_THROUGH and FILE_FLAG_NO_BUFFERING.

If FILE_FLAG_WRITE_THROUGH and FILE_FLAG_NO_BUFFERING are both specified, so that system caching is not in effect, then the data is immediately flushed to disk without going through the Windows system cache. The operating system also requests a write-through of the hard disk's local hardware cache to persistent media.

鱼窥荷 2024-11-09 17:18:17

Can I use O_DIRECT for write requests to avoid data loss during power failure?

No!

On Linux, while O_DIRECT tries to bypass your OS's cache, it never bypasses your disk's cache. If your disk has a volatile write cache, you can still lose data that was only in the disk cache during an abrupt power-off!

Does O_DIRECT mean that the data bypasses the OS cache completely?

Usually, but some Linux filesystems may fall back to buffered I/O with O_DIRECT (the Ext4 Wiki page "Clarifying Direct IO's Semantics" warns this can happen with allocating writes).

If a request returns success to the application, does that mean the data has definitely been flushed to disk?

It usually means the disk has "seen" it but see the above caveats (e.g. data might have gone to buffer cache / data might only be in disk's volatile cache).

If I open a regular file in a file system, what about the FS metadata? Is it also flushed immediately, or is it cached?

Excellent question! Metadata may still be rolling around in cache and not yet synced to disk even though the request finished successfully.

All of the above means you HAVE to issue the appropriate fsync() calls in the correct places (and check their results!) if you want to be sure that an operation has reached non-volatile storage. See https://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/ and the LWN article "Ensuring data reaches disk" for details.
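
Putting those points together, here is a hedged sketch of an O_DIRECT write that still calls fsync(), assuming Linux with glibc. The helper name write_block_direct and the 4096-byte BLOCK alignment are assumptions for illustration; real code should use the alignment its device and filesystem actually require. The record is zero-padded to a full block because O_DIRECT typically requires the buffer, length and offset to be aligned.

```c
#define _GNU_SOURCE              /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0               /* no O_DIRECT here: degrade to buffered I/O */
#endif

#define BLOCK 4096               /* assumed alignment, not queried from the device */

/* Hypothetical helper: write one zero-padded BLOCK-sized record using
 * O_DIRECT where available, then fsync. Returns 0 on success, -1 on failure. */
int write_block_direct(const char *path, const char *data, size_t len)
{
    assert(len <= BLOCK);

    void *buf = NULL;
    if (posix_memalign(&buf, BLOCK, BLOCK) != 0)   /* aligned buffer */
        return -1;
    memset(buf, 0, BLOCK);
    memcpy(buf, data, len);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (fd < 0 && errno == EINVAL)   /* filesystem rejects O_DIRECT */
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    int rc = -1;
    /* O_DIRECT alone does not flush the disk's volatile cache or the file's
     * metadata, so fsync() is still required and its result is checked. */
    if (write(fd, buf, BLOCK) == BLOCK && fsync(fd) == 0)
        rc = 0;
    if (close(fd) != 0)
        rc = -1;
    free(buf);
    return rc;
}
```

Opening with O_DSYNC in addition to O_DIRECT is another option for getting per-write durability without a separate fsync() call. For a newly created file the parent directory still needs its own fsync().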
