Writing to a file concurrently from multiple threads

Posted 2024-12-01 09:53:28

I have a user-level program which opens a file using the flags O_WRONLY|O_SYNC. The program creates 256 threads, each of which attempts to write 256 or more bytes of data to the file. I want a total of 1280000 requests, making it about 300 MB of data in total. The program ends once 1280000 requests have been completed.

I use pthread_spin_trylock() to increment a variable which keeps track of the number of requests that have been completed. To ensure that each thread writes to a unique offset, I use pwrite() and calculate the offset as a function of the number of requests that have been written already. Hence, I don't use any mutex when actually writing to the file (does this approach ensure data integrity?)
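
In outline, the setup looks roughly like the sketch below (illustrative names and sizes, not my actual code; compile with -pthread):

    /* Minimal sketch of the scheme described above: a spinlock-protected
     * request counter, and pwrite() to a unique per-request offset. */
    #include <fcntl.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define NUM_THREADS   256
    #define REQUEST_SIZE  256        /* bytes per pwrite() */
    #define TOTAL_REQS    1280000    /* ~300 MB in total */

    static int fd;
    static long completed;           /* requests handed out so far */
    static pthread_spinlock_t lock;

    static void *writer(void *arg)
    {
        (void)arg;
        char buf[REQUEST_SIZE];
        memset(buf, 'x', sizeof buf);

        for (;;) {
            /* Grab a unique request index under the spinlock. */
            while (pthread_spin_trylock(&lock) != 0)
                ;                    /* spin until the lock is acquired */
            long idx = completed;
            if (idx >= TOTAL_REQS) {
                pthread_spin_unlock(&lock);
                break;
            }
            completed = idx + 1;
            pthread_spin_unlock(&lock);

            /* Each index maps to a non-overlapping offset, so no lock
             * is held around the write itself. */
            off_t off = (off_t)idx * REQUEST_SIZE;
            if (pwrite(fd, buf, REQUEST_SIZE, off) != REQUEST_SIZE) {
                perror("pwrite");
                break;
            }
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tids[NUM_THREADS];

        fd = open("testfile", O_WRONLY | O_CREAT | O_SYNC, 0644);
        if (fd < 0) { perror("open"); return 1; }
        pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);

        for (int i = 0; i < NUM_THREADS; i++)
            pthread_create(&tids[i], NULL, writer, NULL);
        for (int i = 0; i < NUM_THREADS; i++)
            pthread_join(tids[i], NULL);

        close(fd);
        return 0;
    }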

When I compare the average time for which a pwrite() call is blocked against the corresponding numbers found using blktrace (i.e., the average Q2C times, which measure the complete life cycle of a BIO), I find a significant difference. In fact, the average completion time for a given BIO is much greater than the average latency of a pwrite() call. What is the reason behind this discrepancy? Shouldn't these numbers be similar, since O_SYNC ensures that the data is actually written to the physical medium before the call returns?
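
For reference, the pwrite() latency I mention is a per-call measurement taken along these lines (a sketch using CLOCK_MONOTONIC; my actual instrumentation may differ), while the Q2C figures come from blktrace on the block-device side:

    /* Sketch: per-call pwrite() latency in nanoseconds, for comparison
     * against the Q2C times reported by blktrace. */
    #include <stdint.h>
    #include <sys/types.h>
    #include <time.h>
    #include <unistd.h>

    int64_t timed_pwrite(int fd, const void *buf, size_t len, off_t off)
    {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        (void)pwrite(fd, buf, len, off);   /* return value check omitted */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (t1.tv_sec - t0.tv_sec) * 1000000000LL
             + (t1.tv_nsec - t0.tv_nsec);
    }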

Comments (1)

秋意浓 2024-12-08 09:53:28

pwrite() is supposed to be atomic, so you should be safe there ...

As for the difference in latency between your syscall and the actual BIO, according to the man page for open(2) at kernel.org:

    POSIX provides for three different variants of synchronized I/O,
    corresponding to the flags O_SYNC, O_DSYNC, and O_RSYNC. Currently
    (2.6.31), Linux only implements O_SYNC, but glibc maps O_DSYNC and
    O_RSYNC to the same numerical value as O_SYNC. Most Linux file
    systems don't actually implement the POSIX O_SYNC semantics, which
    require all metadata updates of a write to be on disk on returning
    to userspace, but only the O_DSYNC semantics, which require only
    actual file data and metadata necessary to retrieve it to be on
    disk by the time the system call returns.

So this basically implies that, even with the O_SYNC flag, not everything associated with your write has to be flushed to disk before the syscall returns: only the actual file data and the metadata necessary to retrieve it, while the remaining metadata updates can be written out later, after the syscall has completed and the process has moved on to something else. That later writeback still generates block I/O, which can account for the gap between the BIO completion times and the pwrite() latencies you measured.
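
If you want to make that distinction explicit rather than relying on the open() flags, roughly the same two levels of guarantee can be expressed with plain writes followed by fdatasync() or fsync() (a sketch only, error handling omitted):

    /* fdatasync(): file data plus the metadata needed to retrieve it
     * (the O_DSYNC-style guarantee the man page describes). */
    #include <sys/types.h>
    #include <unistd.h>

    void write_dsync_like(int fd, const void *buf, size_t len, off_t off)
    {
        (void)pwrite(fd, buf, len, off);
        fdatasync(fd);
    }

    /* fsync(): file data plus all metadata updates, which is closer to
     * the POSIX O_SYNC intent. */
    void write_sync_like(int fd, const void *buf, size_t len, off_t off)
    {
        (void)pwrite(fd, buf, len, off);
        fsync(fd);
    }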
