Cross-platform and cross-process atomic int write to a file

Posted 2024-09-01 23:04:34


I'm writing an application that will have to handle many concurrent accesses, either by threads or by processes, so no mutexes or locks should be applied to it.

To keep the use of locks to a minimum, I'm designing the file to be "append-only", so all data is first appended to disk, and then the address pointing to the updated info is changed to refer to the new one. So I will need to implement a small locking scheme only to change this one int so that it refers to the new address.
What is the best way to do it?

I was thinking about maybe putting a flag before the address: when it is set, readers spin until it is released. But I'm afraid that isn't atomic at all, is it?
For example:

  • a reader reads the flag, and it is unset
  • at the same time, a writer sets the flag and changes the value of the int
  • the reader may read an inconsistent value!

I'm looking for locking techniques, but all I find are either thread locking techniques or ways to lock an entire file, not individual fields. Is it not possible to do this? How do append-only databases handle this?

edit:
I was looking at how append-only DBs (CouchDB) do it, and it seems they use a single thread to serialize writes to the file. Does that mean it isn't possible to make them embeddable, like SQLite, without locking the entire file with filesystem locks?

Thanks!
Cauê

Comments (2)

唔猫 2024-09-08 23:04:37


Be careful about the append semantics of your filesystem - it probably doesn't provide atomic append operations.

One option is to memory map (mmap) your file as shared, then do atomic memory operations like compare-and-swap on the pointer. Your success will depend on whether your OS has such an operation (Linux, OSX do).
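A minimal sketch of the mmap-plus-CAS idea the answer describes, assuming C11 atomics and a POSIX system; the function name `file_cas`, the fixed offset 0, and the cast of mapped memory to an `_Atomic` type are illustrative choices, not from the answer:

```c
#include <fcntl.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

// Try to swap the int at the start of the file from `expected` to
// `desired`. MAP_SHARED makes the mapping visible to every process
// that maps the same file, so the CAS coordinates across processes.
// Returns 1 on success, 0 if another process won the race, -1 on error.
// The file must already be at least sizeof(uint32_t) bytes long.
int file_cas(const char *path, uint32_t expected, uint32_t desired) {
    int fd = open(path, O_RDWR);
    if (fd < 0) return -1;
    void *p = mmap(NULL, sizeof(uint32_t), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the fd is closed
    if (p == MAP_FAILED) return -1;
    _Atomic uint32_t *slot = (_Atomic uint32_t *)p;
    int ok = atomic_compare_exchange_strong(slot, &expected, desired);
    munmap(p, sizeof(uint32_t));
    return ok;
}
```

A reader that maps the same file and loads the slot atomically will always see either the old address or the new one, never a partial write.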

A correct (although I'm not sure it is fast) way to accomplish what you want is with rename: it is an atomic file operation on most filesystems. Keep the most up-to-date data at an official file location. To update the data, write your new data to a temporary file, then rename that temporary file to the official location.
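The write-then-rename pattern above might look like the following in C; `atomic_replace` and the PID-based temp-file naming are made up for illustration, and the temp file must live on the same filesystem as the target for `rename()` to stay atomic:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Write `len` bytes of `data` to a temp file next to `path`, flush it
// to disk, then rename() it over `path`. Readers opening `path` see
// either the complete old contents or the complete new contents.
// Returns 0 on success, -1 on error.
int atomic_replace(const char *path, const void *data, size_t len) {
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.tmp.%ld", path, (long)getpid());
    int fd = open(tmp, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) return -1;
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    close(fd);
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```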

花开半夏魅人心 2024-09-08 23:04:37


When I need to do something like this, typically, I write a process that accepts multiple connections from other processes to get data. This logging process can maintain a single file pointer where it is writing all the data without running the risk of multiple writes going to the same place.

Each thread in the logging process just listens for new input and submits it to a queue, without blocking the process that generated the data. Trying to do the disk writes in the threads that generate the data to be logged will eventually put you in a position where you have to have locking operations and suffer whatever performance hit they require.
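A hedged sketch of this single-writer pattern, shrunk to one machine with a pipe standing in for the network connections the answer mentions (`demo_logger` and the record format are invented for the demo). POSIX guarantees that writes of at most PIPE_BUF bytes to a pipe are atomic, so records from concurrent writers never interleave mid-record:

```c
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

// Fork two writer processes that each send one record over a pipe;
// the parent acts as the logger, owning the only file handle and
// serializing every append into `logpath`. Returns 0 on success.
int demo_logger(const char *logpath) {
    int pfd[2];
    if (pipe(pfd) != 0) return -1;
    for (int i = 0; i < 2; i++) {
        if (fork() == 0) {              // writer process
            close(pfd[0]);
            char rec[32];
            int len = snprintf(rec, sizeof rec,
                               "record-from-writer-%d\n", i);
            write(pfd[1], rec, len);    // <= PIPE_BUF, so atomic
            close(pfd[1]);
            _exit(0);
        }
    }
    close(pfd[1]);                      // logger keeps only the read end
    FILE *log = fopen(logpath, "w");
    if (!log) return -1;
    char buf[256];
    ssize_t n;
    while ((n = read(pfd[0], buf, sizeof buf)) > 0)
        fwrite(buf, 1, (size_t)n, log); // single writer: no locking needed
    fclose(log);
    close(pfd[0]);
    while (wait(NULL) > 0) {}
    return 0;
}
```

Because only the logger process ever touches the file, no file-level locking is needed; the pipe (or socket) does the serialization.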
