Memory-mapped files and atomic writes of single blocks
If I read and write a single file using normal IO APIs, writes are guaranteed to be atomic on a per-block basis. That is, if my write only modifies a single block, the operating system guarantees that either the whole block is written, or nothing at all.
How do I achieve the same effect on a memory mapped file?
Memory-mapped files are simply byte arrays, so if I modify the byte array, the operating system has no way of knowing when I consider a write "done". It might therefore (even if that is unlikely) flush the page to disk right in the middle of my block-writing operation, and in effect I write half a block.
I'd need some sort of "enter/leave critical section" mechanism, or some way of "pinning" the page of a file into memory while I'm writing to it. Does something like that exist? If so, is it portable across common POSIX systems and Windows?
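For reference, the closest primitives I've found so far are `mlock()`/`munlock()` on POSIX (roughly `VirtualLock()` on Windows) for pinning, plus `msync()` (`FlushViewOfFile()` on Windows) for flushing. But as far as I can tell, pinning only keeps the page resident; it does not stop the kernel from writing a dirty page back to the file mid-update. A minimal POSIX sketch of what I mean:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define BLOCK_SIZE 4096

/* Sketch: update one block of a memory-mapped file.
   block_off must be page-aligned.
   NOTE: mlock() keeps the page resident in RAM, but it does NOT
   prevent the kernel from writing the dirty page back to the file
   in the middle of the memcpy(), so this alone does not make the
   block update atomic on disk. */
int update_block(const char *path, off_t block_off, const void *data)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    char *map = mmap(NULL, BLOCK_SIZE, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, block_off);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    mlock(map, BLOCK_SIZE);          /* pin page in RAM (best effort)  */
    memcpy(map, data, BLOCK_SIZE);   /* the would-be "critical section" */
    msync(map, BLOCK_SIZE, MS_SYNC); /* force write-back, blocking      */
    munlock(map, BLOCK_SIZE);

    munmap(map, BLOCK_SIZE);
    close(fd);
    return 0;
}
```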
2 Answers
The technique of keeping a journal seems to be the only way. I don't know how this works with multiple apps writing to the same file. The Cassandra project has a good article on how to get performance with a journal. The key thing to make sure of is that the journal only records positive actions (my first approach was to write the pre-image of each write to the journal, allowing you to roll back, but it got overly complicated).
So basically your memory-mapped file has a `transactionId` in the header. If your header fits into one block you know it won't get corrupted, though many people seem to write it twice with a checksum: `[header[cksum]] [header[cksum]]`. If the first checksum fails, use the second. The journal looks something like this:
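A plausible layout, sketched in C (the field names below are illustrative assumptions, not a fixed format):

```c
#include <stdint.h>

/* Header written twice, each copy followed by its own checksum:
   [header[cksum]] [header[cksum]].  If the first copy's checksum
   fails on startup, fall back to the second. */
struct file_header {
    uint64_t txn_id;   /* id of the last transaction applied to the file */
    uint32_t cksum;    /* checksum over the fields above                 */
};

/* A journal record describes one "positive action" (a redo entry):
   enough information to re-apply the write, never to undo it. */
struct journal_record {
    uint64_t txn_id;   /* monotonically increasing transaction id    */
    uint64_t offset;   /* where in the data file to write            */
    uint32_t length;   /* number of payload bytes that follow        */
    uint32_t cksum;    /* checksum over this record and its payload  */
    /* unsigned char payload[length] follows the fixed part */
};
```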
You just keep appending journal records until the journal gets too big, then roll it over at some point. When you start your program, you check whether the file's transaction id matches the last transaction id in the journal; if not, you play back all the missing transactions from the journal to sync up.
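A minimal sketch of that recovery pass, assuming the hypothetical record layout above (`next_record()` and `apply_record()` are placeholder helpers, not a real API):

```c
#include <stdio.h>

/* Placeholder helpers, not a real API: */
int  next_record(FILE *journal, struct journal_record *rec);
void apply_record(const struct journal_record *rec);

/* Replay any journal records newer than the file's header txn_id. */
void recover(struct file_header *hdr, FILE *journal)
{
    struct journal_record rec;
    while (next_record(journal, &rec)) {  /* stops on EOF or bad cksum */
        if (rec.txn_id <= hdr->txn_id)
            continue;                     /* already applied            */
        apply_record(&rec);               /* redo the write             */
        hdr->txn_id = rec.txn_id;         /* remember progress          */
    }
    /* finally rewrite both header copies with the new txn_id */
}
```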
In the general case, the OS does not guarantee that "writes of a block" done with "normal IO APIs" are atomic.
Further, you're usually concerned with durability over multiple sectors (e.g. if power loss happens, was the data I sent before this sector definitely on stable storage?). If there's any buffering going on, your write may still only be in RAM or the disk cache, unless you first used another command to flush it, or opened the file/device with flags requesting cache bypass, and those flags were actually honoured.
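For illustration, the usual belt-and-braces sequence on POSIX looks something like the sketch below. Even this is not a guarantee: whether the drive's own cache honours the flush is up to the hardware and OS.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch: write a block and push it toward stable storage on POSIX.
   O_SYNC requests synchronous writes; fsync() flushes data and
   metadata.  (Rough Windows analogues: FILE_FLAG_WRITE_THROUGH and
   FlushFileBuffers.)  None of this makes the write atomic: a power
   loss mid-write can still tear the block. */
int write_block_durably(const char *path, const void *buf,
                        size_t len, off_t off)
{
    int fd = open(path, O_WRONLY | O_SYNC);
    if (fd < 0)
        return -1;

    int ok = pwrite(fd, buf, len, off) == (ssize_t)len
          && fsync(fd) == 0;  /* flush toward stable storage */

    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```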