How can I provide extend-on-write functionality for memory-mapped files in Linux?
I'm working on porting some code from AIX to Linux. Parts of the code use the shmat() system call to create new files. When used with SHM_MAP in a writable mode, one can extend the file beyond its original length (of zero, in my case):
When a file is mapped onto a segment, the file is referenced by accessing the segment. The memory paging system automatically takes care of the physical I/O. References beyond the end of the file cause the file to be extended in page-sized increments. The file cannot be extended beyond the next segment boundary.
(A "segment" in AIX is a 256 MB chunk of address space, and a "page" is usually 4 KB.)
What I would like to do on Linux is the following:
- Reserve a large-ish chunk of address space (it doesn't have to be as big as 256 MB, these aren't such large files)
- Set up the page protection bits so that a segfault is generated on the first access to a page that hasn't been touched before
- On a page fault, clear the "cause a page fault" bit and allocate committed memory for the page, allowing the write (or read) that caused the page fault to proceed
- Upon closing the shared memory area, write the modified pages to a file
I know I can do this on Windows with the VirtualProtect function, the PAGE_GUARD memory protection bit, and a structured exception handler. What is the corresponding method on Linux to do the same? Is there perhaps a better way to implement this extend-on-write functionality on Linux?
I've already considered:
- using mmap() with some fixed large-ish size, but I can't tell how much of the file was written to by the application code
- allocating an anonymous shared memory area of large-ish size, but again I can't tell how much of the area has been written
mmap() by itself does not seem to provide any facility to extend the length of the backing file.
Naturally I would like to do this with only minimal changes to the application code.
Comments (3)
This is very similar to a homework I once did. Basically I had a list of "pages" and a list of "frames", with associated information. Using SIGSEGV I would catch faults and alter the memory protection bits as necessary. I'll include parts that you may find useful.
Create mapping. Initially it has no permissions.
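As a rough illustration of this step (not the code from the homework), an anonymous mapping created with PROT_NONE reserves address space while making every access fault. The helper name reserve_region is hypothetical, and MAP_NORESERVE is an optional assumption added here to avoid reserving swap up front.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: reserve `len` bytes of address space with no access
 * rights, so that any read or write inside it raises SIGSEGV. */
static void *reserve_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;   /* NULL signals failure to the caller */
}
```

Calling reserve_region(1 << 20) would set aside 1 MB of address space; the region is released with munmap().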
Install signal handler
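A minimal sketch of installing such a handler with sigaction() follows. The names install_segv_handler and on_segv are hypothetical, and the handler body is only a placeholder; SA_SIGINFO is what makes the faulting address available through si_addr.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Placeholder handler: a real one would fix up the faulting page instead. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)info; (void)ctx;
    _exit(EXIT_FAILURE);
}

/* Hypothetical helper: route SIGSEGV to `handler`. SA_SIGINFO asks the kernel
 * to pass a siginfo_t, whose si_addr field holds the faulting address. */
static int install_segv_handler(void (*handler)(int, siginfo_t *, void *))
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);   /* 0 on success, -1 on error */
}
```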
Exception handler
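The handler itself might look roughly like the sketch below, which is an assumption about the general shape rather than the original homework code. It records the touched page in a small table and opens the page up, after which the kernel restarts the faulting instruction. All names (g_base, g_touched, MAX_PAGES, demo_touch_pages) are made up for illustration. Note that mprotect() is not on POSIX's list of async-signal-safe functions, although calling it from a SIGSEGV handler is a common pattern on Linux.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_PAGES 64                       /* illustrative region size, in pages */

static char *g_base;                       /* start of the reserved region */
static size_t g_page;                      /* system page size */
static volatile sig_atomic_t g_touched[MAX_PAGES];  /* 1 = page was accessed */

/* On a fault inside the region, mark the page and make it accessible. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)g_base;
    if (addr < base || addr >= base + MAX_PAGES * g_page)
        _exit(EXIT_FAILURE);               /* a genuine crash, not one of ours */
    size_t idx = (addr - base) / g_page;
    g_touched[idx] = 1;
    if (mprotect(g_base + idx * g_page, g_page, PROT_READ | PROT_WRITE) != 0)
        _exit(EXIT_FAILURE);
}

/* Demo driver: returns how many pages were touched, or -1 on setup failure. */
static int demo_touch_pages(void)
{
    g_page = (size_t)sysconf(_SC_PAGESIZE);
    g_base = mmap(NULL, MAX_PAGES * g_page, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (g_base == MAP_FAILED) return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;

    volatile char *p = g_base;             /* volatile keeps the stores in order */
    p[0] = 'a';                            /* faults once, then succeeds */
    p[1] = 'b';                            /* same page: no second fault */
    p[3 * g_page + 7] = 'c';               /* faults on page 3 */

    int count = 0;
    for (size_t i = 0; i < MAX_PAGES; i++) count += g_touched[i];
    return count;
}
```

In this demo only pages 0 and 3 end up marked, which is the information needed later to decide what to write back.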
Increasing protection
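Raising the protection on a single page can be sketched as below. The helper name make_page_writable is hypothetical; the one detail worth showing is that mprotect() needs a page-aligned start address, so the faulting address has to be rounded down first.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: grant read/write access to the page containing `addr`.
 * mprotect() requires a page-aligned start, so the address is rounded down. */
static int make_page_writable(void *addr)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)addr & ~(page - 1);
    return mprotect((void *)start, page, PROT_READ | PROT_WRITE);
}
```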
I can't publicly make it all available since the team is likely to use that same homework again.
Allocate a big buffer however you like, then use the mprotect() system call to make the tail of the buffer read-only. Register a signal handler for SIGSEGV to note where in the buffer writes have been made, and use mprotect() again to enable writes.
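One possible shape for this, sketched under several assumptions: the whole buffer starts read-only (the file in the question starts at length zero), the handler keeps a page-granular high-water mark, and closing writes everything below that mark to the file. The names tracked_open, tracked_close and g_high are hypothetical, and error handling is kept minimal.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char *g_buf;               /* start of the buffer */
static size_t g_len;              /* total buffer length in bytes */
static size_t g_page;             /* system page size */
static volatile size_t g_high;    /* offset just past the highest written page */

/* Write fault: raise the high-water mark and make the page writable. */
static void on_write_fault(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    uintptr_t off = (uintptr_t)info->si_addr - (uintptr_t)g_buf;
    if (off >= g_len) _exit(EXIT_FAILURE);      /* fault outside the buffer */
    size_t page_start = off - off % g_page;
    if (page_start + g_page > g_high) g_high = page_start + g_page;
    if (mprotect(g_buf + page_start, g_page, PROT_READ | PROT_WRITE) != 0)
        _exit(EXIT_FAILURE);
}

/* Allocate a read-only buffer of `pages` pages and start tracking writes. */
static int tracked_open(size_t pages)
{
    g_page = (size_t)sysconf(_SC_PAGESIZE);
    g_len = pages * g_page;
    g_high = 0;
    g_buf = mmap(NULL, g_len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (g_buf == MAP_FAILED) return -1;
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Write everything below the high-water mark to `path` and free the buffer.
 * Returns the number of bytes written, or -1 on error. */
static long tracked_close(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    size_t done = 0, total = g_high;
    while (done < total) {
        ssize_t n = write(fd, g_buf + done, total - done);
        if (n < 0) { close(fd); return -1; }
        done += (size_t)n;
    }
    close(fd);
    munmap(g_buf, g_len);
    return (long)done;
}
```

The resulting file length is a multiple of the page size, which matches the page-sized increments described for AIX in the question. Untouched pages below the mark are written out as zeros.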
I've contemplated similar things myself, and haven't found any way for mmap() to extend the backing file either.
Currently, I plan on trying two alternatives:
- mremap()'ing afterwards
Honestly, I don't think sparse files would work, but it's worth a try.
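A minimal sketch of how a sparse backing file and mremap() could be combined, assuming that is roughly the idea here: the file is sized with ftruncate() before it is mapped, and both the file and the mapping are grown later when more room is needed. The helper names map_new_file and grow_mapping are hypothetical, and mremap() is Linux-specific.

```c
#define _GNU_SOURCE                 /* mremap() is Linux-specific */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path`, size it to `len` bytes with ftruncate()
 * (on most Linux filesystems the file stays sparse until pages are written),
 * and map it shared so that stores reach the file. */
static void *map_new_file(const char *path, size_t len, int *fd_out)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return NULL;
    if (ftruncate(fd, (off_t)len) != 0) { close(fd); return NULL; }
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return NULL; }
    *fd_out = fd;
    return p;
}

/* Hypothetical helper: grow the file and its mapping from old_len to new_len.
 * Returns the (possibly moved) mapping address, or NULL on failure. */
static void *grow_mapping(int fd, void *addr, size_t old_len, size_t new_len)
{
    if (ftruncate(fd, (off_t)new_len) != 0)     /* extend the file first */
        return NULL;
    void *p = mremap(addr, old_len, new_len, MREMAP_MAYMOVE);
    return p == MAP_FAILED ? NULL : p;
}
```

Because MREMAP_MAYMOVE allows the kernel to relocate the mapping, any pointers into the old mapping have to be recomputed after a successful grow_mapping() call. This does not by itself say how much of the file the application wrote, so a final ftruncate() to the real length would still need that information from elsewhere.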