Inserting pages into a large mmap()'d file without copying data
I'm wondering if there is a way to insert blank pages near the beginning of a large (multi-GB) file that I have open with mmap(). Obviously it would be possible to add a page or two to the end, and move everything forward with memcpy(), but this would dirty every page and require an awful long time when eventually flushed to disk.
I'm guessing that a solution would require some complex coordination between a customized filesystem and manual manipulation of the page tables: add a block to the inode, somehow update the cached pages in the VMM to reflect this, then somehow swizzle the page table to match. This sounds non-trivial, which makes me wonder if there's a better way.
This is intended as a somewhat deep question about memory and file manipulation on Linux, although I'd be happy to hear about how this can be done in other systems. I'm not particularly interested in workarounds that involve making the copying more efficient, although a technique that requires remapping but avoids the disk IO would be a good start.
Comments (1)
Embed a simple FAT (file allocation table) in your file. For instance, the first 4k of the file would be the FAT page, and data would live in the following pages. The first FAT page could link to further FAT pages as your file grows. Each entry in the FAT would hold a data page index and the index of the next FAT entry. A FAT entry index would consist of the FAT page and the index of the entry within that page. I think you get the idea: the FAT entries are a linked list, the FAT pages are a linked list, and the FAT entries link the data pages. This should be enough information to use remap_file_pages() to make your file look contiguous in memory even though it's not contiguous on the disk.
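To make the idea concrete, here is a rough sketch in C of how a FAT chain could drive the mapping. It is only an illustration under several assumptions: the names `fat_entry`, `map_chain`, `run_demo` and `FAT_END` are made up for this example, the FAT lives in an ordinary array rather than in the file's first page, and file page 0 is merely reserved for it. Also note that remap_file_pages() has been deprecated since Linux 3.16 and is emulated by the kernel in later versions, so the sketch swaps in one mmap() call per page with MAP_FIXED over a reserved address range, which gives the same contiguous view.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define FAT_END UINT32_MAX   /* hypothetical end-of-chain marker */

/* Hypothetical FAT entry: one data page plus a link to the next entry. */
struct fat_entry {
    uint32_t data_page;  /* page index within the file */
    uint32_t next;       /* index of the next entry, or FAT_END */
};

/* Map the data pages of the chain starting at `head` into one contiguous
 * virtual range, in chain order. Returns NULL on failure. */
static char *map_chain(int fd, const struct fat_entry *fat, uint32_t head,
                       size_t count, size_t page)
{
    /* Reserve address space first so every page has a fixed slot. */
    char *base = mmap(NULL, count * page, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    size_t i = 0;
    for (uint32_t e = head; e != FAT_END && i < count; e = fat[e].next, i++) {
        /* Overlay slot i with the file page this entry points at. */
        void *p = mmap(base + i * page, page, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_FIXED, fd,
                       (off_t)fat[e].data_page * (off_t)page);
        if (p == MAP_FAILED) {
            munmap(base, count * page);
            return NULL;
        }
    }
    return base;
}

/* Demo: insert a blank page logically after the first data page without
 * moving any existing data. Returns 0 if the view looks as expected. */
static int run_demo(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char path[] = "/tmp/fatdemoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    /* File page 0 is reserved for the FAT; data lives in pages 1..3. */
    if (ftruncate(fd, (off_t)(4 * page)) != 0)
        return -1;
    struct fat_entry fat[4] = {
        {1, 1}, {2, 2}, {3, FAT_END},  /* initial chain: pages 1 -> 2 -> 3 */
        {4, FAT_END},                  /* spare entry, not linked yet */
    };

    char *v = map_chain(fd, fat, 0, 3, page);
    if (!v)
        return -1;
    memset(v, 'A', page);
    memset(v + page, 'B', page);
    memset(v + 2 * page, 'C', page);
    munmap(v, 3 * page);

    /* Insert: grow the file by one zero-filled page at the end and link
     * the spare entry in after entry 0. No existing page is touched. */
    if (ftruncate(fd, (off_t)(5 * page)) != 0)
        return -1;
    fat[3].next = fat[0].next;
    fat[0].next = 3;

    v = map_chain(fd, fat, 0, 4, page);
    if (!v)
        return -1;
    int ok = v[0] == 'A' && v[page] == 0 &&
             v[2 * page] == 'B' && v[3 * page] == 'C';
    munmap(v, 4 * page);
    close(fd);
    return ok ? 0 : 1;
}
```

Calling `run_demo()` from a `main()` exercises the whole path. The insert dirties only the new page (plus the FAT page in a real on-disk layout), which is the point of the scheme; the cost is one mapping per data page, so a real implementation would probably want to coalesce runs of physically adjacent pages into single mmap() calls.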