Memory-mapping large files (for persistent large arrays)
I'm implementing persistent large constant arrays via mmap. Are there any tips, tricks, or gotchas one should be aware of when using mmap?
Comments (3)
All pointers stored inside the mmap'd region should be stored as offsets from the base of the region, not as real pointers! You won't necessarily get the same base address when you mmap the region on the next run of the program. (I have had to clean up code that made incorrect assumptions about the constancy of the mmap region's base address.)
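To make the offset idea concrete, here is a minimal sketch. The layout (region_header, record) and the helper names are made up for illustration, and base stands for whatever address mmap returned on this run. Offset 0 is used as the "null" link, which works because the header occupies the start of the region.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a header followed by records. Every link is a byte
 * offset from the start of the mapping, never a raw pointer. */
typedef struct {
    uint64_t first_off;   /* offset of the first record, 0 = empty list */
} region_header;

typedef struct {
    int32_t  value;
    uint32_t pad;         /* keeps next_off 8-byte aligned */
    uint64_t next_off;    /* offset of the next record, 0 = end of list */
} record;

/* Turn a stored offset into a pointer that is valid for this mapping only. */
static record *record_at(void *base, uint64_t off)
{
    assert(base != NULL);
    return off ? (record *)((char *)base + off) : NULL;
}

/* Writer side: lay out two records and link them by offset. */
static void build_demo(void *base, int32_t v1, int32_t v2)
{
    region_header *hdr = base;
    hdr->first_off = 16;

    record *r1 = record_at(base, 16);
    r1->value = v1;
    r1->next_off = 40;

    record *r2 = record_at(base, 40);
    r2->value = v2;
    r2->next_off = 0;
}

/* Reader side: walk the list, resolving each offset against the current base. */
static long sum_records(void *base)
{
    const region_header *hdr = base;
    long total = 0;
    for (record *r = record_at(base, hdr->first_off); r != NULL;
         r = record_at(base, r->next_off))
        total += r->value;
    return total;
}
```

Because nothing in the region depends on where it was mapped, the same bytes can be read back correctly at any base address.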
This is the most straightforward use case for mmap(), so there shouldn't be much to trip you up.
You are effectively just loading a large constant array. Since the values are constant, you shouldn't need to worry about synchronization. It would be advisable to set the prot parameter to PROT_READ only, since you won't be writing.
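A read-only mapping along those lines might look roughly like this. The function names map_readonly and unmap_readonly are my own, and the sketch assumes the whole file is mapped in one go.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a whole file read-only. Returns the base address and stores the length
 * in *len_out, or returns NULL on failure (an empty file counts as a failure,
 * because mmap rejects a zero length). */
static const void *map_readonly(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    /* PROT_READ only: any accidental write through the mapping faults. */
    void *base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping outlives the descriptor */
    if (base == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    return base;
}

static void unmap_readonly(const void *base, size_t len)
{
    munmap((void *)base, len);
}
```

The caller would then cast the result to the element type, e.g. const int *arr = map_readonly(path, &len), and treat len / sizeof(int) as the element count.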
If one or more programs using the constants are going to be run continually, it might be worthwhile to have a separate program that loads the data and keeps it resident. Each run of the other programs then essentially just does a shared memory attach rather than reading the file into memory again.
Make sure you check for limits on open file size or memory usage. On Linux there is a built-in shell command, ulimit. Run
ulimit -a
to see the current settings.
Flush writes made to the in-memory array back to the file with the msync(2) syscall; otherwise they may sit in memory until munmap(2), or until the kernel gets around to writing them back, and there may be a power outage or something before then!
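As a rough sketch of the msync step, assuming the array is written once through a shared writable mapping (write_array_durably is a name I made up):

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create (or truncate) a file holding n ints, fill it through a shared
 * writable mapping, and flush it to the file before unmapping.
 * Returns 0 on success, -1 on failure. */
static int write_array_durably(const char *path, const int *src, size_t n)
{
    assert(path != NULL && src != NULL && n > 0);
    size_t len = n * sizeof(int);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    /* The file must be at least as large as the mapping, or touching the
     * pages beyond its end raises SIGBUS. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    int *arr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (arr == MAP_FAILED)
        return -1;

    for (size_t i = 0; i < n; i++)
        arr[i] = src[i];

    /* MS_SYNC blocks until the dirty pages have been written back. */
    int rc = msync(arr, len, MS_SYNC);
    munmap(arr, len);
    return rc;
}
```

MS_ASYNC would only schedule the write-back instead of waiting for it, so MS_SYNC is the safer choice when the point is surviving a crash.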
If multiple processes are mmap'ing the same region shared with read and write permissions, make sure that only one of them writes to it at a time to avoid corrupting your data, for example by using file locking or some other means of synchronization.
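One way to do the locking, sketched here with flock(2); fcntl record locks or a process-shared mutex would be alternatives. The lock is advisory, so this only helps if every writer takes it. locked_add is a hypothetical helper, and the file is assumed to hold a plain array of int.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Add delta to element idx of an int array stored in the file at path while
 * holding an exclusive advisory lock. Stores the new value in *out.
 * Returns 0 on success, -1 on failure. */
static int locked_add(const char *path, size_t idx, int delta, int *out)
{
    assert(path != NULL && out != NULL);
    size_t len = (idx + 1) * sizeof(int);   /* map just enough to reach idx */

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || (size_t)st.st_size < len) {
        close(fd);                           /* idx is past the end of the file */
        return -1;
    }

    if (flock(fd, LOCK_EX) < 0) {            /* blocks until no one else holds it */
        close(fd);
        return -1;
    }

    int rc = -1;
    int *arr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (arr != MAP_FAILED) {
        arr[idx] += delta;
        *out = arr[idx];
        rc = msync(arr, len, MS_SYNC);       /* flush before releasing the lock */
        munmap(arr, len);
    }

    flock(fd, LOCK_UN);
    close(fd);
    return rc;
}
```

Remapping on every update is wasteful; a long-running writer would keep the mapping and the descriptor open and only take and release the lock around each write.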