Why do MongoDB's memory-mapped files cause programs like top to show larger-than-normal numbers?
I am trying to wrap my head around the internals of MongoDB, and I keep reading about this:
http://www.theroadtosiliconvalley.com/technology/mongodb-mongo-nosql-db/
Why does this happen?
So the way memory-mapped files work is that addresses in memory are mapped byte for byte to a file on disk. This makes access really fast, but it also makes the mapping really large: imagine the data file on disk appearing to take up that much of your address space.
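To make that concrete, here is a minimal sketch of a byte-for-byte mapping using POSIX mmap (not MongoDB's actual code; the file name data.bin is made up). The whole file gets an address range in the process, which is exactly what top counts in the virtual size:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    /* Hypothetical data file; stands in for a MongoDB data file. */
    int fd = open("data.bin", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the whole file: every byte of the file now has a
       corresponding address in this process's virtual memory,
       so the full file size shows up in the process's virtual size. */
    char *data = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    /* Reads and writes are plain pointer operations -- no read()
       or write() system call per access. */
    data[0] = 'x';
    printf("first byte: %c\n", data[0]);

    munmap(data, st.st_size);
    close(fd);
    return 0;
}
```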
Why it's awesome
In practice, this rocks because writing and reading memory directly, instead of issuing a system call (think context switch), is fast. Also, in practice, it's fine that this huge memory-mapped chunk doesn't fit in your physical RAM. Why? Only your working set of data needs to fit in RAM, because unused pages are not loaded and just stay on disk. If one is needed, a page fault happens and it gets loaded. (I believe the portion that has been loaded is referred to as resident memory.)
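If you want to see residency for yourself, here's a rough sketch using Linux's mincore, which reports which pages of a mapping are currently in RAM (again assuming a hypothetical data.bin; this just illustrates the resident/non-resident distinction, it isn't anything Mongo-specific):

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    long page = sysconf(_SC_PAGESIZE);
    size_t pages = (st.st_size + page - 1) / page;
    unsigned char *vec = malloc(pages);

    /* Touch one page: this page-faults and pulls it into RAM. */
    volatile char c = data[0];
    (void)c;

    /* mincore reports, per page, whether it is currently resident. */
    if (mincore(data, st.st_size, vec) == 0) {
        size_t resident = 0;
        for (size_t i = 0; i < pages; i++)
            resident += vec[i] & 1;
        printf("%zu of %zu pages resident\n", resident, pages);
    }

    free(vec);
    munmap(data, st.st_size);
    close(fd);
    return 0;
}
```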
Why it kind of sucks
Files mapped into memory need to be page aligned, so if your data doesn't end exactly on a page boundary, the rest of that last page is wasted space (a small tradeoff).
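As a rough illustration of that waste, this little sketch (the 10,000-byte length is just an example) shows how a mapping's length gets rounded up to a whole number of pages:

```c
#include <stdio.h>
#include <unistd.h>

int main(void) {
    long page = sysconf(_SC_PAGESIZE);  /* typically 4096 */
    long len  = 10000;                  /* example file length */

    /* Mapped lengths are rounded up to a whole number of pages. */
    long mapped = ((len + page - 1) / page) * page;
    printf("file: %ld bytes, mapped: %ld bytes, wasted: %ld bytes\n",
           len, mapped, mapped - len);  /* 10000, 12288, 2288 */
    return 0;
}
```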
Summary (tl;dr)
It may look like it's taking up a lot of resources because it maps the entirety of your data to memory addresses, but that doesn't really matter, since that data isn't actually all being held in RAM. Mongo will pull in data as it needs it and use memory effectively to maintain a performant working set.