What happens if my mmap virtual memory exceeds my computer's RAM?
Background and Use Case
I have around 30 GB of data that never changes: specifically, every dictionary of every language.
A client requests the definition of a word, and I simply respond with it.
On every request I run a search algorithm of my choice, so that I don't have to loop through the more than two hundred million words stored in my .txt file.
If I open the .txt file and read it in order to search for the word, it takes forever because of the size of the file (breaking the file into smaller files is not feasible either, nor is it what I want to do).
I came across the concept of mmap, which a very kind gentleman on Discord mentioned to me as a possible solution to my problem.
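As a minimal sketch of how such a lookup could work with mmap: the code below assumes the file holds one `word<TAB>definition` entry per line and is sorted by the UTF-8 bytes of the word, which the question does not state. The names `lookup` and `demo_lookup` are hypothetical, and the tiny temporary file stands in for the real 30 GB one.

```python
import mmap
import os
import tempfile


def lookup(mm, word):
    """Binary-search a byte-sorted 'word<TAB>definition' mapping for `word`."""
    key = word.encode("utf-8")
    lo, hi = 0, len(mm)  # byte offsets; lo always sits at the start of a line
    while lo < hi:
        mid = (lo + hi) // 2
        # Find the boundaries of the line that contains offset `mid`.
        start = mm.rfind(b"\n", 0, mid) + 1
        end = mm.find(b"\n", start)
        if end == -1:
            end = len(mm)
        entry, _, definition = mm[start:end].partition(b"\t")
        if entry == key:
            return definition.decode("utf-8")
        if entry < key:
            lo = end + 1  # continue in the lines after this one
        else:
            hi = start    # continue in the lines before this one
    return None


def demo_lookup(words):
    """Build a small sorted sample file (an assumption) and look up `words`."""
    entries = {"apple": "a fruit", "mmap": "a system call", "zebra": "an animal"}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dict.txt")
        with open(path, "wb") as f:
            for w in sorted(entries, key=lambda w: w.encode("utf-8")):
                f.write(f"{w}\t{entries[w]}\n".encode("utf-8"))
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                return [lookup(mm, w) for w in words]


print(demo_lookup(["mmap", "zebra", "missing"]))
```

Each lookup inspects roughly log2(number of lines) lines, so only the pages holding those lines need to be read from disk. A request with tens of words is simply a loop over `lookup` on the same mapping.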
Problem
As I was learning about mmap, I read that mmap does not store the data in RAM but in virtual memory. Regardless of which it is, my server or Docker instances may have no more than 64 GB of RAM, and having that chunk of data take up 30 GB of it is quite painful, which makes me feel there must be a better alternative. In the worst case, if my server or Docker container does not have enough RAM for the data mapped with mmap, then the approach is not feasible, unless I am wrong about how this works, which is why I am asking this question.
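One way to probe this concern is a small experiment. The sketch below is an illustration under assumptions, not a measurement: it maps a sparse temporary file and reads only its first and last byte. The function name `touch_sparse_mapping` and the 256 MiB size are made up for the example. It only shows that creating the mapping does not require reading the whole file first; to see actual RAM use, one would watch the process's resident memory in a tool such as `top` while running it with a larger file.

```python
import mmap
import os
import tempfile


def touch_sparse_mapping(size):
    """Map a file of `size` zero bytes and read only its first and last byte."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        with open(path, "wb") as f:
            f.truncate(size)  # extends the file with zeros; sparse on most filesystems
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                # Indexing touches only the pages that hold these two bytes.
                return len(mm), mm[0], mm[size - 1]


print(touch_sparse_mapping(256 * 1024 * 1024))
```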
Questions
Is there a better solution for my use case than mmap?
Will accessing such a large amount of data through mmap (so that I don't have to open and read the file on every request) allocate as much RAM as the size of the file I am accessing?
Lastly, if any specific statement I have made so far is wrong, please do correct me, as I am still learning a lot about mmap.
Requirements For My Specific Use Case
I may get a request from a single client that contains tens of words to look up, so I need to be able to retrieve lots of data from the .txt file efficiently.
The response to the client has to be as quick as possible; the quicker the better. Ideally it should take less than three seconds, or, if that is impossible, as little time as it can.