Scalable STL set-like container for C++
I need to store a large number of integers. The input stream of integers may contain duplicates; I only need to store the distinct values.
I was using an STL set (std::set) initially, but it ran out of memory when the number of input integers grew too large.
I am looking for a C++ container library that would let me store numbers under this requirement, possibly backed by a file, i.e. the container should not try to keep all the numbers in memory.
I don't need to store this data persistently; I just need to find the unique values in it.
Comments (4)
Take a look at STXXL; it might be what you're looking for.
Edit: I haven't used it myself, but according to the docs you could use stream::runs_creator to create sorted runs of your data (each as large as fits in memory), then stream::runs_merger to merge the sorted runs, and finally stream::unique to filter out the duplicates.
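The runs / merge / unique pipeline described in this answer is essentially an external merge sort followed by deduplication. Here is a minimal sketch of that idea in plain standard C++, in case it helps to see the moving parts. It is not STXXL's API: external_unique, write_run and run_capacity are hypothetical names made up for illustration, values are assumed to fit in 64 bits, and the temporary files are assumed to live on a disk with enough free space.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <queue>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper: sort one in-memory chunk, drop duplicates inside it,
// and spill it to an anonymous temporary file (removed when closed).
static std::FILE* write_run(std::vector<std::int64_t>& chunk) {
    std::sort(chunk.begin(), chunk.end());
    chunk.erase(std::unique(chunk.begin(), chunk.end()), chunk.end());
    std::FILE* f = std::tmpfile();
    if (f == nullptr) throw std::runtime_error("cannot create temporary file");
    if (std::fwrite(chunk.data(), sizeof(std::int64_t), chunk.size(), f) != chunk.size())
        throw std::runtime_error("short write to temporary file");
    std::rewind(f);
    chunk.clear();  // keeps the capacity, so the buffer is reused for the next run
    return f;
}

// next(x) stores the next input value in x and returns false at end of input.
// emit(x) receives every distinct value exactly once, in ascending order.
// At most run_capacity values are held in memory during the first phase.
template <typename Source, typename Sink>
void external_unique(Source next, Sink emit, std::size_t run_capacity) {
    assert(run_capacity > 0);

    // Phase 1: cut the input into sorted, duplicate-free runs on disk.
    std::vector<std::FILE*> runs;
    std::vector<std::int64_t> chunk;
    chunk.reserve(run_capacity);
    std::int64_t v = 0;
    while (next(v)) {
        chunk.push_back(v);
        if (chunk.size() == run_capacity) runs.push_back(write_run(chunk));
    }
    if (!chunk.empty()) runs.push_back(write_run(chunk));

    // Phase 2: k-way merge with a min-heap holding one value per run.
    using Item = std::pair<std::int64_t, std::size_t>;  // (value, run index)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> heap;
    for (std::size_t i = 0; i < runs.size(); ++i)
        if (std::fread(&v, sizeof v, 1, runs[i]) == 1) heap.push({v, i});

    bool have_last = false;
    std::int64_t last = 0;
    while (!heap.empty()) {
        Item top = heap.top();
        heap.pop();
        // Equal values come out of the heap back to back, so comparing with
        // the previously emitted value is enough to filter duplicates.
        if (!have_last || top.first != last) {
            emit(top.first);
            last = top.first;
            have_last = true;
        }
        if (std::fread(&v, sizeof v, 1, runs[top.second]) == 1)
            heap.push({v, top.second});
    }
    for (std::FILE* f : runs) std::fclose(f);
}
```

To use it, pass one callable that reads the next integer from your input and one that consumes each distinct value (counting it, writing it to a file, and so on). This sketch keeps every run file open during the merge, so the number of runs is bounded by the platform's open-file limit; a library such as STXXL would be expected to handle that and the I/O buffering far better.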
Since you need more storage than your RAM allows, you might look at memcached.
Have you considered using a database (maybe SQLite)? Or would that be too slow?
You should at least seriously try a database before concluding that it is too slow. All you need is one of the lightweight key-value stores. In the past I have used Berkeley DB, but here is a list of other ones.