How expensive are database-wide queries on MongoDB?
I have a collection that contains user objects, each with a unique ID and some other stuff. This collection could have millions of entries. My question is how expensive would a query be that takes a list of say 300 UIDS, and then checks which of those exist in the collection?
Comments (1)
I think there are two parts to this question: #1 the query, and #2 the performance.
1: The query
This can easily be done using the $in clause.
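For example, in Python with pymongo it might look something like the sketch below. The database name ("mydb"), collection name ("users"), and ID field name ("uid") are placeholders, not anything taken from the question.

```python
# Minimal sketch: find which of a list of UIDs already exist in the collection.
# "mydb", "users", and the "uid" field name are assumptions for illustration.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
users = client["mydb"]["users"]

uids = ["u001", "u002", "u003"]  # up to ~300 IDs in the scenario above

# $in matches any document whose uid is in the list; project only the uid field.
cursor = users.find({"uid": {"$in": uids}}, {"uid": 1, "_id": 0})
existing = {doc["uid"] for doc in cursor}
missing = set(uids) - existing

print(f"{len(existing)} exist, {len(missing)} do not")
```

If there is an index on uid, projecting only that field (and excluding _id) lets MongoDB answer this as a covered query straight from the index.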
2: The performance
The thing about the $in clause is that there is only one logical way to do this from the DB perspective: it's basically going to do one index search for each item you have. Now if you follow standard protocol and keep all of your indexes in RAM, then this query is probably going to come in under a second or so. I have some beefy servers with 100s of millions of documents, and such a search for 100 "UIDS" comes back in about 500ms.
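If you want to check whether the "indexes in RAM" assumption holds for your own collection, a rough sketch (same placeholder names as above) might be:

```python
# Rough sketch: make sure the lookup is indexed and see how big the indexes are.
# "mydb"/"users"/"uid" are placeholder names, not from the original question.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["mydb"]
users = db["users"]

# UIDs are unique, so a unique index on the field is the natural choice.
users.create_index("uid", unique=True)

# Confirm the $in query walks the index instead of scanning the collection.
plan = users.find({"uid": {"$in": ["u001", "u002"]}}).explain()
print(plan["queryPlanner"]["winningPlan"])

# Compare total index size (bytes) against the RAM available to mongod.
stats = db.command("collStats", "users")
print("totalIndexSize:", stats["totalIndexSize"])
```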
YMMV. You may get better performance by chunking it out and running multiple simultaneous queries, just to ensure that you're getting multiple threads going on the server.
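A sketch of that chunking idea, again with placeholder names, could look like this; a single MongoClient is thread-safe and can be shared across the worker threads:

```python
# Sketch: split the UID list into chunks and query the chunks concurrently.
# Collection/field names and the chunk size are assumptions, not tested values.
from concurrent.futures import ThreadPoolExecutor
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # thread-safe, share one client
users = client["mydb"]["users"]

def find_existing(chunk):
    """Return the subset of this chunk of UIDs that exists in the collection."""
    return {doc["uid"] for doc in
            users.find({"uid": {"$in": chunk}}, {"uid": 1, "_id": 0})}

def check_uids(uids, chunk_size=50, workers=4):
    chunks = [uids[i:i + chunk_size] for i in range(0, len(uids), chunk_size)]
    existing = set()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(find_existing, chunks):
            existing |= result
    return existing, set(uids) - existing

existing, missing = check_uids([f"u{i:03d}" for i in range(300)])
print(len(existing), "exist;", len(missing), "missing")
```

Whether this beats a single $in of 300 IDs depends on the server, so it's worth benchmarking both.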