Are read and insert operations faster than a dump in MongoDB?

Posted on 2025-01-15 19:08:47

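I need to clean up a 200 TB MongoDB collection by deleting documents with old timestamps. I am trying to build a new collection from the current one and run the delete query there, because running a delete on the collection that is currently in use would slow down other requests to it. I have thought about cloning the collection either by taking a dump of it or by writing a read-and-write script that reads from the current collection and writes to the clone. My question: are batched read/write operations (for example, reading and writing 1000 documents at a time) faster than a dump?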

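Edit: I found this, this and this post, and I am wondering whether writing a script as described above is the same as creating an ssh pipe that reads and writes. For example, is a Node/Python script that fetches 1000 rows from the collection and inserts them into the cloned collection equivalent to ssh *** ". /etc/profile; mongodump -h sourceHost -d yourDatabase … | mongorestore -h targetHost -d yourDatabase ?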
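For reference, the read-and-write script described above could look roughly like the following minimal sketch. It is written against plain iterables and a callable so that it runs without a server; the names batched and copy_in_batches are hypothetical, and the pymongo wiring shown in the comments (including the collection name collection_old and the cutoff variable) is an assumption about the setup, not something taken from the posts mentioned.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def copy_in_batches(
    docs: Iterable[dict],
    insert_many: Callable[[List[dict]], object],
    batch_size: int = 1000,
) -> int:
    """Read from `docs` and hand each batch to `insert_many`.

    Returns the number of documents copied.
    """
    copied = 0
    for batch in batched(docs, batch_size):
        insert_many(batch)
        copied += len(batch)
    return copied


# Possible wiring with pymongo (assumed setup, not run here):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://...")["yourDatabase"]
#   cursor = db["collection_old"].find({"timestamp": {"$gt": cutoff}})
#   copy_in_batches(
#       cursor,
#       lambda batch: db["collection"].insert_many(batch, ordered=False),
#   )

if __name__ == "__main__":
    # In-memory stand-ins for the source cursor and the target collection.
    target: List[List[dict]] = []
    total = copy_in_batches(({"_id": i} for i in range(2500)), target.append)
    print(total, [len(b) for b in target])
```

Unlike a dump and restore, a script of this kind can filter while it reads, so the outdated documents never have to be written to the clone.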

白昼 2025-01-22 19:08:47

I would suggest this approach:

1. Rename the collection. As soon as your application tries to insert some data, a new, empty collection with the old name will be created. You may want to create some indexes on it.
2. Run mongoexport/mongoimport to import only the valid data, i.e. skip the outdated documents.

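Yes, in general mongodump/mongorestore might be faster. However, with mongoexport you can define a query and so limit the data that is exported. It could look like this, where renamedCollection is a placeholder for whatever name you gave the collection in step 1:

mongoexport --uri "..." --db=yourDatabase --collection=renamedCollection --query='{"timestamp": {"$gt": {"$date": "2022-01-01T00:00:00Z"}}}' | mongoimport --uri "..." --db=yourDatabase --collection=collection --numInsertionWorkers=10

Newer versions of the database tools require the query in Extended JSON v2, hence the quoted field names and the {"$date": ...} form instead of ISODate(...). Use the numInsertionWorkers parameter to run multiple insertion workers in parallel; it will speed up your inserts.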
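If the cutoff date changes from run to run, the query and the pipeline can be generated instead of typed by hand. The following is only a sketch: export_query and pipeline_command are hypothetical helper names, the URI and the collection name renamedCollection in the demo are placeholders, and it assumes a version of the database tools that expects --query as Extended JSON v2 (a date written as {"$date": ...}).

```python
import json
import shlex
from datetime import datetime, timezone


def export_query(field: str, cutoff: datetime) -> str:
    """Build an Extended JSON v2 filter for documents newer than `cutoff`.

    `cutoff` must be timezone-aware; it is converted to UTC.
    """
    if cutoff.tzinfo is None:
        raise ValueError("cutoff must be timezone-aware")
    stamp = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return json.dumps({field: {"$gt": {"$date": stamp}}})


def pipeline_command(
    uri: str,
    db: str,
    source_collection: str,
    target_collection: str,
    query: str,
    workers: int = 10,
) -> str:
    """Return a shell pipeline that exports matching documents and imports them."""
    export = [
        "mongoexport", "--uri", uri, "--db", db,
        "--collection", source_collection, "--query", query,
    ]
    load = [
        "mongoimport", "--uri", uri, "--db", db,
        "--collection", target_collection,
        "--numInsertionWorkers", str(workers),
    ]
    # shlex.join quotes each argument so the JSON survives the shell.
    return shlex.join(export) + " | " + shlex.join(load)


if __name__ == "__main__":
    query = export_query("timestamp", datetime(2022, 1, 1, tzinfo=timezone.utc))
    # Placeholder URI and collection names; replace with your own.
    print(pipeline_command(
        "mongodb://localhost:27017", "yourDatabase",
        "renamedCollection", "collection", query,
    ))
```

Printing the command rather than running it leaves room to check the generated filter before pointing it at a collection of this size.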

Do you run a sharded cluster? If so, you should pre-split the new collection with sh.splitAt(); see "How to copy a collection from one database to another in MongoDB".
