Mercurial: is it possible to compress the .hg folder into a few large BLOBs?
Issue: cloning the Mercurial repository over the network takes too much time (~12 minutes). We suspect it is because the .hg directory contains a lot of files (> 15,000).
We also have a git repository which is even larger, but clone performance is quite good - around 1 minute. It looks like that's because the .git folder transferred over the network contains only a few files (usually < 30).
Question: does Mercurial support "compressing the repository to a single blob", and if it does, how do I enable it?
Thanks
UPDATE
Mercurial version: 1.8.3
Access method: SAMBA share (\\server\path\to\repo)
Mercurial is installed on Linux box, accessed from Windows machines (by Windows domain login)
2 Answers
Mercurial uses compression to send data over the network (see http://hgbook.red-bean.com/read/behind-the-scenes.html#id358828), but by using Samba you bypass this mechanism entirely. Mercurial thinks the remote repository is on a local filesystem, and the mechanism used there is different.
The linked documentation clearly says that the data is compressed as a whole before being sent.
So you won't have the 15,000-file problem if you use a "real" network protocol.
BTW, I strongly recommend against using something like Samba to share your repository. It is really asking for various kinds of problems.
You can find information about publishing repositories on the wiki: PublishingRepositories (where you can see that Samba is not recommended at all).
And to answer the question: AFAIK, there's no way to compress the Mercurial metadata or otherwise reduce the number of files. But if the repository is published correctly, this won't be a problem anymore.
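As a concrete illustration of publishing the repository properly, it could be served over HTTP with Mercurial's built-in hgweb; the paths and port below are assumptions for the sketch, not taken from the question:

```
# hgweb.config -- maps URL paths to repositories on disk (illustrative paths)
[paths]
/therepo = /srv/hg/therepo
```

Running `hg serve --webdir-conf hgweb.config --port 8000` on the Linux server would then let the Windows clients do `hg clone http://server:8000/therepo`, which uses Mercurial's wire protocol (compressed changegroups) instead of copying 15,000 individual files over SMB.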
You could compress it to a blob by creating a bundle:
hg bundle --all \\server\therepo.bundle
hg clone \\server\therepo.bundle
hg log -R therepo.bundle
You do need to re-create or update the bundle periodically, but creating the bundle is fast and could be done in a hook on the server (e.g. after new changesets arrive) or nightly. (Any remaining changesets can be fetched by pulling from the repository after cloning from the bundle, if you set [paths] correctly in .hg/hgrc.)
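The [paths] setup mentioned above might look like this in the cloned repository's .hg/hgrc (the share path here just reuses the one from the question), so that a plain `hg pull` after the initial clone-from-bundle fetches the newer changesets from the live repository:

```
[paths]
default = \\server\path\to\repo
```

On the server side, a hook could refresh the bundle whenever changesets are pushed; the hook choice and bundle path below are assumptions:

```
[hooks]
# regenerate the full bundle after each incoming changegroup (illustrative path)
changegroup = hg bundle --all /path/to/share/therepo.bundle
```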
So, to answer your question about several blobs: you could create a bundle every X changesets and have the clients clone/unbundle each of those. (However, keeping a single bundle updated regularly, plus a normal pull for any remaining changesets, seems easier...)
However, since you're running Linux on the server anyway, I suggest running hg-ssh or hgweb.cgi. That's what we do, and it works well for us (with Windows clients).
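For the hg-ssh route, access is typically restricted through the server account's ~/.ssh/authorized_keys; the key, account name, and repository path below are placeholders:

```
command="hg-ssh repos/therepo",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa AAAA... user@example
```

Clients would then clone with `hg clone ssh://hg@server/repos/therepo`; hg-ssh only permits access to the repositories listed in the `command=` option.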