How do you manage a large git repository?
One of our git repositories is large enough that a git clone takes an annoying amount of time (more than a few minutes). The .git directory is ~800M. Cloning always happens over ssh on a 100Mbps LAN. Even cloning over ssh to localhost takes more than a few minutes.
Yes, we store data and binary blobs in the repository.
Short of moving those out, is there another way of making it faster?
Even if moving the large files out were an option, how could we do it without the major disruption of rewriting everyone's history?
Comments (4)
I faced the same situation with a ~1GB repository, needing to be transferred over DSL. I went with the oft-forgotten sneakernet: putting it on a flash drive and driving it across town in my car. That isn't practical in every situation, but you really only have to do it for the initial clone. After that, the transfers are fairly reasonable.
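A sketch of that initial sneakernet transfer using git bundle, which packs the whole repository (all refs plus history) into a single file you can copy onto a flash drive. The src-demo/dst-demo repository names here are placeholders; the throwaway repo stands in for the real ~1GB one:

```shell
# Throwaway repository standing in for the real one.
git init -q src-demo
git -C src-demo -c user.email=you@example.com -c user.name=You \
    commit -q --allow-empty -m "initial commit"

# Pack HEAD and every ref, with full history, into one file
# that can travel on a flash drive.
git -C src-demo bundle create repo.bundle HEAD --all

# On the receiving machine, a bundle can be cloned like a remote;
# later transfers are then ordinary incremental fetches over the network.
git clone -q src-demo/repo.bundle dst-demo
```

After the clone, pointing the origin remote back at the real server means only new objects travel over the DSL link from then on.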
I'm fairly sure you're not going to be able to move those binary files out without rewriting history.
Depending on what the binaries are (maybe some pre-built libraries or whatever), you could have a little script for the developer to run post-checkout which downloads them.
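A minimal sketch of such a script as a post-checkout hook (saved as .git/hooks/post-checkout). The assets/ directory, the blobs.list manifest, and the server URL are all hypothetical names for illustration, not part of any real setup:

```shell
#!/bin/sh
# Sketch of a post-checkout hook that downloads large binaries from an
# external server instead of versioning them in git. BLOB_DIR, blobs.list,
# and BLOB_URL are hypothetical names.
BLOB_DIR="assets"
BLOB_URL="https://example.com/blobs"   # hypothetical artifact server

fetch_blobs() {
    mkdir -p "$BLOB_DIR"
    [ -f blobs.list ] || return 0          # no manifest, nothing to fetch
    while IFS= read -r name; do
        # download only the files we don't already have locally
        [ -f "$BLOB_DIR/$name" ] || curl -fsS -o "$BLOB_DIR/$name" "$BLOB_URL/$name"
    done < blobs.list
}

fetch_blobs
```

Skipping files that already exist keeps repeat checkouts cheap; developers only pay the download cost when a binary is new or deleted.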
Gigabit... fiber...

Without rewriting history, you are fairly limited.

You can try a git gc; it may clean it up a bit, but I'm not sure if that is done with a clone anyway.
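A rough sketch of trying git gc, using a throwaway repository here (in practice you would run the same gc command inside the real repo) and comparing the size of .git before and after to see whether repacking helps:

```shell
# Throwaway repository; run the same gc inside your real repo.
git init -q gc-demo
git -C gc-demo -c user.email=you@example.com -c user.name=You \
    commit -q --allow-empty -m "initial commit"

du -sh gc-demo/.git                          # size before repacking
git -C gc-demo gc --aggressive --prune=now   # repack, drop unreachable objects
du -sh gc-demo/.git                          # size after
```

Note that --aggressive can be very slow on large repositories; a plain git gc is usually enough.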
Check this answer: Will git-rm --cached delete another user's working tree files when they pull

This measure, together with adding patterns to .gitignore, should help you keep those big files out.
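That git rm --cached plus .gitignore step can be sketched as follows. The untrack-demo repository and big-data.bin file are placeholder names; in a real repo you would run only the rm/ignore/commit commands:

```shell
# Throwaway repository with one tracked binary (placeholder names).
git init -q untrack-demo
printf 'binary payload' > untrack-demo/big-data.bin
git -C untrack-demo add big-data.bin
git -C untrack-demo -c user.email=you@example.com -c user.name=You \
    commit -qm "add blob"

# Drop the file from the index but keep the copy on disk...
git -C untrack-demo rm -q --cached big-data.bin
# ...and ignore it so a later 'git add .' does not re-add it.
echo 'big-data.bin' >> untrack-demo/.gitignore
git -C untrack-demo add .gitignore
git -C untrack-demo -c user.email=you@example.com -c user.name=You \
    commit -qm "stop tracking big-data.bin"
```

The blob still exists in earlier commits, so this stops the repository from growing further but does not shrink the history that clones already transfer.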