How do I download only the necessary parts of a remote project in Git?

Posted on 2024-10-02 10:30:51

If you are working on a large remote repository and you want to restrict the download to the few branches you are working on, how do you configure the git-clone command, assuming that it is the right command in this case?

Comments (2)

蘸点软妹酱 2024-10-09 10:30:51

Answer to the real question

Local clones with git don't generally take up a ton of extra space, because git will use hard links to share the object files. This is hard to notice - if you run du on each repo, you'll get the full size, but if you run it on the two together, you should see the savings. I'm going to assume you've already for some reason decided that this isn't good enough. Perhaps you're on a filesystem that doesn't support hardlinks, or the clones are on separate drives or something... who knows.
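To see the sharing for yourself, here is a minimal sketch that builds a throwaway repository, clones it from a local path, and compares disk usage. The directory names repo-a and repo-b and the demo identity are made up for illustration, and the sketch assumes both clones sit on the same filesystem and that it supports hard links.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)
cd "$tmp"

# Build a tiny throwaway repository (names and identity are placeholders).
git init -q repo-a
echo "hello" > repo-a/file.txt
git -C repo-a add file.txt
git -C repo-a -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"

# A clone from a local path hard-links the object files when it can.
git clone -q repo-a repo-b

# Measured separately, each object store reports its full size...
du -sh repo-a/.git/objects
du -sh repo-b/.git/objects

# ...but measured in one run, du counts each hard-linked file only once.
du -shc repo-a/.git/objects repo-b/.git/objects
```

On a repository this small the numbers are dominated by directory overhead; the difference only becomes obvious with a realistically sized object store.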

In any case, if you're looking to create a lightweight clone, saving some space, why not save all the space? There's a lovely script in git's contrib directory called git-new-workdir. It creates a new work directory from a repo, with the .git directory essentially all shared via symlinks - pretty much the only thing that isn't is HEAD. Drop the script somewhere on your PATH, and you'll be able to run it like a normal git command:

git new-workdir <original-repo> <new-workdir-path>

Voila! You now have two work trees, with a shared .git directory, so the only extra space you're taking up is the work tree files. No way around that if you want to be able to work!

The one thing you must avoid is checking out the same branch in both work trees. If you then commit to that branch in one of them, the other will become out of sync - its work tree and index won't match the commit that the branch is now at. Otherwise, you can happily work away in both!
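Git 2.5 and later also ship a built-in git worktree command that covers the same ground as the contrib script. It is not part of the original answer, so treat the following as a sketch of an alternative rather than the author's method. The repository name, branch name, and demo identity are placeholders.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)
cd "$tmp"

# Throwaway repository with one commit (names and identity are placeholders).
git init -q repo
echo "hello" > repo/file.txt
git -C repo add file.txt
git -C repo -c user.name=demo -c user.email=demo@example.com commit -q -m "initial commit"

# Add a second work tree on a new branch; it shares repo/.git's object store.
git -C repo worktree add -b feature "$tmp/repo-feature" >/dev/null 2>&1

# In the new work tree, .git is a small file pointing back at the main repository.
cat repo-feature/.git

# List every work tree attached to the repository.
git -C repo worktree list
```

By default git worktree add refuses to check out a branch that is already checked out in another work tree, which guards against the out-of-sync problem described above.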

Original answer

Let me first state that there is essentially no chance you want to do this. I'm serious. It's barely going to save you any disk space, while repositories with hard-linked objects (which is the default! you don't even have to do anything to get that!) will save you a ton.

In virtually every case, branches share most of their history. The potential for saving space is only in the small recent part in which they've diverged. Look at git log branchA..branchB. Those commits are the ones whose objects you will avoid copying. Are there any enormous binary files in there? Any 1000-line diffs? No? Then don't bother with this. It's not going to help you.
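If you would rather measure than eyeball the log, the check can be sketched roughly as follows. The demo repository and its file names are invented for illustration; in a real repository you would run only the last two commands against your own branch names. The sizes are uncompressed object sizes, so they overstate what would actually travel over the wire.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)
cd "$tmp"

# Demo repository with a shared commit and one branchB-only commit.
git init -q demo
cd demo
git config user.name demo
git config user.email demo@example.com

echo "shared" > shared.txt
git add shared.txt
git commit -q -m "shared history"
git branch -m branchA

git checkout -q -b branchB
echo "only on branchB" > extra.txt
git add extra.txt
git commit -q -m "branchB-only change"

# Commits reachable from branchB but not from branchA.
unique_commits=$(git rev-list --count branchA..branchB)
echo "commits unique to branchB: $unique_commits"

# Sum the uncompressed sizes of the objects those commits introduce.
unique_bytes=$(git rev-list --objects branchA..branchB |
  git cat-file --batch-check='%(objectsize) %(rest)' |
  awk '{ total += $1 } END { printf "%d\n", total }')
echo "uncompressed bytes unique to branchB: $unique_bytes"
```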

Still reading? Okay, well, I don't think git-clone lets you mess with the refspec (with the exception of --mirror, but that's obviously not what we're after here). If it's really important to do this, you could manage it by creating an empty repository and pulling, then carefully doing the rest of the setup the clone would've done:

mkdir foo && cd foo && git init
git remote add origin <url>
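# set up a refspec to get the branch(es) you want;
# this replaces the default wildcard refspec that "git remote add" created
git config remote.origin.fetch "+refs/heads/foo:refs/remotes/origin/foo"
# for each additional branch, add another refspec:
#   git config --add remote.origin.fetch "+refs/heads/<branch>:refs/remotes/origin/<branch>"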
git fetch origin

You've still got some config missing - in particular, you have a local master branch which isn't tracking anything.

This is a pretty strange setup, not grabbing all the branches from origin, but I suppose it should work. Of course, like I said in my comment, you may not be saving yourself a whole lot of trouble. Fetching other remote branches doesn't mean you have to create corresponding local branches, and unless those excluded branches diverge extremely from the ones you've grabbed (i.e. contain lots of unique content), you're not saving much bandwidth or disk space.
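Newer Git versions (1.7.10 and later) have since added a --single-branch option to git clone, which narrows the refspec in the same way as the manual steps above. The sketch below uses a local directory named upstream as a stand-in for the real remote; that name, the branch names foo and bar, and the demo identity are assumptions for illustration.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)
cd "$tmp"

# A stand-in "remote" with two branches, foo and bar.
git init -q upstream
git -C upstream config user.name demo
git -C upstream config user.email demo@example.com
echo "hello" > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "initial commit"
git -C upstream branch -m foo
git -C upstream branch bar

# Clone only foo; the fetch refspec is narrowed to that one branch.
git clone -q --single-branch --branch foo "$tmp/upstream" work
fetch_before=$(git -C work config --get-all remote.origin.fetch)
echo "$fetch_before"

# Later, widen the refspec to include bar and fetch it.
git -C work remote set-branches --add origin bar
git -C work fetch -q origin
git -C work branch -r
```

Against a real network remote, only the history reachable from the selected branch is transferred, so the earlier caveat still applies: the saving is limited to whatever is unique to the branches you leave out.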

巷子口的你 2024-10-09 10:30:51

If you're working on two branches in two separate directories, then you can set up one to be a clone of the other:

git clone http://remote/repo.git branch-a
git clone branch-a branch-b

Then, fix the origin remote in branch-b, which the local clone left pointing at branch-a:

cd branch-b
git remote set-url origin http://remote/repo.git

This way, the local repository information will be shared by hard links between the two directories, saving you some space compared to making two separate clones of the remote.
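A runnable sketch of this setup follows, with a local directory named remote standing in for http://remote/repo.git (an assumption so the example works offline). It repoints the existing origin with git remote set-url instead of removing and re-adding it, and it uses git remote get-url, which needs Git 2.7 or later.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)
cd "$tmp"

# Stand-in for the real remote (name and identity are placeholders).
git init -q remote
git -C remote config user.name demo
git -C remote config user.email demo@example.com
echo "hello" > remote/file.txt
git -C remote add file.txt
git -C remote commit -q -m "initial commit"

# First clone comes from the remote, second from the first clone.
git clone -q "$tmp/remote" branch-a
git clone -q branch-a branch-b

# branch-b's origin currently points at branch-a...
git -C branch-b remote get-url origin

# ...so repoint it at the real remote and confirm fetching works.
git -C branch-b remote set-url origin "$tmp/remote"
git -C branch-b remote get-url origin
git -C branch-b fetch -q origin
```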

Or, go buy a 1 TB drive, they're cheap.
