How to make shallow git submodules?

Posted on 2024-08-19 15:36:22

Is it possible to have shallow submodules? I have a superproject with several submodules, each with a long history, so it gets unnecessarily big dragging all that history.

All I have found is this unanswered thread.

Should I just hack git-submodule to implement this?

Comments (10)

阪姬 2024-08-26 15:36:22

TL;DR

git clone --recurse-submodules --shallow-submodules

(But see the caveat in Ciro Santilli's answer.)
Or record that a submodule should be shallow-cloned:

git config -f .gitmodules submodule.<name>.shallow true

This means the next git clone --recurse-submodules will shallow-clone the submodule '<name>' (depth 1), even without --shallow-submodules.


What follows is the evolution of git submodule/git clone when it comes to shallow clones, starting (in 2013) with Git 1.8.4, and going from there.


New in the upcoming Git 1.8.4 (July 2013):

"git submodule update" can optionally clone the submodule repositories shallowly.

(And Git 2.10, Q3 2016, lets you record that with git config -f .gitmodules submodule.<name>.shallow true.
See the end of this answer.)

See commit 275cd184d52b5b81cb89e4ec33e540fb2ae61c1f:

Add the --depth option to the add and update commands of "git submodule", which is then passed on to the clone command. This is useful when the submodule(s) are huge and you're not really interested in anything but the latest commit.

Tests are added and some indention adjustments were made to conform to the rest of the testfile on "submodule update can handle symbolic links in pwd".

Signed-off-by: Fredrik Gustafsson
Acked-by: Jens Lehmann

That means this works:

# add shallow submodule
git submodule add --depth 1 <repo-url> <path>
git config -f .gitmodules submodule.<path>.shallow true

# later unshallow
git config -f .gitmodules submodule.<path>.shallow false
git submodule update <path>

The commands can be run in any order. The git submodule command performs the actual clone (using depth 1 this time), and the git config command makes the option permanent for other people who will clone the repo recursively later.

As an example, suppose you have the repo https://github.com/foo/bar and you want to add https://github.com/lorem/ipsum as a submodule, in your repo at path/to/submodule. The commands may look like the following:

git submodule add --depth 1 git@github.com:lorem/ipsum.git path/to/submodule
git config -f .gitmodules submodule.path/to/submodule.shallow true

The following achieves the same result (opposite order):

git config -f .gitmodules submodule.path/to/submodule.shallow true
git submodule add --depth 1 git@github.com:lorem/ipsum.git path/to/submodule

The next time someone runs git clone --recursive git@github.com:foo/bar.git, it will pull in the whole history of https://github.com/foo/bar, but it will only shallow-clone the submodule as expected.

With:

--depth

This option is valid for add and update commands.
Create a 'shallow' clone with a history truncated to the specified number of revisions.
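
The add-then-record sequence described above can be tried end to end without touching any real remote. The following is only a sketch under stated assumptions: the repositories ipsum and bar are throwaway local stand-ins created in a temporary directory, the helper g is a hypothetical wrapper, and protocol.file.allow=always is passed only because newer Git versions refuse file:// submodule URLs by default. It also assumes a reasonably recent Git that provides git rev-parse --is-shallow-repository.

```shell
#!/bin/sh
# Sketch: record "shallow = true" for a submodule and confirm that a later
# recursive clone fetches only one commit of it. All paths are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Identity and file-transport settings used only for this throwaway demo.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
g() { git -c protocol.file.allow=always "$@"; }

# 1. A stand-in submodule repository with three commits of history.
git init -q ipsum
for n in 1 2 3; do
    echo "$n" > ipsum/file.txt
    git -C ipsum add file.txt
    git -C ipsum commit -q -m "commit $n"
done

# 2. A stand-in superproject that adds it and records the shallow setting.
git init -q bar
g -C bar submodule --quiet add "file://$work/ipsum" path/to/submodule
git -C bar config -f .gitmodules submodule.path/to/submodule.shallow true
git -C bar add .gitmodules
git -C bar commit -q -m "add shallow submodule"

# 3. What someone else sees when cloning recursively.
g clone -q --recurse-submodules "file://$work/bar" clone
sub_is_shallow=$(git -C clone/path/to/submodule rev-parse --is-shallow-repository)
sub_commits=$(git -C clone/path/to/submodule rev-list --count HEAD)
super_is_shallow=$(git -C clone rev-parse --is-shallow-repository)
echo "submodule shallow: $sub_is_shallow ($sub_commits commit)"
echo "superproject shallow: $super_is_shallow"
```

In this sketch the superproject records the tip of the submodule's default branch, which is the case where depth 1 is expected to be enough.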


atwyman adds in the comments:

As far as I can tell this option isn't usable for submodules which don't track master very closely. If you set depth 1, then submodule update can only ever succeed if the submodule commit you want is the latest master. Otherwise you get "fatal: reference is not a tree".

That is true.
That is, until git 2.8 (March 2016). With 2.8, the submodule update --depth has one more chance to succeed, even if the SHA1 is directly reachable from one of the remote repo HEADs.

See commit fb43e31 (24 Feb 2016) by Stefan Beller (stefanbeller).
Helped-by: Junio C Hamano (gitster).
(Merged by Junio C Hamano -- gitster -- in commit 9671a76, 26 Feb 2016)

submodule: try harder to fetch needed sha1 by direct fetching sha1

When reviewing a change that also updates a submodule in Gerrit, a common review practice is to download and cherry-pick the patch locally to test it.
However when testing it locally, the 'git submodule update' may fail fetching the correct submodule sha1 as the corresponding commit in the submodule is not yet part of the project history, but also just a proposed change.

If $sha1 was not part of the default fetch, we try to fetch the $sha1 directly. Some servers however do not support direct fetch by sha1, which leads git-fetch to fail quickly.
We can fail ourselves here as the still missing sha1 would lead to a failure later in the checkout stage anyway, so failing here is as good as we can get.


MVG points out in the comments to commit fb43e31 (git 2.9, Feb 2016)

It would seem to me that commit fb43e31 requests the missing commit by SHA1 id, so the uploadpack.allowReachableSHA1InWant and uploadpack.allowTipSHA1InWant settings on the server will probably affect whether this works.
I wrote a post to the git list today, pointing out how the use of shallow submodules could be made to work better for some scenarios, namely if the commit is also a tag.
Let's wait and see.

I guess this is a reason why fb43e31 made the fetch for a specific SHA1 a fallback after the fetch for the default branch.
Nevertheless, in case of “--depth 1” I think it would make sense to abort early: if none of the listed refs matches the requested one, and asking by SHA1 isn't supported by the server, then there is no point in fetching anything, since we won't be able to satisfy the submodule requirement either way.
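
To make the server-side dependency concrete, here is a minimal local sketch of a by-SHA1 fetch at depth 1. The repositories server and client are placeholders created in a temporary directory; the only setting taken from the discussion above is uploadpack.allowReachableSHA1InWant, which is enabled on the stand-in server so that the request is permitted. How a server behaves without that setting depends on its Git version and protocol, so the sketch does not try to show the failing case.

```shell
#!/bin/sh
# Sketch: fetch one non-tip commit by SHA-1 at depth 1 from a "server" that
# permits it. Repository names are placeholders for this local demo.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "Server" side: three commits; allow wants for reachable, unadvertised commits.
git init -q server
for n in 1 2 3; do
    echo "$n" > server/file.txt
    git -C server add file.txt
    git -C server commit -q -m "commit $n"
done
git -C server config uploadpack.allowReachableSHA1InWant true
wanted=$(git -C server rev-parse HEAD~1)   # not pointed to by any ref

# "Client" side: ask for exactly that commit, with no history behind it.
git init -q client
git -C client fetch -q --depth 1 "file://$work/server" "$wanted"
fetched=$(git -C client rev-parse FETCH_HEAD)
depth=$(git -C client rev-list --count FETCH_HEAD)
echo "wanted $wanted, fetched $fetched, commits available: $depth"
```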


Update August 2016 (3 years later)

With Git 2.10 (Q3 2016), you will be able to do

 git config -f .gitmodules submodule.<name>.shallow true

See "Git submodule without extra weight" for more.


Git 2.13 (Q2 2017) adds the following to the documentation, in commit 8d3047c (19 Apr 2017) by Sebastian Schuberth (sschuberth).
(Merged by Sebastian Schuberth -- sschuberth -- in commit 8d3047c, 20 Apr 2017)

a clone of this submodule will be performed as a shallow clone (with a history depth of 1)

However, Ciro Santilli adds in the comments (and details in his answer)

shallow = true on .gitmodules only affects the reference tracked by the HEAD of the remote when using --recurse-submodules, even if the target commit is pointed to by a branch, and even if you put branch = mybranch on the .gitmodules as well.


Git 2.20 (Q4 2018) improves on the submodule support, which has been updated to read from the blob at HEAD:.gitmodules when the .gitmodules file is missing from the working tree.

See commit 2b1257e, commit 76e9bdc (25 Oct 2018), and commit b5c259f, commit 23dd8f5, commit b2faad4, commit 2502ffc, commit 996df4d, commit d1b13df, commit 45f5ef3, commit bcbc780 (05 Oct 2018) by Antonio Ospite (ao2).
(Merged by Junio C Hamano -- gitster -- in commit abb4824, 13 Nov 2018)

submodule: support reading .gitmodules when it's not in the working tree

When the .gitmodules file is not available in the working tree, try
using the content from the index and from the current branch.
This covers the case when the file is part of the repository but for some
reason it is not checked out, for example because of a sparse checkout.

This makes it possible to use at least the 'git submodule' commands
which read the gitmodules configuration file without fully populating
the working tree.

Writing to .gitmodules will still require that the file is checked out,
so check for that before calling config_set_in_gitmodules_file_gently.

Add a similar check also in git-submodule.sh::cmd_add() to anticipate the eventual failure of the "git submodule add" command when .gitmodules is not safely writeable; this prevents the command from leaving the repository in a spurious state (e.g. the submodule repository was cloned but .gitmodules was not updated because config_set_in_gitmodules_file_gently failed).

Moreover, since config_from_gitmodules() now accesses the global object
store, it is necessary to protect all code paths which call the function
against concurrent access to the global object store.
Currently this only happens in builtin/grep.c::grep_submodules(), so call
grep_read_lock() before invoking code involving config_from_gitmodules().

NOTE: there is one rare case where this new feature does not work
properly yet: nested submodules without .gitmodules in their working tree.


Note: Git 2.24 (Q4 2019) fixes a possible segfault when shallow-cloning a submodule.

See commit ddb3c85 (30 Sep 2019) by Ali Utku Selen (auselen).
(Merged by Junio C Hamano -- gitster -- in commit 678a9ca, 09 Oct 2019)


Git 2.25 (Q1 2020) clarifies the git submodule update documentation.

See commit f0e58b3 (24 Nov 2019) by Philippe Blain (phil-blain).
(Merged by Junio C Hamano -- gitster -- in commit ef61045, 05 Dec 2019)

doc: mention that 'git submodule update' fetches missing commits

Helped-by: Junio C Hamano
Helped-by: Johannes Schindelin
Signed-off-by: Philippe Blain

'git submodule update' will fetch new commits from the submodule remote if the SHA-1 recorded in the superproject is not found. This was not mentioned in the documentation.


Warning: With Git 2.25 (Q1 2020), the interaction between "git clone --recurse-submodules" and alternate object store was ill-designed.

The documentation and code have been taught to make more clear recommendations when the users see failures.

See commit 4f3e57e, commit 10c64a0 (02 Dec 2019) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano -- gitster -- in commit 5dd1d59, 10 Dec 2019)

submodule--helper: advise on fatal alternate error

Signed-off-by: Jonathan Tan
Acked-by: Jeff King

When recursively cloning a superproject with some shallow modules defined in its .gitmodules, then recloning with "--reference=<path>", an error occurs. For example:

git clone --recurse-submodules --branch=master -j8 \
  https://android.googlesource.com/platform/superproject \
  master
git clone --recurse-submodules --branch=master -j8 \
  https://android.googlesource.com/platform/superproject \
  --reference master master2

fails with:

fatal: submodule '<snip>' cannot add alternate: reference repository
'<snip>' is shallow

When an alternate computed from the superproject's alternate cannot be added, whether in this case or another, advise about configuring the "submodule.alternateErrorStrategy" configuration option and using "--reference-if-able" instead of "--reference" when cloning.

That is detailed in the accompanying documentation commit:

Doc: explain submodule.alternateErrorStrategy

Signed-off-by: Jonathan Tan
Acked-by: Jeff King

Commit 31224cbdc7 ("clone: recursive and reference option triggers submodule alternates", 2016-08-17, Git v2.11.0-rc0 -- merge listed in batch #1) taught Git to support the configuration options "submodule.alternateLocation" and "submodule.alternateErrorStrategy" on a superproject.

If "submodule.alternateLocation" is configured to "superproject" on a superproject, whenever a submodule of that superproject is cloned, it instead computes the analogous alternate path for that submodule from $GIT_DIR/objects/info/alternates of the superproject, and references it.

The "submodule.alternateErrorStrategy" option determines what happens if that alternate cannot be referenced.
However, it is not clear that the clone proceeds as if no alternate was specified when that option is not set to "die" (as can be seen in the tests in 31224cbdc7).
Therefore, document it accordingly.

The config submodule documentation now includes:

submodule.alternateErrorStrategy::

Specifies how to treat errors with the alternates for a submodule as computed via submodule.alternateLocation.
Possible values are ignore, info, die.
Default is die.
Note that if set to ignore or info, and if there is an error with the computed alternate, the clone proceeds as if no alternate was specified.
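
As an illustration of the two options documented above, the following sketch sets them in a throwaway repository and reads them back. The temporary repository stands in for a real superproject; the values superproject and info are taken from the documentation quoted above.

```shell
#!/bin/sh
# Sketch: make alternate errors non-fatal in a superproject (throwaway repo).
set -e
repo=$(mktemp -d)
git init -q "$repo"

# Derive each submodule's alternate from the superproject's own alternates...
git -C "$repo" config submodule.alternateLocation superproject
# ...and only report, rather than die, when that alternate cannot be used.
git -C "$repo" config submodule.alternateErrorStrategy info

strategy=$(git -C "$repo" config --get submodule.alternateErrorStrategy)
location=$(git -C "$repo" config --get submodule.alternateLocation)
echo "alternateLocation=$location alternateErrorStrategy=$strategy"
```

When cloning, the commit message above suggests pairing this with --reference-if-able instead of --reference, so that an unusable reference repository is skipped rather than treated as fatal.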


Note: "git submodule update --quiet" did not propagate the quiet option down to the underlying git fetch, which has been corrected with Git 2.32 (Q2 2021).

See commit 62af4bd (30 Apr 2021) by Nicholas Clark (nwc10).
(Merged by Junio C Hamano -- gitster -- in commit 74339f8, 11 May 2021)

submodule update: silence underlying fetch with "--quiet"

Signed-off-by: Nicholas Clark

Commands such as

$ git submodule update --quiet --init --depth=1

involving shallow clones, call the shell function fetch_in_submodule, which in turn invokes git fetch.
Pass the --quiet option onward there.

早乙女 2024-08-26 15:36:22

Git 2.9.0 supports shallow-cloning submodules directly, so now you can just call:

git clone url://to/source/repository --recursive --shallow-submodules
淡淡的优雅 2024-08-26 15:36:22

Following Ryan's answer I was able to come up with this simple script which iterates through all submodules and shallow clones them:

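#!/bin/bash
# Shallow-clone every submodule listed in .gitmodules.
# Assumes each submodule's name equals its path (the default for git submodule add).
git submodule init
for i in $(git submodule | sed -e 's/.* //'); do
    spath=$(git config -f .gitmodules --get "submodule.$i.path")
    surl=$(git config -f .gitmodules --get "submodule.$i.url")
    git clone --depth 1 "$surl" "$spath"
done
git submodule update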
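
The loop above looks entries up by path, which works when a submodule's name and path are the same. A variation that reads the names from .gitmodules itself could look like the sketch below. The function name list_submodules and the sample entry are hypothetical, and the sketch does not handle names or paths that contain whitespace.

```shell
#!/bin/sh
# Sketch: list "name<TAB>path<TAB>url" for every entry in a .gitmodules file.
# Does not handle names or paths containing whitespace.
list_submodules() {
    gm=$1
    git config -f "$gm" --get-regexp '^submodule\..*\.path$' |
    while read -r key path; do
        name=${key#submodule.}   # strip the section prefix
        name=${name%.path}       # strip the variable suffix
        url=$(git config -f "$gm" --get "submodule.$name.url")
        printf '%s\t%s\t%s\n' "$name" "$path" "$url"
    done
}

# Demo on a sample file (hypothetical entry).
sample=$(mktemp)
cat > "$sample" <<'EOF'
[submodule "ipsum"]
    path = path/to/submodule
    url = https://github.com/lorem/ipsum.git
EOF
listing=$(list_submodules "$sample")
echo "$listing"
```

Each output line can then feed a git clone --depth 1 of the url into the path, as in the script above.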
故乡的云 2024-08-26 15:36:22

Summary of buggy / unexpected / annoying behaviour as of Git 2.14.1

  1. shallow = true in .gitmodules only affects git clone --recurse-submodules if the HEAD of the remote submodule points to the required commit, even if the target commit is pointed to by a branch, and even if you put branch = mybranch on the .gitmodules as well.

    Local test script. Same behaviour on GitHub 2017-11, where HEAD is controlled by the default branch repo setting:

    git clone --recurse-submodules https://github.com/cirosantilli/test-shallow-submodule-top-branch-shallow
    cd test-shallow-submodule-top-branch-shallow/mod
    git log
    # Multiple commits, not shallow.
    
  2. git clone --recurse-submodules --shallow-submodules fails if the commit is referenced by neither a branch nor a tag, with the message: error: Server does not allow request for unadvertised object.

    Local test script. Same behaviour on GitHub:

    git clone --recurse-submodules --shallow-submodules https://github.com/cirosantilli/test-shallow-submodule-top-sha
    # error
    

    I also asked on the mailing list: https://marc.info/?l=git&m=151863590026582&w=2 and the reply was:

    In theory this should be easy. :)

    In practice not so much, unfortunately. This is because cloning will just obtain
    the latest tip of a branch (usually master). There is no mechanism in clone
    to specify the exact sha1 that is wanted.

    The wire protocol supports for asking exact sha1s, so that should be covered.
    (Caveat: it only works if the server operator enables
    uploadpack.allowReachableSHA1InWant which github has not AFAICT)

    git-fetch allows to fetch arbitrary sha1, so as a workaround you can run a fetch
    after the recursive clone by using "git submodule update" as that will use
    fetches after the initial clone.

TODO test: allowReachableSHA1InWant.

尾戒 2024-08-26 15:36:22

Reading through the git-submodule "source", it looks like git submodule add can handle submodules that already have their repositories present. In that case...

$ git clone $remote1 $repo
$ cd $repo
$ git clone --depth 5 $remotesub1 $sub1
$ git submodule add $remotesub1 $sub1
#repeat as necessary...

You'll want to make sure the required commit is in the submodule repo, so make sure you set an appropriate --depth.

Edit: You may be able to get away with multiple manual submodule clones followed by a single update:

$ git clone $remote1 $repo
$ cd $repo
$ git clone --depth 5 $remotesub1 $sub1
#repeat as necessary...
$ git submodule update
猫七 2024-08-26 15:36:22

Are the canonical locations for your submodules remote? If so, are you OK with cloning them once? In other words, do you want the shallow clones just because you are suffering the wasted bandwidth of frequent submodule (re)clones?

If you want shallow clones to save local diskspace, then Ryan Graham's answer seems like a good way to go. Manually clone the repositories so that they are shallow. If you think it would be useful, adapt git submodule to support it. Send an email to the list asking about it (advice for implementing it, suggestions on the interface, etc.). In my opinion, the folks there are quite supportive of potential contributors that earnestly want to enhance Git in constructive ways.

If you are OK with doing one full clone of each submodule (plus later fetches to keep them up to date), you might try using the --reference option of git submodule update (it is in Git 1.6.4 and later) to refer to local object stores (e.g. make --mirror clones of the canonical submodule repositories, then use --reference in your submodules to point to these local clones). Just be sure to read about git clone --reference/git clone --shared before using --reference. The only likely problem with referencing mirrors would be if they ever end up fetching non-fast-forward updates (though you could enable reflogs and expand their expiration windows to help retain any abandoned commits that might cause a problem). You should not have any problems as long as

  • you do not make any local submodule commits, or
  • any commits that are left dangling by non-fast-forwards that the canonical repositories might publish are not ancestors to your local submodule commits, or
  • you are diligent about keeping your local submodule commits rebased on top of whatever non-fast-forwards might be published in the canonical submodule repositories.

If you go with something like this and there is any chance that you might carry local submodule commits in your working trees, it would probably be a good idea to create an automated system that makes sure critical objects referenced by the checked-out submodules are not left dangling in the mirror repositories (and if any are found, copies them to the repositories that need them).

And, like the git clone manpage says, do not use --reference if you do not understand these implications.

# Full clone (mirror), done once.
git clone --mirror $sub1_url $path_to_mirrors/$sub1_name.git
git clone --mirror $sub2_url $path_to_mirrors/$sub2_name.git

# Reference the full clones any time you initialize a submodule
git clone $super_url super
cd super
git submodule update --init --reference $path_to_mirrors/$sub1_name.git $sub1_path_in_super
git submodule update --init --reference $path_to_mirrors/$sub2_name.git $sub2_path_in_super

# To avoid extra packs in each of the superprojects' submodules,
#   update the mirror clones before any pull/merge in super-projects.
for p in $path_to_mirrors/*.git; do GIT_DIR="$p" git fetch; done

cd super
git pull             # merges in new versions of submodules
git submodule update # update sub refs, checkout new versions,
                     #   but no download since they reference the updated mirrors
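
The mechanism behind --reference can be seen on a plain clone. The sketch below is not the submodule workflow itself: canonical, mirrors/canonical.git and checkout are throwaway local names, and it uses git clone --reference directly to show that the working clone ends up pointing at the mirror's object store through objects/info/alternates.

```shell
#!/bin/sh
# Sketch: a clone that borrows objects from a local mirror via --reference.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the canonical submodule repository.
git init -q canonical
echo hello > canonical/file.txt
git -C canonical add file.txt
git -C canonical commit -q -m "initial"

# Full mirror, made once.
mkdir mirrors
git clone -q --mirror "file://$work/canonical" mirrors/canonical.git

# Working clone that references the mirror's object store.
git clone -q --reference "$work/mirrors/canonical.git" \
    "file://$work/canonical" checkout
alternates=$(cat checkout/.git/objects/info/alternates)
echo "borrowing objects from: $alternates"
```

Because the working clone depends on the mirror's objects, the warnings above about dangling commits in the mirror apply to it as well.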

Alternatively, instead of --reference, you could use the mirror clones in combination with the default hardlinking functionality of git clone by using local mirrors as the source for your submodules. In new super-project clones, do git submodule init, edit the submodule URLs in .git/config to point to the local mirrors, then do git submodule update. You would need to reclone any existing checked-out submodules to get the hardlinks. You would save bandwidth by only downloading once into the mirrors, then fetching locally from those into your checked-out submodules. The hard linking would save disk space (although fetches would tend to accumulate and be duplicated across multiple instances of the checked-out submodules' object stores; you could periodically reclone the checked-out submodules from the mirrors to regain the disk space saving provided by hardlinking).

謸气贵蔟 2024-08-26 15:36:22

With reference to "How to clone git repository with specific revision/changeset?":

I have written a simple script that works even when the commit your submodule references is not the tip of master:

git submodule foreach --recursive 'git rev-parse HEAD | xargs -I {} git fetch origin {} && git reset --hard FETCH_HEAD'

This statement will fetch the referenced version of submodule.

It is fast, but you cannot commit edits in the submodule until you unshallow it (see https://stackoverflow.com/a/17937889/3156509).

in full:

#!/bin/bash
git submodule init
git submodule foreach --recursive 'git rev-parse HEAD | xargs -I {} git fetch origin {} && git reset --hard FETCH_HEAD'
git submodule update --recursive
神经大条 2024-08-26 15:36:22

I created a slightly different version for superprojects whose submodules are not pinned to the bleeding edge, which not all projects are. The standard submodule additions didn't work, nor did the script above. So I added a hash lookup for a matching tag ref; if no tag points at the commit, it falls back to a full clone.

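#!/bin/bash
# Shallow-clone each submodule at the tag matching its recorded commit;
# fall back to a full clone when no tag points at that commit.
git submodule init
git submodule | while read -r hash name junk; do
    # Assumes each submodule's name equals its path (git submodule prints the path).
    spath=$(git config -f .gitmodules --get "submodule.$name.path")
    surl=$(git config -f .gitmodules --get "submodule.$name.url")
    # ${hash:1} drops the first character of the hash field
    # (the "-" status prefix on an uninitialized submodule).
    # head -n 1 keeps a single tag when several point at the same commit.
    sbr=$(git ls-remote --tags "$surl" | sed -r "/${hash:1}/ s|^.*tags/([^^]+).*\$|\1|p;d" | head -n 1)
    if [ -z "$sbr" ]; then
        git clone "$surl" "$spath"
    else
        git clone -b "$sbr" --depth 1 --single-branch "$surl" "$spath"
    fi
done
git submodule update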
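
The tag lookup is the core of this script, so here it is as a standalone sketch. The function name tag_for_commit is hypothetical, and it uses awk with an exact match on the commit id instead of the sed expression above (a swap on my part, not the author's method).

```shell
#!/bin/bash
# Hypothetical helper: print the first tag on a remote that points at a given commit.
#   $1 = repository URL, $2 = full commit id
# Prints nothing when no tag matches.
tag_for_commit() {
    git ls-remote --tags "$1" |
    awk -v sha="$2" '
        # Appending "" forces a string comparison of the two ids
        ($1 "") == (sha "") {
            sub(/^refs\/tags\//, "", $2)   # drop the ref prefix
            sub(/\^[{][}]$/, "", $2)       # drop the "^{}" suffix of a peeled annotated tag
            print $2
            exit
        }'
}
```

For an annotated tag, git ls-remote lists the tag object and, on a separate line ending in ^{}, the commit it points to. Matching on the first column picks whichever line carries the commit id.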
握住你手 2024-08-26 15:36:22

A shallow clone of a submodule is ideal because a submodule is a snapshot at a particular revision/changeset. It's easy to download a zip of a given commit from the website, so I tried writing a script.

#!/bin/bash
git submodule deinit --all -f
for value in $(git submodule | perl -pe 's/.*(\w{40})\s([^\s]+).*/\1:\2/'); do
  mysha=${value%:*}
  mysub=${value#*:}
  myurl=$(grep -A2 -Pi "path = $mysub" .gitmodules | grep -Pio '(?<=url =).*/[^.]+')
  mydir=$(dirname $mysub)
  wget $myurl/archive/$mysha.zip
  unzip $mysha.zip -d $mydir
  test -d $mysub && rm -rf $mysub
  mv $mydir/*-$mysha $mysub
  rm $mysha.zip
done
git submodule init

git submodule deinit --all -f clears the submodule tree, which makes the script safe to re-run.

git submodule prints the 40-character sha1 followed by a path matching the one in .gitmodules. I use perl to join these two fields with a colon, then use parameter expansion to split the value into mysha and mysub.

These are the critical keys: we need the sha1 for the download, and the path to look up the matching url in .gitmodules.

Given a typical submodule entry:

[submodule "label"]
    path = localpath
    url = https://github.com/repository.git

myurl keys on path = and then looks at the 2 lines after it to get the value. This method may not work consistently and may require refinement. The url grep strips any trailing .git suffix by matching up to the last / and then everything up to the next '.'.

mydir is mysub minus the final /name, that is, the directory leading up to the submodule name.

Next is a wget using the URL format of the downloadable zip archive. This may change in future.

Unzip the file into mydir, the subdirectory specified in the submodule path. The resulting folder is named after the last element of the url, followed by -sha1.

Check whether the directory specified in the submodule path exists, and remove it so the extracted folder can be renamed.

mv renames the extracted folder containing our sha1 to its correct submodule path.

Delete the downloaded zip file.

Finally, run git submodule init.

This is more a WIP proof of concept rather than a solution. When it works, the result is a shallow clone of a submodule at a specified changeset.

Should the repository re-home a submodule to a different commit, re-run the script to update.

The only time a script like this would be useful is for non-collaborative local building of a source project.
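
The URL handling described above can be sketched as two small helpers. Both names (submodule_url_for_path and archive_url) are hypothetical. The first asks git config for the url instead of grepping a fixed number of lines after path =, which sidesteps the line-order fragility mentioned earlier. The second builds the /archive/<sha1>.zip address that the script passes to wget. That layout is taken from the script and, as noted, may change.

```shell
#!/bin/bash
# Hypothetical helper: print the url of the submodule whose path is $1.
# Works even when the submodule's name differs from its path.
# Assumes submodule names contain no whitespace.
submodule_url_for_path() {
    git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
    while read -r key path; do
        if [ "$path" = "$1" ]; then
            name="${key#submodule.}"   # strip the leading "submodule."
            name="${name%.path}"       # strip the trailing ".path"
            git config -f .gitmodules --get "submodule.$name.url"
        fi
    done
}

# Hypothetical helper: build the zip archive URL for commit $2 of repository $1.
archive_url() {
    # ${1%.git} removes a trailing ".git" if present
    printf '%s/archive/%s.zip\n' "${1%.git}" "$2"
}
```

With the sample entry shown earlier, archive_url "$(submodule_url_for_path localpath)" "$mysha" yields https://github.com/repository/archive/ followed by the sha1 and .zip.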

街角迷惘 2024-08-26 15:36:22

I needed a way to shallow-clone submodules in a situation where I have no control over how the main repo is cloned.
Based on one of the solutions above:

#!/bin/bash
git submodule init
for i in $(git submodule | sed -e 's/.* //'); do
    git submodule update --init --depth 1 -- $i
done
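
A variant sketch of the same loop that reads the submodule paths from .gitmodules via git config rather than parsing git submodule output with sed. As far as I can tell, the sed expression would pick up the trailing "(describe)" text on submodules that are already checked out. The function name shallow_update_submodules is hypothetical, and it assumes submodule names contain no whitespace.

```shell
#!/bin/bash
# Hypothetical helper: initialize and shallow-clone every submodule listed in .gitmodules.
shallow_update_submodules() {
    # --get-regexp prints "key value" pairs, one per line; the value is the path
    git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
    while read -r key path; do
        # </dev/null keeps git from consuming the loop's input
        git submodule update --init --depth 1 -- "$path" </dev/null
    done
}
```

As with the script above, --depth 1 only succeeds when the server can serve the recorded commit at that depth, typically when it is the tip of a branch.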