I have a situation with a relatively large git repository located in a virtual machine on an elderly, slow host on my local network where it takes quite a while to do the initial clone.
ravn@bamboo:~/git$ git clone gitosis@gitbox:git00
Initialized empty Git repository in /home/ravn/git/git00/.git/
remote: Counting objects: 89973, done.
remote: Compressing objects: 100% (26745/26745), done.
remote: Total 89973 (delta 50970), reused 85013 (delta 47798)
Receiving objects: 100% (89973/89973), 349.86 MiB | 2.25 MiB/s, done.
Resolving deltas: 100% (50970/50970), done.
Checking out files: 100% (11722/11722), done.
ravn@bamboo:~/git$
There are no git-specific configuration changes in gitosis.
Is there any way of speeding the receiving bit up to what the network is capable of?
I need the new repositories to be properly connected with the upstream repository. To my understanding, this requires git to do the cloning, and thus raw bit copying outside of git will not work.
Use the --depth option to create a shallow clone.
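A minimal demonstration of the idea, using a throwaway local repository (all names are illustrative; a file:// URL is used because --depth is ignored for plain local paths):

```shell
set -e
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

# Throwaway repository standing in for the slow remote.
git init -q origin-repo
for i in 1 2 3; do
  echo "$i" > origin-repo/f.txt
  git -C origin-repo add f.txt
  git -C origin-repo commit -qm "commit $i"
done

# --depth=1 fetches only the tip commit instead of the full history.
git clone -q --depth=1 "file://$PWD/origin-repo" shallow-clone
git -C shallow-clone rev-list --count HEAD   # prints 1
```

If you later need the full history, the shallow clone can be deepened with git fetch --unshallow.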
Dumb copy
As mentioned, you could just copy a repository with a 'dumb' file transfer.
This will certainly not waste time compressing, repacking, deltifying and/or filtering.
Plus, you will get an exact copy of the repository directory, including local configuration and hooks.
This may or may not be what you require, but it is nice to be aware of the fact.
Bundle
Git clone by default optimizes for bandwidth. Since git clone, by default, does not mirror all branches (see --mirror), it would not make sense to just dump the pack-files as-is (because that would possibly send way more than required). When distributing to a truly big number of clients, consider using bundles.
If you want a fast clone without the server-side cost, the git way is bundle create. You can now distribute the bundle without the server even being involved. If you mean that bundle ... --all includes more than a simple git clone, consider e.g. bundle ... master to reduce the volume, and distribute the snapshot bundle instead. That's the best of both worlds, while of course you won't get the extras mentioned above. On the receiving end, just clone from the bundle file.
Compression configs
You can look at lowering server load by reducing/removing compression.
Have a look at these config settings (I assume pack.compression may help you lower the server load). Given ample network bandwidth, this will in fact result in faster clones. Don't forget about git-repack -F when you decide to benchmark that!
The git clone --depth=1 ... suggested in 2014 will become faster in Q2 2019 with Git 2.22.
That is because, during an initial "git clone --depth=..." partial clone, it is pointless to spend cycles for a large portion of the connectivity check that enumerates and skips promisor objects (which by definition is all objects fetched from the other side).
This has been optimized out.
With Git 2.26 (Q1 2020), an unneeded connectivity check is now disabled in a partial clone when fetching into it.
See commit 2df1aa2, commit 5003377 (12 Jan 2020) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano -- gitster -- in commit 8fb3945, 14 Feb 2020)
And:
And, still with Git 2.26 (Q1 2020), the object reachability bitmap machinery and the partial cloning machinery were not prepared to work well together, because some object-filtering criteria that partial clones use inherently rely on object traversal, but the bitmap machinery is an optimization to bypass that object traversal.
See commit 20a5fd8 (18 Feb 2020) by Junio C Hamano (gitster).
See commit 3ab3185, commit 84243da, commit 4f3bd56, commit cc4aa28, commit 2aaeb9a, commit 6663ae0, commit 4eb707e, commit ea047a8, commit 608d9c9, commit 55cb10f, commit 792f811, commit d90fe06 (14 Feb 2020), and commit e03f928, commit acac50d, commit 551cf8b (13 Feb 2020) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 0df82d9, 02 Mar 2020)
Git 2.27 (Q2 2020) will simplify the commit ancestry connectedness check in a partial clone repository in which "promised" objects are assumed to be obtainable lazily on-demand from promisor remote repositories.
See commit 2b98478 (20 Mar 2020) by Jonathan Tan (jhowtan).
(Merged by Junio C Hamano -- gitster -- in commit 0c60105, 22 Apr 2020)
With Git 2.27 (Q2 2020), the object walk with object filter "--filter=tree:0" can now take advantage of the pack bitmap when available.
See commit 9639474, commit 5bf7f1e (04 May 2020) by Jeff King (peff).
See commit b0a8d48, commit 856e12c (04 May 2020) by Taylor Blau (ttaylorr).
(Merged by Junio C Hamano -- gitster -- in commit 69ae8ff, 13 May 2020)
With Git 2.32 (Q2 2021), handling of "promisor packs" that allows certain objects to be missing and lazily retrievable has been optimized (a bit).
See commit c1fa951, commit 45a187c, commit fcc07e9 (13 Apr 2021) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 13158b9, 30 Apr 2021)
An earlier optimization discarded a tree-object buffer that is still in use, which has been corrected with Git 2.38 (Q3 2022).
See commit 1490d7d (14 Aug 2022) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 01a30a5, 25 Aug 2022)
I'm benchmarking git clone.
It can be faster with the --jobs option if the project includes submodules.
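A runnable sketch with a throwaway superproject and submodule (the protocol.file.allow override is only needed because the demo uses local file:// submodules; all names are illustrative):

```shell
set -e
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

# Throwaway submodule and superproject.
git init -q sub
git -C sub commit -q --allow-empty -m "sub init"
git init -q super
git -C super -c protocol.file.allow=always \
    submodule --quiet add "file://$PWD/sub" libs/sub
git -C super commit -qm "add submodule"

# --jobs N fetches up to N submodules in parallel during the clone.
git -c protocol.file.allow=always clone -q --recurse-submodules --jobs 4 \
    "file://$PWD/super" superclone
git -C superclone/libs/sub rev-parse HEAD >/dev/null  # submodule cloned too
```

With a single tiny submodule the parallelism buys nothing, of course; the gain shows up on projects with many submodules.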
After realizing that the upper limit to the transfer speed of data is the ssh connection, which is established "outside" of git, I did some experiments and found that the upper limit of using pscp (PuTTY scp) was 3.0 MB/s when the blowfish encryption scheme was properly chosen. A control experiment with raw ftp showed a transfer speed of 3.1 MB/s, indicating that this was the upper bound of the network.
This runs inside a vmware hypervisor, and as the process doing network I/O utilized almost 100% CPU, it indicated that the bottleneck was the Ubuntu network card driver. I then found that even though vmware tools were installed, for some reason the kernel still used the vlance driver (emulating a 10 Mbps network card with IRQs and all) instead of the vmxnet driver (which speaks directly to the hypervisor). This now awaits a service window to be changed.
In other words, the problem was not with git but with the underlying "hardware".
From the log it seems you already finished the clone. If your problem is that you need to do this process multiple times on different machines, you can just copy the repository directory from one machine to the other. This way, each copy will preserve its relationship (remotes) with the repository you cloned from.
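A minimal illustration with throwaway paths, showing that a plain directory copy keeps the clone's remote wired up:

```shell
set -e
export GIT_AUTHOR_NAME=You GIT_AUTHOR_EMAIL=you@example.com
export GIT_COMMITTER_NAME=You GIT_COMMITTER_EMAIL=you@example.com

git init -q upstream
git -C upstream commit -q --allow-empty -m "init"

# One real clone...
git clone -q "$PWD/upstream" copy1
# ...then a plain file copy of that clone for another machine.
cp -R copy1 copy2

# The copy still knows its origin and can fetch from it as usual.
git -C copy2 remote          # prints origin
git -C copy2 fetch -q origin
```

Note that the copied origin URL must still be reachable from the destination machine (here it is an absolute local path; in practice it would be the ssh URL from the question).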