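Is there a way to limit the amount of memory that "git gc" uses?
I'm hosting a git repository on a shared host. My repository necessarily contains a couple of very large files, and every time I try to run "git gc" on it, my process gets killed by the shared hosting provider for using too much memory. Is there a way to limit the amount of memory that git gc can consume? My hope is that it can trade memory usage for speed and simply take longer to do its work.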
Comments (5)
Git repack's memory use is `(pack.deltaCacheSize + pack.windowMemory) × pack.threads`. Respective defaults are 256MiB, unlimited, nproc.
The delta cache isn't useful: most of the time is spent computing deltas on a sliding window, the majority of which are discarded; caching the survivors so they can be reused once (when writing) won't improve the runtime. That cache also isn't shared between threads.
By default the window memory is limited through `pack.window` (`gc.aggressiveWindow`). Limiting packing that way is a bad idea, because the working set size and efficiency will vary widely. It's best to raise both to much higher values and rely on `pack.windowMemory` to limit the window size.
Finally, threading has the disadvantage of splitting the working set. Lowering `pack.threads` and increasing `pack.windowMemory` so that the total stays the same should improve the run time.
repack has other useful tunables (`pack.depth`, `pack.compression`, the bitmap options), but they don't affect memory use.
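As a rough sketch of how the formula above could be applied, the snippet below sets the three memory-related options in a throwaway repository and computes the resulting bound. The sizes (32m, 256m, a window of 100) and the temporary repository are assumptions chosen for illustration, not values taken from the answer.

```shell
#!/bin/sh
# Illustrative only: a throwaway repository and assumed sizes.
repo=$(mktemp -d)
git init -q "$repo"

# One thread, a small delta cache, and a byte cap on the window.
git -C "$repo" config pack.threads 1
git -C "$repo" config pack.deltaCacheSize 32m
git -C "$repo" config pack.windowMemory 256m
# Assumed "much higher" object-count window (Git's default is 10),
# so that pack.windowMemory is what actually limits the window.
git -C "$repo" config pack.window 100

# Read the sizes back as plain integers (--int expands the k/m/g suffixes)
# and apply (pack.deltaCacheSize + pack.windowMemory) x pack.threads.
cache=$(git -C "$repo" config --int pack.deltaCacheSize)
window=$(git -C "$repo" config --int pack.windowMemory)
threads=$(git -C "$repo" config --int pack.threads)
bound_mib=$(( (cache + window) * threads / 1024 / 1024 ))
echo "approximate repack bound: ${bound_mib} MiB"
```

With these assumed values the bound works out to 288 MiB. As another answer points out, a single very large object can still push real usage above this figure.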
You could turn off the `delta` attribute to disable delta compression for just the blobs of those pathnames. The entries go in `foo/.git/info/attributes` (or `foo.git/info/attributes` if it is a bare repository); see the `delta` entry in gitattributes, and see gitignore for the pattern syntax.
This will not affect clones of the repository. To affect other repositories (i.e. clones), put the attributes in a `.gitattributes` file instead of (or in addition to) the `info/attributes` file.
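A minimal sketch of what such attribute entries might look like, and one way to check that they take effect. The patterns (`*.iso`, `/assets/big/*`) and the throwaway repository are hypothetical; substitute the paths of your own large files.

```shell
#!/bin/sh
# Illustrative only: throwaway repository and hypothetical patterns.
repo=$(mktemp -d)
git init -q "$repo"
mkdir -p "$repo/.git/info"

# "-delta" unsets the delta attribute for every path the pattern matches.
printf '%s\n' '*.iso -delta' '/assets/big/* -delta' >> "$repo/.git/info/attributes"

# git check-attr reports how the attribute resolves for a given path;
# the path does not have to exist in the working tree.
attr_iso=$(git -C "$repo" check-attr delta -- image.iso)
attr_txt=$(git -C "$repo" check-attr delta -- notes.txt)
echo "$attr_iso"
echo "$attr_txt"
```

The first path should be reported as `unset` (no delta compression will be attempted) and the second as `unspecified` (default behaviour).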
Git 2.18 (Q2 2018) will improve the gc memory consumption.
Before 2.18, "`git pack-objects`" needed to allocate tons of "`struct object_entry`" while doing its work: shrinking its size helps the performance quite a bit. This influences `git gc`.
See commit f6a5576, commit 3b13a5f, commit 0aca34e, commit ac77d0c, commit 27a7d06, commit 660b373, commit 0cb3c14, commit 898eba5, commit 43fa44f, commit 06af3bb, commit b5c0cbd, commit 0c6804a, commit fd9b1ba, commit 8d6ccce, commit 4c2db93 (14 Apr 2018) by Nguyễn Thái Ngọc Duy (`pclouds`).
(Merged by Junio C Hamano -- `gitster` -- in commit ad635e8, 23 May 2018)
With Git 2.20 (Q4 2018), it will be easier to check that an object that exists in one fork is not made into a delta against another object that does not appear in the same forked repository.
See commit fe0ac2f, commit 108f530, commit f64ba53 (16 Aug 2018) by Christian Couder (`chriscool`).
Helped-by: Jeff King (`peff`), and Duy Nguyen (`pclouds`).
See commit 9eb0986, commit 16d75fa, commit 28b8a73, commit c8d521f (16 Aug 2018) by Jeff King (`peff`).
Helped-by: Jeff King (`peff`), and Duy Nguyen (`pclouds`).
(Merged by Junio C Hamano -- `gitster` -- in commit f3504ea, 17 Sep 2018)
Note that Git 2.21 (Feb. 2019) fixes a small bug: "`git pack-objects`" incorrectly used an uninitialized mutex, which has been corrected.
See commit edb673c, commit 459307b (25 Jan 2019) by Patrick Hogg.
Helped-by: Junio C Hamano (`gitster`).
(Merged by Junio C Hamano -- `gitster` -- in commit d243a32, 05 Feb 2019)
Git 2.21 (Feb. 2019) also finds another way to shrink the size of the pack, with "`git pack-objects`" learning another algorithm to compute the set of objects to send, one that trades the resulting packfile off to save traversal cost to favor small pushes.
The config pack documentation now includes:
See "`git push` is very slow for a huge repo" for a concrete illustration.
Note: as commented in Git 2.24, a setting like `pack.useSparse` is still experimental.
See commit aaf633c, commit c6cc4c5, commit ad0fb65, commit 31b1de6, commit b068d9a, commit 7211b9e (13 Aug 2019) by Derrick Stolee (`derrickstolee`).
(Merged by Junio C Hamano -- `gitster` -- in commit f4f8dfe, 09 Sep 2019)
With Git 2.26 (Q1 2020), the way "`git pack-objects`" reuses objects stored in an existing pack to generate its result has been improved.
See commit d2ea031, commit 92fb0db, commit bb514de, commit ff48302, commit e704fc7, commit 2f4af77, commit 8ebf529, commit 59b2829, commit 40d18ff, commit 14fbd26 (18 Dec 2019), and commit 56d9cbe, commit bab28d9 (13 Sep 2019) by Jeff King (`peff`).
(Merged by Junio C Hamano -- `gitster` -- in commit a14aebe, 14 Feb 2020)
With Git 2.34 (Q4 2021), `git repack` itself (used by `git gc`) benefits from a reduced memory usage.
See commit b017334, commit a9fd2f2, commit a241878 (29 Aug 2021) by Taylor Blau (`ttaylorr`).
(Merged by Junio C Hamano -- `gitster` -- in commit 9559de3, 10 Sep 2021)
The interaction between the `fetch.negotiationAlgorithm` and `feature.experimental` configuration variables has been corrected with Git 2.36 (Q2 2022).
See commit 714edc6, commit a9a136c, commit a68c5b9 (02 Feb 2022) by Elijah Newren (`newren`).
(Merged by Junio C Hamano -- `gitster` -- in commit 70ff41f, 16 Feb 2022)
`git config` now includes in its man page:
Before Git 2.44 (Q1 2024), streaming spans of packfile data was only done from a single, primary pack in a repository with multiple packfiles.
Git 2.44 extends this to allow reuse from other packfiles, too. That can influence `git gc`.
See commit ba47d88, commit af626ac, commit 9410741, commit 3bea0c0, commit 54393e4, commit 519e17f, commit dbd5c52, commit e1bfe30, commit b1e3333, commit ed9f414, commit b96289a, commit ca0fd69, commit 4805125, commit 073b40e, commit d1d701e, commit 5e29c3f, commit 83296d2, commit 35e156b, commit e5d48bf, commit dab6093, commit 307d75b, commit 5f5ccd9, commit fba6818, commit a96015a, commit 6cdb67b, commit 66f0c71 (14 Dec 2023) by Taylor Blau (`ttaylorr`).
(Merged by Junio C Hamano -- `gitster` -- in commit 0fea6b7, 12 Jan 2024)
`git config` now includes in its man page:
With Git 2.44 (Q1 2024), rc1, setting `feature.experimental` opts the user into the multi-pack reuse experiment.
See commit 23c1e71, commit 7c01878 (05 Feb 2024) by Taylor Blau (`ttaylorr`).
(Merged by Junio C Hamano -- `gitster` -- in commit 3b89ff1, 12 Feb 2024)
`git config` now includes in its man page:
I used the instructions from this link. It is the same idea that Charles Bailey suggested.
A copy of the commands is here:
This worked for me on HostGator with a shared hosting account.
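As a sketch of the same idea (not the commands from the linked page), the limits can also be passed for a single run with `git -c`, which leaves the stored configuration untouched. The values (one thread, 100m window memory, 32m delta cache) and the throwaway repository are assumptions for illustration.

```shell
#!/bin/sh
# Illustrative only: throwaway repository, assumed limits.
repo=$(mktemp -d)
git init -q "$repo"
echo "sample" > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "initial commit"

# -c overrides apply to this invocation only, including the repack that
# git gc runs on its behalf.
git -C "$repo" \
    -c pack.threads=1 \
    -c pack.windowMemory=100m \
    -c pack.deltaCacheSize=32m \
    gc --quiet
gc_status=$?

# Show how many objects ended up in packs.
git -C "$repo" count-objects -v
```

If a set of values works on the host, the same keys can be stored permanently with `git config` so that later runs pick them up automatically.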
Yes, have a look at the help page for `git config` and look at the `pack.*` options, specifically `pack.depth`, `pack.window`, `pack.windowMemory` and `pack.deltaCacheSize`.
It's not a totally exact size, as git needs to map each object into memory, so one very large object can cause a lot of memory usage regardless of the window and delta cache settings.
You may have better luck packing locally and transferring pack files to the remote side "manually", adding a `.keep` file so that the remote git doesn't ever try to completely repack everything.
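The "pack locally, copy, and mark with `.keep`" approach could look roughly like the sketch below. To stay self-contained it uses a second local bare repository as a stand-in for the remote and plain `cp` for the transfer; in practice the copy would go over something like scp or rsync. All paths and names are hypothetical.

```shell
#!/bin/sh
# Illustrative only: "remote.git" is a local stand-in for the shared host.
work=$(mktemp -d)
git init -q "$work/local"
echo "sample" > "$work/local/file.txt"
git -C "$work/local" add file.txt
git -C "$work/local" -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "initial commit"
head_commit=$(git -C "$work/local" rev-parse HEAD)

# Do the memory-hungry packing on the local machine.
git -C "$work/local" repack -a -d -q

git init -q --bare "$work/remote.git"
mkdir -p "$work/remote.git/objects/pack"

# Copy each pack with its index, then add a matching .keep file so that
# repacking on the remote side leaves this pack alone.
for pack in "$work/local/.git/objects/pack/"pack-*.pack; do
    base=${pack%.pack}
    cp "$pack" "$base.idx" "$work/remote.git/objects/pack/"
    touch "$work/remote.git/objects/pack/$(basename "$base").keep"
done

# The commit is now readable on the "remote" side.
git -C "$work/remote.git" cat-file -t "$head_commit"
```

This only covers the object transfer; refs on the remote still have to be updated separately.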