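Limiting file size in a git repository
I'm currently thinking of changing my VCS from subversion to git. Is it possible to limit the size of files committed to a git repository? For subversion, for example, there is a hook: http://www.davidgrant.ca/limit_size_of_subversion_commits_with_this_hook
In my experience people, especially inexperienced ones, sometimes tend to commit files that should not go into a VCS (e.g. big file system images).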
Comments (11)
As I was struggling with this for a while, even with the description, and I think it is relevant for others too, I thought I'd post an implementation of what J-16 SDiZ described.
So, my take is a server-side update hook that prevents overly large files from being pushed.
Note that it has been pointed out in the comments that this code only checks the latest commit, so it would need to be tweaked to iterate over the commits between $2 and $3 and check all of them.
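The hook code itself did not survive on this page. As a rough stand-in, here is a minimal sketch of an update hook along the lines described, not the poster's original script. The MAX_BYTES limit and the function names are assumptions made for illustration, and, as the note above says, it inspects only the tree of the pushed tip ($3).

```sh
#!/bin/sh
# Sketch of a server-side "update" hook (hooks/update in the bare repository).
# MAX_BYTES is an assumed limit, not a value from the original post.
MAX_BYTES=${MAX_BYTES:-10485760}   # 10 MiB

# Reads `git ls-tree -r -l` output on stdin and prints "<size> <path>" for
# every blob above the limit given as $1.
# Line format: <mode> <type> <object> <size>TAB<path>
oversized_entries() {
    awk -v limit="$1" -F '\t' '{
        split($1, meta, " ")    # meta[2] = object type, meta[4] = size
        if (meta[2] == "blob" && meta[4] + 0 > limit) print meta[4], $2
    }'
}

# Rejects the update if the tree of revision $1 contains an oversized blob.
check_tree() {
    found=$(git ls-tree --full-tree -r -l "$1" | oversized_entries "$MAX_BYTES")
    if [ -n "$found" ]; then
        echo "Push rejected, files larger than $MAX_BYTES bytes:" >&2
        echo "$found" >&2
        return 1
    fi
}

# git invokes the hook as: update <refname> <oldrev> <newrev>
if [ "$#" -eq 3 ]; then
    check_tree "$3" || exit 1
fi
```

Saved as hooks/update and made executable, this would refuse a push whose final tree holds a file above the limit; a large file that was added and then removed within the same push would still get through.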
The answers by eis and J-16 SDiZ suffer from a severe problem.
They only check the state of the final commit, $3 or $newrev. They also need to check what is being submitted in the other commits between $2 (or $oldrev) and $3 (or $newrev) in the update hook.
J-16 SDiZ is closer to the right answer.
The big flaw is that someone whose departmental server has this update hook installed to protect it will find out the hard way that, after using git rm to remove a big file that was accidentally checked in, only the current tree or last commit will be fine. The push will still pull in the entire chain of commits, including the big file that was deleted, creating a swollen, unhappy, fat history that nobody wants.
The solution is either to check each and every commit from $oldrev to $newrev, or to specify the entire range $oldrev..$newrev.
Be darn sure you are not just checking $newrev alone, or this will fail and leave massive junk in your git history, pushed out to share with others and difficult or impossible to remove after that.
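One way to cover the whole range $oldrev..$newrev, sketched here under stated assumptions, is to list every object the range introduces with git rev-list --objects and ask git cat-file --batch-check for each object's type and size. MAX_BYTES and the function names are made up for this example, and the sketch assumes an update to an existing branch (new branches and deletions need extra handling).

```sh
#!/bin/sh
# Sketch: size-check every blob introduced anywhere in <oldrev>..<newrev>,
# not just the blobs present in the final tree.
MAX_BYTES=${MAX_BYTES:-10485760}   # assumed limit of 10 MiB

# Reads "<type> <size> <path>" lines on stdin and prints "<size> <path>"
# for every blob above the limit given as $1.
blobs_over_limit() {
    awk -v limit="$1" '$1 == "blob" && $2 + 0 > limit {
        path = $0
        sub(/^[^ ]+ [^ ]+ /, "", path)   # drop type and size, keep the path
        print $2, path
    }'
}

# check_range <oldrev> <newrev>
check_range() {
    found=$(git rev-list --objects "$1..$2" |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        blobs_over_limit "$MAX_BYTES")
    if [ -n "$found" ]; then
        echo "Push rejected, the pushed history contains oversized files:" >&2
        echo "$found" >&2
        return 1
    fi
}

# git invokes the hook as: update <refname> <oldrev> <newrev>
if [ "$#" -eq 3 ]; then
    check_range "$2" "$3" || exit 1
fi
```

Because the check works on objects rather than on the tip's tree, a big file that was committed and later removed with git rm inside the same push is still reported.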
This one is pretty good:
You can configure the size in the server-side config file by adding:
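The linked hook and its config snippet are missing from this page, so the following only illustrates the general idea of a hook reading its limit from the server-side repository config. The key hooks.maxfilesize and the 5 MiB fallback are hypothetical choices for this sketch, not necessarily what the linked hook uses.

```sh
#!/bin/sh
# hooks.maxfilesize is a hypothetical key chosen for this sketch. Set it in the
# server-side repository with, for example:
#   git config hooks.maxfilesize 5m
# which writes a "[hooks]" section with "maxfilesize = 5m" to the config file.
DEFAULT_MAX_BYTES=5242880   # 5 MiB fallback, an assumed value

# Prints the configured limit in bytes, or the fallback if the key is unset.
max_file_size() {
    # --int makes git validate the value and expand k/m/g suffixes to bytes.
    git config --int --get hooks.maxfilesize 2>/dev/null || echo "$DEFAULT_MAX_BYTES"
}
```

A hook could then call limit=$(max_file_size) instead of hard-coding the threshold.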
If you are using gitolite you can also try VREF.
There is one VREF already provided by default (the code is in gitolite/src/VREF/MAX_NEWBIN_SIZE).
It is called MAX_NEWBIN_SIZE.
It works like this: in the repo's access rules you add a deny rule on VREF/MAX_NEWBIN_SIZE/1000 for the users it should apply to, where 1000 is an example threshold in bytes.
This VREF works like an update hook, and it will reject your push if one of the files you are about to push is larger than the threshold.
Yes, git has hooks as well (git hooks). But it kind of depends on the actual workflow you will be using.
If you have inexperienced users, it is much safer to pull from them than to let them push. That way, you can make sure they won't screw up the main repository.
I want to highlight another set of approaches that address this issue at the pull request stage: GitHub Actions and Apps. They don't stop large files from being committed to a branch, but if the files are removed prior to the merge, the resulting base branch will not have the large files in its history.
There's a recently developed action that checks the added file sizes (through the GitHub API) against a user-defined reference value: lfs-warning.
I've also personally hacked together a Probot app to screen for large file sizes in a PR (against a user-defined value), but it's much less efficient: sizeCheck
From what I have seen, it is going to be a very rare case that someone checks in a file of, say, 200 MB or more.
While you can prevent this with server-side hooks, much like you would in SVN (I'm not sure about client-side hooks, since you have to rely on each person having the hooks installed), you also have to take into account that in Git it is much, much easier to remove such a file or commit from the repository. You did not have that luxury in SVN, at least not in an easy way.
I am using gitolite and the update hook was already in use, so instead of the update hook I used the pre-receive hook. The script posted by Chriki worked fabulously, with the exception that the data is passed via stdin, so I made a one-line change:
(there may be a more elegant way to do that, but it works)
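The one-line change itself is not preserved here. The difference it addresses is that an update hook receives the ref name and both revisions as arguments, while pre-receive receives one "<oldrev> <newrev> <refname>" line per updated ref on stdin. A minimal sketch of that adaptation follows; process_updates and check_update are hypothetical names, and check_update is only a placeholder for the actual size check.

```sh
#!/bin/sh
# Sketch: drive an update-style check from a pre-receive hook.

# Placeholder check taking <oldrev> <newrev> <refname>; a real hook would
# inspect file sizes here and return non-zero to reject the update.
check_update() {
    echo "checking $3: $1..$2" >&2
}

# Reads "<oldrev> <newrev> <refname>" lines from stdin and runs the check
# named by $1 on each of them. Returns non-zero if any check failed.
process_updates() {
    status=0
    while read -r oldrev newrev refname; do
        "$1" "$oldrev" "$newrev" "$refname" || status=1
    done
    return "$status"
}

# In hooks/pre-receive the last line would be:
#   process_updates check_update || exit 1
```

Unlike the update hook, a failing pre-receive hook rejects the whole push, not just the offending ref.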
You need a solution that caters to the following scenarios.
This hook (https://github.com/mgit-at/git-max-filesize) deals with the above 2 cases and also seems to handle edge cases such as new branch pushes and branch deletes correctly.
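To show why new branch pushes and branch deletes are edge cases: for a new branch the hook receives an all-zero old revision, and for a delete an all-zero new revision, so a plain oldrev..newrev range does not work. The sketch below is one possible way to pick the commits to inspect, not a description of how the linked hook does it; the function names are assumptions.

```sh
#!/bin/sh
# Sketch: choose rev-list arguments for a ref update, covering new branches
# and branch deletions.

# True if $1 is an all-zero object id (40 zeros with SHA-1, 64 with SHA-256).
is_null_oid() {
    case "$1" in
        *[!0]*) return 1 ;;   # contains a non-zero character
        "")     return 1 ;;   # empty string is not an object id
        *)      return 0 ;;
    esac
}

# rev_range <oldrev> <newrev>: prints the arguments to pass to git rev-list.
rev_range() {
    if is_null_oid "$2"; then
        return 0                  # branch deletion: nothing new to inspect
    elif is_null_oid "$1"; then
        echo "$2 --not --all"     # new branch: commits no existing ref reaches
    else
        echo "$1..$2"             # ordinary update
    fi
}

# Usage inside a hook (left unquoted on purpose so the arguments split):
#   range=$(rev_range "$oldrev" "$newrev")
#   [ -n "$range" ] && git rev-list --objects $range
```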
Another way is to version a .gitignore, which will prevent any file with a certain extension from showing up in the status.
You can still have hooks as well (downstream or upstream, as suggested by the other answers), but at least every downstream repo can include that .gitignore to avoid adding .exe, .dll, .iso, ...
If you are using hooks, consider Git 2.42 (Q3 2023): some atoms that can be used in "--format=<format>" for "git ls-tree" were not supported by git ls-files, even though they were relevant in the context of the latter.
See commit 4d28c4f (23 May 2023) by ZheNing Hu (adlternative). (Merged by Junio C Hamano -- gitster -- in commit 32fe7ff, 13 Jun 2023)
git ls-files now documents the added atoms in its man page.
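Assuming Git 2.42 or later, where git ls-files --format accepts the objectsize atom, a client-side pre-commit check could read sizes straight from the index. This is a sketch only; MAX_BYTES and the function names are assumptions made for this example.

```sh
#!/bin/sh
# Sketch of a pre-commit size check; needs a Git whose ls-files --format
# understands %(objectsize) (assumed here: Git 2.42 or later).
MAX_BYTES=${MAX_BYTES:-10485760}   # assumed limit of 10 MiB

# Reads "<size> <path>" lines on stdin and prints those above the limit in $1.
# Entries without a blob size are reported as "-" and never match.
staged_over_limit() {
    awk -v limit="$1" '$1 + 0 > limit'
}

# Fails if any file recorded in the index is larger than MAX_BYTES.
check_index() {
    found=$(git ls-files --cached --format='%(objectsize) %(path)' |
        staged_over_limit "$MAX_BYTES")
    if [ -n "$found" ]; then
        echo "Commit rejected, files larger than $MAX_BYTES bytes:" >&2
        echo "$found" >&2
        return 1
    fi
}

# In .git/hooks/pre-commit the last line would be:
#   check_index || exit 1
```

Since client-side hooks are not installed by a clone, this complements a server-side check and does not replace it.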
You can use a hook, either a pre-commit hook (on the client) or an update hook (on the server). Do a git ls-files --cached (for pre-commit) or git ls-tree --full-tree -r -l $3 (for update) and act accordingly.
git ls-tree -l prints one line per entry in the form <mode> <type> <object> <size> <path>. Grab the fourth column, and it is the size. Use git ls-tree --full-tree -r -l HEAD | sort -k 4 -n -r | head -1 to get the largest file. Use cut to extract the size, if [ a -lt b ] to check it, etc.
Sorry, I think if you are a programmer, you should be able to do this yourself.