Is it possible to remotely count the objects and size of a git repository?

Assume that a public git repository exists somewhere on the web. I want to clone it, but first I need to know how big it is (how many objects and kilobytes, as reported by git count-objects).

Is there a way to do that?
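
For reference (this block is an illustration, not part of the original question), git count-objects is the local command whose numbers the question wants to obtain remotely:

# -v breaks the totals down: count/size are loose objects and their size
# in KiB; in-pack/size-pack are packed objects and the pack size in KiB.
git count-objects -v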


戒ㄋ 2024-09-09 09:06:14


One little kludge you could use would be the following:

mkdir repo-name
cd repo-name
git init
git remote add origin <URL of remote>
git fetch origin

git fetch displays feedback along these lines:

remote: Counting objects: 95815, done.
remote: Compressing objects: 100% (25006/25006), done.
remote: Total 95815 (delta 69568), reused 95445 (delta 69317)
Receiving objects: 100% (95815/95815), 18.48 MiB | 16.84 MiB/s, done.
...

The steps on the remote end generally happen pretty fast; it's the receiving step that can be time-consuming. It doesn't actually show the total size, but you can certainly watch it for a second, and if you see "1% ... 23.75 GiB" you know you're in trouble, and you can cancel it.
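
If you let the fetch run to completion, the object database is now local, so you can get the exact numbers the question asks about without ever checking out a working copy:

# Run inside repo-name once the fetch has finished; the downloaded
# pack sits under .git/objects/pack.
git count-objects -v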


[ update 21 Sep 2021 ]
It seems the link now redirects to another URL, so we need to add -L to curl to follow the redirect.

curl -sL https://api.github.com/repos/Marijnh/CodeMirror | grep size


[ Old answer ]
For GitHub repositories, the API now reports the repository size. It works!

This link: see-the-size-of-a-github-repo-before-cloning-it gives the answer.

Command (answer from @VMTrooper):

curl https://api.github.com/repos/$2/$3 | grep size

Example:

curl https://api.github.com/repos/Marijnh/CodeMirror | grep size
 % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                Dload  Upload   Total   Spent    Left  Speed
100  5005  100  5005    0     0   2656      0  0:00:01  0:00:01 --:--:--  2779
"size": 28589,
仙女山的月亮 2024-09-09 09:06:14


Doesn't give the object count, but if you use the Google Chrome browser and install this extension, it adds the repo size to the repository's home page:

GitHub Repo Size extension screenshot

初懵 2024-09-09 09:06:14


I think there are a couple of problems with this question: git count-objects
doesn't truly represent the size of a repository (even git count-objects -v
doesn't, really); if you're using anything other than the dumb http transport, a
new pack will be created for your clone when you make it; and (as VonC pointed
out) anything you do to analyze a remote repo won't take into account the
working copy size.

That being said, if they are using the dumb http transport (github, for example,
is not), you could write a shell script that used curl to query the sizes of all
the objects and packs. That might get you closer, but it's making more http
requests that you'll just have to make again to actually do the clone.
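
A rough sketch of that idea, assuming the server really does serve the dumb protocol as plain files ($url below is a placeholder): the pack list is published under objects/info/packs, and a HEAD request per pack reads its Content-Length without downloading anything:

# Dumb HTTP only: git update-server-info keeps objects/info/packs current,
# one "P pack-<sha>.pack" line per pack.
url=https://example.com/repo.git
curl -s "$url/objects/info/packs" |
  awk '/^P /{print $2}' |
  while read pack; do
    curl -sI "$url/objects/pack/$pack" |
      awk 'tolower($1) == "content-length:" {print $2}'
  done
# Loose objects are not counted here; they would have to be walked one by one.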

It is possible to figure out what git-fetch would send across the wire (to a
smart http transport) and send that to analyze the results, but it's not really
a nice thing to do. Essentially you're asking the target server to pack up
results that you're just going to download and throw away, so that you can
download them again to save them.

Something like these steps can be used to this effect:

url=https://github.com/gitster/git.git
# Ask the server for its refs and build a git-upload-pack request that
# "want"s HEAD, master, and every tag.
git ls-remote $url |
  grep '[[:space:]]\(HEAD\|refs/heads/master\|refs/tags\)' |
  grep -v '\^{}' |
  awk '{print "0032want " $1}' > binarydata
# Terminate the want list with a flush packet (0000) and a "done" pkt-line.
echo 00000009done >> binarydata
# POST the request and count the bytes of the pack the server streams back.
curl -s -X POST --data-binary @binarydata \
  -H "Content-Type: application/x-git-upload-pack-request" \
  -H "Accept-Encoding: deflate, gzip" \
  -H "Accept: application/x-git-upload-pack-result" \
  -A "git/1.7.9" $url/git-upload-pack | wc -c
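
For reference (an aside, not part of the original answer): the 0032 and 0009 prefixes are pkt-line length headers; each line of the request starts with its own total length in four hex digits, and 0000 is the flush packet that ends the want list. The arithmetic checks out:

# "0032want <sha>\n": 4 (header) + 5 ("want ") + 40 (hex SHA-1) + 1 (newline)
printf '%04x\n' $(( 4 + 5 + 40 + 1 ))   # prints 0032
# "0009done\n": 4 (header) + 4 ("done") + 1 (newline)
printf '%04x\n' $(( 4 + 4 + 1 ))        # prints 0009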

At the end of all of this, the remote server will have packed up master/HEAD and
all the tags for you and you will have downloaded the entire pack file just to
see how big it will be when you download it during your clone.

When you finally do a clone, the working copy will be created as well, so the
entire directory will be larger than these commands spit out, but the pack file
generally is the largest part of a working copy with any significant history.

耳根太软 2024-09-09 09:06:14


Not that I know of:
Git is not a server; by default nothing is listening for requests (unless you activate a gitweb or a gitolite layer).
And the command "git remote ..." deals with the local (fetched) copy of a remote repo.

So unless you fetch something, or clone --bare a remote repo (which does not check out the files, so you only have the Git database), you won't have an idea of its size.
And that does not include the size of the working directory, once checked out.
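
A minimal sketch of that last option (the URL is just an example): a bare clone still downloads everything, but skips the checkout, so what lands on disk is the object database alone:

git clone --bare https://github.com/gitster/git.git repo.git
cd repo.git
# Exact object and kilobyte counts, as in the question
git count-objects -v
# Total on-disk size of the database
du -sh .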
