Get a list of all branches in a GitHub organisation using Bash without triggering the rate limit?
While trying to build a list of incoming GitHub commits, I ran into the GitHub API rate limit of 60 calls per hour (the limit for unauthenticated requests). As explained in this answer, one can get the list of branches of a repository with an API call to:
https://api.github.com/repos/{username}/{repo-name}/branches
However, that triggers the rate limit for the average GitHub organisation/user. So I thought I'd try a different approach, using the RSS/Atom format. However, as that same answer explains, the Atom/RSS feeds seem to require that the user already has a list of all branches in a repository. This question asks for an overview of all commits in a repository, yet the answer it received only covers the commits in the default branch of the repository. And this question received a working answer that still triggers the rate limit, as it relies on at least 1 API call per repository.
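To see how close a script is to the limit, the response headers of a REST call can be inspected. The following is a minimal sketch, assuming the `x-ratelimit-remaining` header that GitHub sends with REST responses; the function name `remaining_calls` is made up for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads HTTP response headers on stdin and prints the
# value of the x-ratelimit-remaining header (matched case-insensitively).
remaining_calls() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "x-ratelimit-remaining" { print $2 }'
}

# Usage (needs network access):
#   curl -sI https://api.github.com/rate_limit | remaining_calls
```

The usage line queries the `/rate_limit` endpoint, which, as far as I know, does not itself count against the limit.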
Hence, I would like to ask: How could one get a list of all branches of a GitHub user, using at most 1 GitHub API call?
Note that using Atom views would be perfectly fine. However, I have not found an Atom view like https://github.com/:owner/:repo/commits.atom
or https://github.com/:owner/:repo/branches.atom
that displays all branches in a repository. I would strongly prefer a solution that does not rely on a third party like https://rsshub.app/github/repos/yanglr, as I imagine they too will start rate-limiting at some point.
My current approach is to scrape the source code of https://github.com/:user/:repo/branches
using Bash. However, I imagine a more efficient solution might exist.
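One possible alternative to scraping the HTML page, which also avoids the REST API, is to ask the Git server for its branch refs with `git ls-remote --heads`. This is only a sketch: the helper name `heads_to_branches` is hypothetical, it still needs one request per repository, and I have not verified whether GitHub throttles such requests separately.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turns `git ls-remote --heads` output
# ("<sha><TAB>refs/heads/<branch>") on stdin into one branch name per line.
heads_to_branches() {
  awk -F'\t' '{ sub("^refs/heads/", "", $2); print $2 }'
}

# Usage (needs git and network access; OWNER and REPO are placeholders):
#   git ls-remote --heads https://github.com/OWNER/REPO.git | heads_to_branches
```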
MWE
Thanks to the comments, I was able to find a Bash MWE that performs a GraphQL query from the terminal. It is given in this answer, where bearer is not a variable but the means of identification, and the ...... should be your personal GitHub access token. I am currently looking into how to get the repositories beyond the first hundred. Then I'll look at how to get the branches of those repositories.
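Getting past the first hundred repositories most likely comes down to cursor pagination: a GraphQL connection exposes `pageInfo { hasNextPage endCursor }`, and the next page is requested with `after: <endCursor>`. Below is a rough sketch of that loop, not a tested solution. It assumes `jq` and `curl` are installed and that an environment variable `GITHUB_TOKEN` (a name chosen here) holds the access token; the function names are hypothetical. It needs one call per 100 repositories, so it does not meet the "at most 1 call" goal for larger accounts.

```shell
#!/usr/bin/env bash
# Sketch of cursor pagination against the GitHub GraphQL API.
# Assumptions: jq and curl are installed; GITHUB_TOKEN holds an access token.

PAGE_QUERY='query($login: String!, $cursor: String) {
  repositoryOwner(login: $login) {
    repositories(first: 100, after: $cursor) {
      pageInfo { hasNextPage endCursor }
      nodes { nameWithOwner }
    }
  }
}'

# Builds the JSON request body; an empty cursor becomes null (first page).
build_payload() {
  local login=$1 cursor=${2:-}
  jq -n --arg q "$PAGE_QUERY" --arg login "$login" --arg cursor "$cursor" \
    '{query: $q,
      variables: {login: $login,
                  cursor: (if $cursor == "" then null else $cursor end)}}'
}

# Sends one request; kept separate so it can be swapped out for offline tests.
run_query() {
  curl -s -X POST \
    -H "Content-Type: application/json" \
    -H "Authorization: bearer $GITHUB_TOKEN" \
    --data "$1" \
    https://api.github.com/graphql
}

# Prints nameWithOwner for every repository of the given login, page by page.
list_all_repos() {
  local login=$1 cursor="" response more
  while :; do
    response=$(run_query "$(build_payload "$login" "$cursor")")
    jq -r '.data.repositoryOwner.repositories.nodes[].nameWithOwner' <<<"$response"
    more=$(jq -r '.data.repositoryOwner.repositories.pageInfo.hasNextPage' <<<"$response")
    [ "$more" = "true" ] || break
    cursor=$(jq -r '.data.repositoryOwner.repositories.pageInfo.endCursor' <<<"$response")
  done
}

# Usage (needs network access):
#   GITHUB_TOKEN=your_token list_all_repos somegithubuser
```

Passing the login and cursor as GraphQL variables avoids editing the query text between pages.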
Attempt I
The following query, saved as examplequery.gql, yields JSON with the repositories of a user and the first 4 branches in each repository:
query {
  repositoryOwner(login: "somegithubuser") {
    repositories(first: 40) {
      edges {
        node {
          nameWithOwner
          refs(
            refPrefix: "refs/heads/"
            orderBy: { direction: DESC, field: TAG_COMMIT_DATE }
            first: 4
          ) {
            edges {
              node {
                ... on Ref {
                  name
                }
              }
            }
          }
        }
      }
    }
  }
}
Next, a Bash script runs the query:
#!/usr/bin/env bash
# Runs a GraphQL query on GitHub. Execute with:
#   ./run_graphql_query.sh examplequery.gql
GITHUB_PERSONAL_ACCESS_TOKEN_GLOBAL="your_github_personal_access_token"

# Require exactly one argument: the path to the query file.
if [ $# -ne 1 ]; then
  echo "Usage: $0 <query-file.gql>" >&2
  exit 1
fi
if [ ! -f "$1" ]; then
  echo "Error: query file '$1' not found." >&2
  exit 1
fi

# Form the query JSON; jq escapes newlines, so the file is passed as-is.
QUERY=$(jq -n \
  --arg q "$(cat "$1")" \
  '{ query: $q }')

# POST the query to the GraphQL endpoint and print the JSON response.
curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: bearer $GITHUB_PERSONAL_ACCESS_TOKEN_GLOBAL" \
  --data "$QUERY" \
  https://api.github.com/graphql
It can be run with:
./run_graphql_query.sh examplequery.gql
Two issues remain before I can answer the question: how to iterate over all repositories instead of only the first 100, and how to parse the JSON into a list of branches per repository.
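For the parsing part, a `jq` filter may be enough. The sketch below assumes the response has exactly the shape requested by the query above (`data.repositoryOwner.repositories.edges[].node` with nested `refs.edges[].node.name`); the function name `branches_per_repo` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the GraphQL response on stdin and prints one
# "<owner/repo><TAB><branch>" line per branch found in the response.
branches_per_repo() {
  jq -r '
    .data.repositoryOwner.repositories.edges[].node
    | .nameWithOwner as $repo
    | .refs.edges[].node.name
    | [$repo, .] | @tsv
  '
}

# Usage (needs network access and a valid token in the script):
#   ./run_graphql_query.sh examplequery.gql | branches_per_repo
```

Repositories without branches produce no output lines, and only the branches actually returned by the query (here the first 4 per repository) are listed.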