Git repository unique id

Posted on 2024-10-17 11:25:30 · 138 words · 5 views · 0 comments

I need to find out if a commit belongs to a particular git repository.

The idea is to generate some unique id for every repository I need to test.
Then I can compare this unique id with the id calculated from the commit under test.

For example, take the SHA of the initial changeset. Can it uniquely identify the repository?

Comments (6)

雨的味道风的声音 2024-10-24 11:25:30

The SHA1 key identifies content (of a blob or of a tree), not a repository.
If the content differs from repo to repo, then the histories have no common ancestor, so I don't think a changeset-based solution will work.

Maybe (not tested) you could add some marker (without having to change all the SHA1s) through git notes.
See for instance GitHub deploy-notes which uses this mechanism to track deployments.
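
A minimal sketch of the git notes idea, in the same untested spirit as the suggestion above. The notes ref name repo-id and both helper functions are hypothetical, and attaching the marker to a root commit is an assumption rather than something the answer specifies. Notes live under refs/notes/ and are not transferred by a default clone or fetch, so the marker only travels if that ref is pushed and fetched explicitly.

```shell
#!/bin/sh
# Hypothetical helpers: tag a repository by attaching a note to a root commit.
# "repo-id" is an assumed notes ref name, not a Git convention.

# mark_repo DIR ID: attach ID as a note on the last root commit listed for HEAD.
mark_repo() {
    root=$(git -C "$1" rev-list --max-parents=0 HEAD | tail -n 1)
    [ -n "$root" ] || return 1
    # -f overwrites an existing note; no commit hash changes.
    git -C "$1" notes --ref=repo-id add -f -m "$2" "$root"
}

# read_repo_id DIR: print the note stored by mark_repo, if any.
read_repo_id() {
    root=$(git -C "$1" rev-list --max-parents=0 HEAD | tail -n 1)
    [ -n "$root" ] || return 1
    git -C "$1" notes --ref=repo-id show "$root"
}
```

Because notes are stored outside the commit objects, adding one leaves every existing SHA1 untouched, which is the property the answer is after.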

提赋 2024-10-24 11:25:30

(moved from comment)

That's not possible if you don't have the parent of the particular commit already in your repository (in which case you can trivially answer the question). While the commit holds a reference to the parent and maintains the whole tree's integrity that way, you cannot reconstruct a commit just from the hash if you don't have that commit, so you can't find out that parent's parent and so on until you find a parent which actually is within your repository.
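
The trivial case mentioned above, where the commit is already in your repository, could be checked along these lines. This is a sketch with hypothetical helper names; it only looks at the local object store and local branches.

```shell
#!/bin/sh
# has_commit DIR SHA: succeed if SHA names a commit object present in DIR.
has_commit() {
    git -C "$1" cat-file -e "$2^{commit}" 2>/dev/null
}

# on_any_branch DIR SHA: succeed if SHA is reachable from at least one local branch.
on_any_branch() {
    [ -n "$(git -C "$1" branch --contains "$2" 2>/dev/null)" ]
}
```

has_commit can succeed while on_any_branch fails, for example for a commit that was fetched or orphaned but is not on any local branch.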

睫毛溺水了 2024-10-24 11:25:30

You can use git filter-branch to search for the commit you are looking for.

A hash of the initial commit does not give you much info about the repository itself. There's no way to uniquely identify a repository.

漫雪独思 2024-10-24 11:25:30

In Rietveld we cannot force everybody to use 'git notes' when people want to find reviews made against their repositories, so we are going to use the last hash from the output of git rev-list --parents HEAD.
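
A sketch of how that last hash could be extracted (the function names are hypothetical, and this is not Rietveld's actual code). The last line printed by git rev-list --parents HEAD belongs to a commit without parents, so it contains a single hash; --max-parents=0 asks for such root commits directly.

```shell
#!/bin/sh
# root_id DIR: print the hash of the last root commit listed for HEAD.
root_id() {
    git -C "$1" rev-list --max-parents=0 HEAD | tail -n 1
}

# root_id_via_parents DIR: the same id, taken from the last line of --parents output.
root_id_via_parents() {
    git -C "$1" rev-list --parents HEAD | tail -n 1 | awk '{ print $NF }'
}
```

A history with several roots (for example after merging unrelated repositories) has more than one candidate, so an id derived this way depends on which root happens to be listed last.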

森末i 2024-10-24 11:25:30

Compare with Mercurial, which performs this check in mercurial/treediscovery.py (Mercurial repository identification):

base = list(base)
if base == [nullid]:
    if force:
        repo.ui.warn(_("warning: repository is unrelated\n"))
    else:
        raise util.Abort(_("repository is unrelated"))

The base variable stores the last common parts of the two repositories.

Git makes the same assumption when it emits "warning: no common commits" on fetch/push. I just didn't grep the Git sources, since that takes time.

Following the idea behind Mercurial's push/pull check, we may assume that repositories are related if they have common roots. For Mercurial this means that the hashes printed by the command:

$ hg log -r "roots(all())"

must have a non-empty intersection across the two repositories.

You cannot trick the roots check by carefully crafting repositories, because building two repositories that look like this (with common parts but different roots):

0 <--- SHA-256-XXX <--- SHA-256-YYY <--- SHA-256-ZZZ
0 <--- SHA-256-YYY <--- SHA-256-ZZZ

is impossible in practice: it would mean breaking the hash function, as each subsequent hash depends on the previous values. That is true for both Mercurial and Git.

The corresponding command to see the roots in Git is:

$ git log --format=oneline --all --max-parents=0

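You can try it yourself (in the session below, md, git ci and git co are presumably the author's own shell and Git aliases for mkdir plus cd, git commit and git checkout):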

bash# md git
/home/user/tmp/git

bash# md one
/home/user/tmp/git/one

bash# git init
Initialized empty Git repository in /home/user/tmp/git/one/.git/

bash# echo x1 > x1
bash# git add x1
bash# git ci -m x1
[master (root-commit) 1208fb0] x1

bash# echo x2 > x2
bash# git add x2
bash# git ci -m x2
[master 1c3fe86] x2

bash# cd ..

bash# md two
/home/user/tmp/git/two

bash# git init
Initialized empty Git repository in /home/user/tmp/git/two/.git/

bash# echo y1 > y1
bash# git add y1
bash# git ci -m y1
[master (root-commit) ff56a8e] y1

bash# echo y2 > y2
bash# git add y2
bash# git ci -m y2
[master 18adff5] y2

bash# git fetch ../one/
warning: no common commits
remote: Counting objects: 6, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 6 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (6/6), done.
From ../one
 * branch            HEAD       -> FETCH_HEAD

bash# git co --orphan one
Switched to a new branch 'one'

bash# git merge FETCH_HEAD

bash# git log --format=oneline --all
18adff541c7ce9f1a1f2be2804d6d0e5792ff086 y2
ff56a8e7e9145d2b1b5a760bbc9b12451927ab0c y1
1c3fe8665851e89d37f49633cd2478900217b91c x2
1208fb0f721005207c6afe6a549a9ed0dcc5b0a8 x1

bash# git log --format=oneline --all --max-parents=0
ff56a8e7e9145d2b1b5a760bbc9b12451927ab0c y1
1208fb0f721005207c6afe6a549a9ed0dcc5b0a8 x1

bash# git log --all --graph

* commit 18adff541c7ce9f1a1f2be2804d6d0e5792ff086
|     y2
|  
* commit ff56a8e7e9145d2b1b5a760bbc9b12451927ab0c
      y1

* commit 1c3fe8665851e89d37f49633cd2478900217b91c
|     x2
|  
* commit 1208fb0f721005207c6afe6a549a9ed0dcc5b0a8
      x1

NOTE: Git allows partial checkouts. I didn't check how --max-parents=0 behaves in that case.
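
The common-roots test described above could be scripted roughly as follows. This is a sketch with hypothetical function names, and it assumes both repositories are full clones available on the local file system. As far as I know, a shallow clone reports the commits at its shallow boundary as parentless, so the check would not be reliable there.

```shell
#!/bin/sh
# roots DIR: print the sorted hashes of all root commits reachable from any ref.
roots() {
    git -C "$1" rev-list --all --max-parents=0 | sort
}

# related DIR_A DIR_B: succeed if the two repositories share at least one root commit.
related() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    roots "$1" > "$a"
    roots "$2" > "$b"
    # comm -12 keeps only the lines present in both sorted lists.
    common=$(comm -12 "$a" "$b")
    rm -f "$a" "$b"
    [ -n "$common" ]
}
```

For the one and two repositories from the session above, related would fail before the fetch, and succeed once two has a ref pointing at the history fetched from one.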

梦开始←不甜 2024-10-24 11:25:30

When I have write access to a repo, I find it useful to generate a random uuid and store it in a .gituuid file, which is also committed:

uuidgen > .gituuid
git add .gituuid
git commit -m "Add: git uuid" .gituuid

This solves globally how to uniquely identify a repo, but the answer is only relevant if you have write permission.

Note: I have some other scripts that track those git uuids and let me locate the associated repos on my file system. But that is out of scope.
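
For illustration, reading the id back could look like this. It is a sketch, not the tracking scripts mentioned above, and the function names are hypothetical. It reads .gituuid from a commit rather than from the working tree, so it also works on a bare repository.

```shell
#!/bin/sh
# repo_uuid DIR [REV]: print the .gituuid stored in REV (default HEAD), if present.
repo_uuid() {
    git -C "$1" show "${2:-HEAD}:.gituuid" 2>/dev/null
}

# same_repo DIR_A DIR_B: succeed if both repositories carry the same non-empty uuid.
same_repo() {
    ua=$(repo_uuid "$1") && ub=$(repo_uuid "$2") && [ -n "$ua" ] && [ "$ua" = "$ub" ]
}
```

Commits made before the .gituuid file was added do not contain it, so repo_uuid prints nothing for those revisions.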
