How does Git(Hub) handle possible collisions from short SHAs?

Posted on 2024-11-30 07:13:41

Both Git and GitHub display short versions of SHAs -- just the first 7 characters instead of all 40 -- and both Git and GitHub support taking these short SHAs as arguments.

E.g. git show 962a9e8

E.g. https://github.com/joyent/node/commit/962a9e8

Given that the possibility space is now orders of magnitude lower, "just" 268 million, how do Git and GitHub protect against collisions here? And how do they handle them?
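To put a rough number on that worry, here is a minimal sketch of the birthday approximation for 7-digit prefixes. It assumes object IDs are uniformly distributed, and the function name is only illustrative; it says nothing about how Git or GitHub work internally.

```python
import math


def collision_probability(num_objects: int, hex_digits: int = 7) -> float:
    """Approximate chance that at least two of num_objects share a prefix.

    Assumes uniformly random IDs and uses the birthday approximation
    1 - exp(-n(n-1) / (2N)), where N = 16 ** hex_digits.
    """
    space = 16 ** hex_digits  # 268,435,456 possible values for 7 hex digits
    exponent = -num_objects * (num_objects - 1) / (2 * space)
    return 1.0 - math.exp(exponent)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} objects: {collision_probability(n):.4f}")
```

Under these assumptions, a repository with about ten thousand objects already has a noticeable chance of containing at least one pair that shares its first seven characters.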

Comments (3)

流年已逝 2024-12-07 07:13:41

These short forms are just there to simplify visual recognition and to make your life easier. Git doesn't really truncate anything; internally, everything is handled with the complete value. You can use a partial SHA-1 at your convenience, though:

Git is smart enough to figure out what commit you meant to type if you provide the first few characters, as long as your partial SHA-1 is at least four characters long and unambiguous — that is, only one object in the current repository begins with that partial SHA-1.
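The rule in that passage can be sketched in a few lines of Python. This is not Git's implementation, just an illustration of "at least four characters and unambiguous"; `resolve_prefix` and `SAMPLE_IDS` are hypothetical names, and only the first ID comes from this thread (the other two are made up).

```python
from typing import Iterable

MIN_ABBREV = 4  # shortest prefix accepted, per the quoted passage

# The first ID appears in this thread; the other two are made up for the demo.
SAMPLE_IDS = [
    "000182eacf99cde27d5916aa415921924b82972c",
    "0001" + "f" * 36,
    "abcd" + "0" * 36,
]


def resolve_prefix(prefix: str, object_ids: Iterable[str]) -> str:
    """Return the one full ID that starts with prefix, else raise ValueError."""
    if len(prefix) < MIN_ABBREV:
        raise ValueError(f"prefix {prefix!r} is shorter than {MIN_ABBREV} characters")
    wanted = prefix.lower()
    matches = [oid for oid in object_ids if oid.startswith(wanted)]
    if not matches:
        raise ValueError(f"no object starts with {prefix!r}")
    if len(matches) > 1:
        raise ValueError(f"prefix {prefix!r} is ambiguous ({len(matches)} matches)")
    return matches[0]


if __name__ == "__main__":
    print(resolve_prefix("00018", SAMPLE_IDS))
    for bad in ("0001", "000", "beef"):
        try:
            resolve_prefix(bad, SAMPLE_IDS)
        except ValueError as err:
            print(err)
```

The point of the sketch is that the full 40-character IDs are always what is stored and compared; the short form is only a lookup key that must match exactly one of them.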

玉环 2024-12-07 07:13:41

I have a repository that has a commit with an id of 000182eacf99cde27d5916aa415921924b82972c.

git show 00018

shows the revision, but

git show 0001

prints

error: short SHA1 0001 is ambiguous.
error: short SHA1 0001 is ambiguous.
fatal: ambiguous argument '0001': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions

(If you're curious, it's a clone of the git repository for git itself; that commit is one that Linus Torvalds made in 2005.)
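The flip side of that experiment is asking for the shortest prefix that is still unique. Below is a rough Python sketch of the idea; the function name and the second object ID are made up for illustration, and this is not how Git itself computes abbreviations (in a real repository, `git rev-parse --short <commit>` prints a unique abbreviation).

```python
from typing import Iterable


def shortest_unique_abbrev(target: str, object_ids: Iterable[str], minimum: int = 4) -> str:
    """Return the shortest prefix of target (at least `minimum` chars)
    that no other ID in object_ids starts with."""
    others = [oid for oid in object_ids if oid != target]
    for length in range(minimum, len(target) + 1):
        prefix = target[:length]
        if not any(oid.startswith(prefix) for oid in others):
            return prefix
    return target  # only reached if another ID is identical up to the last char


if __name__ == "__main__":
    ids = [
        "000182eacf99cde27d5916aa415921924b82972c",  # commit from the post above
        "0001" + "f" * 36,                           # made-up object sharing "0001"
    ]
    print(shortest_unique_abbrev(ids[0], ids))
```

With the made-up second object in place, four characters are not enough and the sketch falls back to five, which mirrors the `0001` versus `00018` behaviour shown above.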

红焚 2024-12-07 07:13:41

Two notes here:

  • If you type y anywhere on the GitHub page displaying a commit, you will see the full 40 hex characters of said commit.
    That illustrates emboss's point: GitHub doesn't truncate anything.

  • And 7 hex digits (28 bits) hasn't been enough since 2010 anyway.
    See commit dce9648 by Linus Torvalds himself (Oct 2010, git 1.7.4.4):

The default of 7 comes from fairly early in git development, when seven hex digits was a lot (it covers about 250+ million hash values). Back then I thought that 65k revisions was a lot (it was what we were about to hit in BK), and each revision tends to be about 5-10 new objects or so, so a million objects was a big number.

(BK = BitKeeper)

These days, the kernel isn't even the largest git project, and even the kernel has about 220k revisions (much bigger than the BK tree ever was) and we are approaching two million objects. At that point, seven hex digits is still unique for a lot of them, but when we're talking about just two orders of magnitude difference between number of objects and the hash size, there will be collisions in truncated hash values. It's no longer even close to unrealistic - it happens all the time.

We should both increase the default abbrev that was unrealistically small, and add a way for people to set their own default per-project in the git config file.
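The "two million objects" figure can be turned into a back-of-the-envelope estimate. The sketch below assumes uniformly distributed IDs and uses illustrative function names; it counts the expected number of object pairs that share an abbreviation of a given length, and finds the shortest length at which that expectation drops below one.

```python
from math import comb


def expected_colliding_pairs(num_objects: int, hex_digits: int) -> float:
    """Expected number of object pairs sharing the same prefix,
    assuming uniformly random IDs: C(n, 2) / 16 ** hex_digits."""
    return comb(num_objects, 2) / 16 ** hex_digits


def min_safe_length(num_objects: int, max_digits: int = 40) -> int:
    """Shortest prefix length whose expected colliding-pair count is below 1."""
    for digits in range(1, max_digits + 1):
        if expected_colliding_pairs(num_objects, digits) < 1:
            return digits
    return max_digits


if __name__ == "__main__":
    n = 2_000_000  # roughly the kernel's object count mentioned in the quote
    for digits in (7, 8, 10, 12):
        print(digits, expected_colliding_pairs(n, digits))
    print("expected pairs fall below one at", min_safe_length(n), "digits")
```

At seven digits the estimate comes out in the thousands of colliding pairs for a repository of that size, which fits the remark that collisions "happen all the time".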
