How does Git compute file hashes?
The SHA1 hashes stored in the tree objects (as returned by git ls-tree) do not match the SHA1 hashes of the file content (as returned by sha1sum):
$ git cat-file blob 4716ca912495c805b94a88ef6dc3fb4aff46bf3c | sha1sum
de20247992af0f949ae8df4fa9a37e4a03d7063e -
How does Git compute file hashes? Does it compress the content before computing the hash?
$ echo -e 'blob 14\0Hello, World!' | shasum
8ab686eafeb1f44702738c8b0f24f2567c36da6d
Source: http://alblue.bandlem.com/2011/08/git-tip-of-week-objects.html
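The same computation can be sketched in Python. This is only an illustration of the command above, not part of the linked article; the function name git_blob_hash is made up for the example.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Hash content the way the echo | shasum pipeline above does."""
    # Header: the word "blob", a space, the size in bytes, then a NUL byte.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# echo appends a newline, so the hashed content is 14 bytes long.
print(git_blob_hash(b"Hello, World!\n"))
# prints 8ab686eafeb1f44702738c8b0f24f2567c36da6d
```

Note that nothing is compressed before hashing here: the digest is taken over the header plus the raw content.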
I am only expanding on the answer by @Leif Gruenwoldt and detailing what is in the reference he provided.
Do It Yourself..
How does GIT compute its commit hashes
The text blob⎵ is a constant prefix, and \0 is also constant and is the NULL character. The <size_of_file> and <contents_of_file> vary depending on the file. In other words, the hash is the SHA1 of blob⎵, followed by <size_of_file>, then \0, then <contents_of_file>.
See: What is the file format of a git commit object?
And that's all, folks!
But wait! Did you notice that the <filename> is not a parameter used for the hash computation? Two files have the same hash if their contents are the same, regardless of their names and of the date and time they were created. This is one of the reasons Git handles moves and renames better than other version control systems.
Do It Yourself (Ext)
Note: The link does not mention how the tree object is hashed. I am not certain of the algorithm and parameters; however, from my observation it probably computes a hash based on all the blobs and trees (probably their hashes) it contains.
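The guess about tree objects can be made concrete with a small sketch. It rests on my understanding of the tree format, which the linked article does not cover: each entry is the mode, a space, the name, a NUL byte, and the 20 raw bytes of the entry's hash, and the whole body gets a "tree <size>\0" header. The names git_object_hash and git_tree_hash are hypothetical helpers, not Git APIs.

```python
import hashlib


def git_object_hash(obj_type: bytes, body: bytes) -> str:
    """SHA-1 over '<type> <size>\\0' followed by the object body."""
    header = obj_type + b" %d\0" % len(body)
    return hashlib.sha1(header + body).hexdigest()


def _sort_key(entry):
    mode, name, _ = entry
    # Assumption: Git orders entries by name bytes and compares
    # directory names (mode 40000) as if they ended in "/".
    return name.encode() + (b"/" if mode == "40000" else b"")


def git_tree_hash(entries) -> str:
    """entries: iterable of (mode, name, hex_sha) tuples."""
    body = b""
    for mode, name, hex_sha in sorted(entries, key=_sort_key):
        # The entry's hash is stored as 20 raw bytes, not as hex text.
        body += mode.encode() + b" " + name.encode() + b"\0"
        body += bytes.fromhex(hex_sha)
    return git_object_hash(b"tree", body)
```

In this sketch the file name appears only in the tree entry, never in the blob, which is consistent with the observation that renaming a file leaves its blob hash unchanged.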
git hash-object
This is a quick way to verify your test method: the hash you compute by hand with sha1sum (which is in GNU Coreutils) should match the one git hash-object prints for the same content.
Then it comes down to understanding the format of each object type. We have already covered the trivial blob; the others are commit, tree and tag.
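The cross-check described above can also be scripted. The sketch below is an illustration, not the commands from the answer: manual_hash and git_cli_hash are made-up names, and the second function simply shells out to git hash-object --stdin when a git executable is found on the PATH.

```python
import hashlib
import shutil
import subprocess


def manual_hash(obj_type: str, body: bytes) -> str:
    """Hash '<type> <size>\\0<body>' by hand."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def git_cli_hash(body: bytes):
    """Ask git itself for the blob hash; returns None if git is missing."""
    if shutil.which("git") is None:
        return None
    result = subprocess.run(
        ["git", "hash-object", "--stdin"],
        input=body,
        capture_output=True,
        check=True,
    )
    return result.stdout.decode().strip()
```

Calling manual_hash("blob", data) and git_cli_hash(data) on the same bytes should give the same string, assuming the default SHA-1 object format.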
I needed this for some unit tests in Python 3 so thought I'd leave it here.
I stick to \n line endings everywhere, but in some circumstances Git might also be changing your line endings before calculating this hash, so you may need a .replace('\r\n', '\n') in there too.
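A minimal sketch of such a helper for text follows. It is not the answer's original code: git_text_hash and its normalize_newlines parameter are hypothetical, and UTF-8 encoding is an assumption. Whether Git actually converts line endings depends on settings such as core.autocrlf.

```python
import hashlib


def git_text_hash(text: str, normalize_newlines: bool = True) -> str:
    """Blob hash of a text string, optionally converting CRLF to LF first."""
    if normalize_newlines:
        text = text.replace("\r\n", "\n")
    data = text.encode("utf-8")  # assumption: the file is stored as UTF-8
    # The size in the header counts bytes, not characters.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()
```

With normalization on, "Hello, World!\r\n" and "Hello, World!\n" hash to the same value; with it off, they differ because the extra \r changes both the size and the content.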
Based on Leif Gruenwoldt's answer, here is a shell function substitute for git hash-object:
Test:
This is a python3 version for binary hash calculation (the above example is for text).
For readability, put this code in your own def.
Also note that the code is a snippet, not a complete script. It is meant as inspiration.
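Hashing a file in binary mode might look like the following sketch. It is not the snippet this answer refers to; git_file_hash and chunk_size are names chosen for illustration. The file is read in chunks so large files do not have to fit in memory.

```python
import hashlib
import os


def git_file_hash(path: str, chunk_size: int = 65536) -> str:
    """Blob hash of a file, read as raw bytes."""
    sha = hashlib.sha1()
    # The header needs the size up front, so take it from the filesystem.
    sha.update(b"blob %d\0" % os.path.getsize(path))
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()
```

Because the file is opened with "rb", no newline translation or decoding happens, so the result reflects the bytes on disk.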
Git 2.45 (Q2 2024), batch 10 now offers official documentation on this.
See commit 28636d7 (12 Mar 2024) by Dirk Gouders (dgouders-whs).
(Merged by Junio C Hamano -- gitster -- in commit 509a047, 21 Mar 2024)
user-manual now includes in its man page: