How to recover Git objects damaged by hard disk failure?
I have had a hard disk failure which resulted in some files of a Git repository getting damaged. When running git fsck --full I get the following output:
error: .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack SHA1 checksum mismatch
error: index CRC mismatch for object 6c8cae4994b5ec7891ccb1527d30634997a978ee from .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack at offset 97824129
error: inflate: data stream error (invalid code lengths set)
error: cannot unpack 6c8cae4994b5ec7891ccb1527d30634997a978ee from .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack at offset 97824129
error: inflate: data stream error (invalid stored block lengths)
error: failed to read object 0dcf6723cc69cc7f91d4a7432d0f1a1f05e77eaa at offset 276988017 from .git/objects/pack/pack-6863e0a0e4b4ded6090fac5d12eba6ca7346b19c.pack
fatal: object 0dcf6723cc69cc7f91d4a7432d0f1a1f05e77eaa is corrupted
I have backups of the repository, but in the only backup that includes the pack file, the pack is already damaged. So I think I have to find a way to retrieve the individual objects from different backups and somehow instruct Git to produce a new pack with only correct objects.
Can you please give me hints on how to fix my repository?
In some previous backups, your bad objects may have been packed in different files, or may still be loose objects. So your objects may be recoverable.
It seems there are a few bad objects in your database, so you could do it the manual way.
Because git hash-object, git mktree and git commit-tree do not write objects that are already found in a pack, start by moving your packs out of the repository and unpacking them into it again, so that only the good objects are left in the database.
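The original commands for this step did not survive on this page, so here is a minimal sketch of what "move the packs out and unpack them again" could look like. It builds a throwaway repository so it can be run safely; the `stash` directory and the file names are made up for illustration. On a real repository you would only need the `mv` and the loop, and only after taking a backup.

```shell
set -eu

# Throwaway repository with one commit, packed, so there is a pack to work on.
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "hello" > file.txt
git add file.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"
git gc -q    # packs the loose objects into .git/objects/pack

# Move the packs (and their index files) out of the object database.
# git unpack-objects skips objects that already exist in the repository,
# which is why the packs must not be visible to Git while unpacking.
stash=$(mktemp -d)    # hypothetical holding directory
mv .git/objects/pack/* "$stash"/

for p in "$stash"/*.pack; do
    # -r: do not stop at the first corrupt entry, recover as many objects as possible
    git unpack-objects -r < "$p" || echo "some objects in $p could not be unpacked"
done

# Everything that could be read is now stored as loose objects.
git fsck --full
```

With a genuinely damaged pack, the final fsck would list the objects that are still missing; those are the ones to fetch from backups.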
You can then run git cat-file -t on the ID of each bad object and check the type of the object.
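As a small illustration of the type check, the sketch below writes a blob into a scratch repository and asks Git for its type. The repository and the content are placeholders; in practice you would pass one of the IDs reported by git fsck.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Write a blob straight into the object database and remember its ID.
oid=$(echo "sample content" | git hash-object -w --stdin)

# Ask Git what kind of object that ID names: blob, tree, commit or tag.
type=$(git cat-file -t "$oid")
echo "$oid is a $type"
```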
If the type is blob: retrieve the contents of the file from previous backups (with git show, git cat-file or git unpack-file); then you may use git hash-object -w to rewrite the object in your current repository.
If the type is tree: you could use git ls-tree to recover the tree from previous backups; then git mktree to write it again in your current repository.
If the type is commit: the same with git show, git cat-file and git commit-tree.
Of course, I would back up your original working copy before starting this process.
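The three cases above could be sketched roughly as follows. Two scratch repositories stand in for an intact backup and the damaged repository; all paths and contents are invented for the example. Note one deliberate substitution: for the commit, the sketch pipes the raw commit through git hash-object -t commit instead of using git commit-tree, because that reproduces the object byte for byte without having to re-enter author and date information.

```shell
set -eu

backup=$(mktemp -d)     # stands in for an intact backup of the repository
current=$(mktemp -d)    # stands in for the damaged repository
git init -q "$backup"
git init -q "$current"

# Create a blob, a tree and a commit in the backup; in real use these already exist.
blob=$(echo "precious data" | git -C "$backup" hash-object -w --stdin)
tree=$(printf '100644 blob %s\tdata.txt\n' "$blob" | git -C "$backup" mktree)
commit=$(git -C "$backup" -c user.name=demo -c user.email=demo@example.com \
    commit-tree -m "demo commit" "$tree")

# Blob: stream the raw contents out of the backup and write them into the current repo.
restored_blob=$(git -C "$backup" cat-file blob "$blob" \
    | git -C "$current" hash-object -w --stdin)

# Tree: ls-tree output is exactly the input format that mktree expects.
restored_tree=$(git -C "$backup" ls-tree "$tree" | git -C "$current" mktree)

# Commit: copy the raw commit object as-is (substitute for git commit-tree).
restored_commit=$(git -C "$backup" cat-file commit "$commit" \
    | git -C "$current" hash-object -t commit -w --stdin)

echo "blob:   $restored_blob"
echo "tree:   $restored_tree"
echo "commit: $restored_commit"
```

If each restored ID equals the ID that fsck reported as bad, the object is back in place.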
Also, take a look at How to Recover Corrupted Blob Object (howto/recover-corrupted-blob-object.txt in the Git documentation).
Banengusk put me on the right track. For further reference, I want to post the steps I took to fix my repository corruption. I was lucky enough to find all needed objects either in older packs or in repository backups.
Try the following commands first (re-run if needed):
If you still have problems, you can try to:
remove all the corrupt objects, e.g.
remove all the empty objects, e.g.
check a "broken link" message by running:
git ls-tree 2d9263c6d23595e7cb2a21e5ebbb53655278dff8
This will tell you what file the corrupt blob came from!
to recover the file, you might be really lucky, and it may be the version that you already have checked out in your working tree: run git hash-object -w on that file, and if it outputs the missing SHA1 (4b945..) you're all done!
assuming that it was some older version that was broken, the easiest way is to run git log on that file, which will show you the whole log for that file (please realize that the tree you had may not be the top-level tree, so you need to figure out which subdirectory it was in on your own); you can then recreate the missing object with hash-object again.
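A rough sketch of the "lucky case" check: compare the hash of each tracked file in the working tree with the missing ID, and write the object back when one matches. The repository, file name and the `missing` variable are made up for the demo; the git log flags at the end are one reasonable way to list the blob IDs of older versions, not necessarily the ones the original answer used.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "version 1" > notes.txt
git add notes.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "add notes"

# Pretend this is the blob ID that fsck reported as missing.
missing=$(git rev-parse HEAD:notes.txt)

# hash-object without -w only computes the ID; with -w it also stores the object.
# (The simple for loop assumes file names without whitespace.)
found=""
for f in $(git ls-files); do
    if [ "$(git hash-object "$f")" = "$missing" ]; then
        found=$f
        git hash-object -w "$f" > /dev/null
    fi
done
echo "blob $missing matches: ${found:-nothing}"

# For an older version: the raw log lists the blob IDs before and after each change.
git log --raw --all --full-history --abbrev=40 -- notes.txt
```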
to get a list of all refs with missing commits, trees or blobs:
It may not be possible to remove some of those refs using the regular branch -d or tag -d commands, since they will die if git notices the corruption. So use the plumbing command git update-ref -d $ref instead. Note that in case of local branches, this command may leave stale branch configuration behind in .git/config. It can be deleted manually (look for the [branch "$ref"] section).
After all refs are clean, there may still be broken commits in the reflog. You can clear all reflogs using git reflog expire --expire=now --all. If you do not want to lose all of your reflogs, you can search the individual refs for broken reflogs:
(Note the added -g option to git rev-list.) Then, use git reflog expire --expire=now $ref on each of those.
When all broken refs and reflogs are gone, run git fsck --full in order to check that the repository is clean. Dangling objects are Ok.
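The command for listing broken refs is missing from this page, so the following is only a sketch of one way to do it: walk every ref with git rev-list --objects and report the refs for which the walk fails. The demo repository, the branch name `broken-demo` and the simulated damage (deleting a loose tree object) are all invented for illustration.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "a" > a.txt
git add a.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "first"
git branch broken-demo    # hypothetical branch name

# Simulate damage by deleting the loose object of the root tree.
tree=$(git rev-parse 'HEAD^{tree}')
rm -f ".git/objects/$(echo "$tree" | cut -c1-2)/$(echo "$tree" | cut -c3-)"

# A ref is considered broken if walking everything reachable from it fails.
bad_refs=""
for ref in $(git for-each-ref --format='%(refname)'); do
    if ! git rev-list --objects "$ref" > /dev/null 2>&1; then
        bad_refs="$bad_refs $ref"
    fi
done
echo "refs with missing objects:$bad_refs"

# Each of these could then be removed with the plumbing command:
#   git update-ref -d <ref>
```

Adding -g to the rev-list call makes the same loop walk the reflog of each ref instead, which matches the reflog check described above.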
Below you can find advanced usage of commands which can cause data loss in your git repository if not used wisely, so make a backup before you accidentally do further damage to your repository. Try at your own risk if you know what you're doing.
To pull the current branch on top of the upstream branch after fetching:
You may also try to check out a new branch and delete the old one:
To find the corrupted object in git for removal, try the following command:
For OSX, use sed -E instead of sed -r.
Another idea is to unpack all objects from pack files to regenerate all objects inside .git/objects, so try to run the following commands within your repository:
If the above doesn't help, you may try to rsync or copy the git objects from another repo, e.g.
To fix the broken branch when trying to checkout as follows:
Try to remove it and checkout from upstream again:
In case git gets you into a detached state, check out master and merge the detached branch into it.
Another idea is to rebase the existing master recursively:
See also:
Here are the steps I followed to recover from a corrupt blob object.
1) Identify corrupt blob
Corrupt blob is 241091723c324aed77b2d35f97a05e856b319efd
2) Move corrupt blob to a safe place (just in case)
3) Get parent of corrupt blob
Parent hash is 0716831e1a6c8d3e6b2b541d21c4748cc0ce7180.
4) Get file name corresponding to corrupt blob
Find this particular file in a backup or in the upstream git repository (in my case it is dump.tar.gz). Then copy it somewhere inside your local repository.
5) Add previously corrupted file in the git object database
6) Celebrate!
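The commands behind these steps were lost from this page. As a hedged reconstruction, the walk-through below replays steps 1 to 5 on a scratch repository. Only the file name dump.tar.gz comes from the answer; the contents, the `safe` directory and the simulated damage are assumptions made for the demo.

```shell
set -eu

repo=$(mktemp -d)
safe=$(mktemp -d)    # hypothetical safe place outside the repository
cd "$repo"
git init -q .
echo "archive contents" > dump.tar.gz    # made-up contents
cp dump.tar.gz "$safe/good-copy"         # plays the role of the backup copy
git add dump.tar.gz
git -c user.name=demo -c user.email=demo@example.com commit -q -m "add dump"

# 1) + 2) Identify the bad blob and move its loose object file out of the way.
#         (Here the object is intact; moving it away simulates the damage.)
blob=$(git rev-parse HEAD:dump.tar.gz)
mv ".git/objects/$(echo "$blob" | cut -c1-2)/$(echo "$blob" | cut -c3-)" "$safe/"

# 3) fsck now reports a broken link from the parent tree to the missing blob.
git fsck --full 2>&1 | grep -i "broken link" || true

# 4) List the parent tree to find the file name that belongs to the blob.
tree=$(git rev-parse 'HEAD^{tree}')
name=$(git ls-tree "$tree" | grep "$blob" | cut -f2)

# 5) Copy the good file into place and write it back into the object database.
cp "$safe/good-copy" "$name"
restored=$(git hash-object -w "$name")

# The restored ID must equal the missing one, and fsck should be clean again.
echo "restored $name as $restored"
git fsck --full
```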
Git checkout can actually pick out individual files from a revision. Just give it the commit hash and the file name. More detailed info here.
I guess the easiest way to fix this safely is to revert to the newest uncommitted backup and then selectively pick out uncorrupted files from newer commits. Good luck!
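For reference, picking a single file out of an older revision could look like the sketch below. The repository, the file name settings.conf and its contents are placeholders.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "v1" > settings.conf
git add settings.conf
git -c user.name=demo -c user.email=demo@example.com commit -q -m "v1"
echo "v2" > settings.conf
git -c user.name=demo -c user.email=demo@example.com commit -q -am "v2"

# Restore just this one file as it was in the previous commit.
# The "--" separates the revision from the path.
old=$(git rev-parse HEAD~1)
git checkout -q "$old" -- settings.conf

cat settings.conf
```

The file is updated in both the working tree and the index, so it shows up as a staged change ready to be committed.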
Here are two functions that may help if your backup is corrupted, or you have a few partially corrupted backups as well (this may happen if you backup the corrupted objects).
Run both in the repo you're trying to recover.
Standard warning: only use if you're really desperate and you have backed up your (corrupted) repo. This might not resolve anything, but at least should highlight the level of corruption.
and
The solution by Daniel Fanjul looked promising. I was able to find that blob file and extract it ("git fsck --full --no-dangling", "git cat-file -t {hash}", "git show {hash} > file.tmp"), but when I tried to update the pack file with "git hash-object -w file.tmp", it displayed the correct hash BUT the error remained.
So I decided to try a different approach. I could simply delete the local repository and download everything from the remote, but some branches in the local repository were 8 commits ahead and I did not want to lose those changes. Since it was only a tiny 6kb mp3 file, I decided to delete it completely. I tried many ways but the best was from here: https://itextpdf.com/en/blog/technical-notes/how-completely-remove-file-git-repository
I got the file name by running this command "git rev-list --objects --all | grep {hash}". Then I did a backup (strongly recommended, because I failed 3 times) and then ran the command:
"java -jar bfg.jar --delete-files {filename} --no-blob-protection ."
You can get bfg.jar file from here https://rtyley.github.io/bfg-repo-cleaner/ so according to documentation I should run this command next:
"git reflog expire --expire=now --all && git gc --prune=now --aggressive"
When I did so, I got errors on the last step. So I recovered everything from backup and this time, after removing the file, I checked out the branch (which was causing that error), then checked out main again, and only after that ran the commands one after the other:
"git reflog expire --expire=now --all"
"git gc --prune=now --aggressive"
Then I added my file back to its location and committed. However, since many local commits were changed, I was not able to push anything to the server. So I backed up everything on the server (in case I screwed it up), checked out the affected branch and ran the command "git push --force".
What did I learn from this case? Git is great but so sensitive... I should have an option to simply disregard one f... 6kb file when I know what I am doing. I have no clue why "git hash-object -w" did not work either =( Lessons learnt: push all commits, do not wait, and back up the repository from time to time. Also, I now know how to remove files from a repository, if I ever need to =)
I hope this saves someone's time
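The file-name lookup used in this answer can be tried out on a scratch repository like this. The path sounds/beep.mp3 and its contents are invented; the rev-list and grep pipeline is the one quoted above, with cut added to keep only the path.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir sounds
printf 'not really audio' > sounds/beep.mp3    # placeholder file
git add sounds/beep.mp3
git -c user.name=demo -c user.email=demo@example.com commit -q -m "add sound"

# Pretend this is the ID of the blob reported as corrupt.
blob=$(git rev-parse HEAD:sounds/beep.mp3)

# rev-list --objects prints "<id> <path>" for every reachable tree and blob,
# so filtering on the ID reveals the path it was stored under.
path=$(git rev-list --objects --all | grep "^$blob " | cut -d' ' -f2-)
echo "blob $blob is $path"
```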
I have resolved this problem by adding some change, like git add -A and git commit again.