Git says "Binary files a... and b... differ" for *.reg files
Is there a way to force Git into treating .reg files as text? I am using Git to track my Windows registry tweaks, and Windows uses .reg for these files.
UPDATE 1: I got it to run a diff (thanks, Andrew). However, now it looks like this below. Is this an encoding issue?
index 0080fe3..fc51807 100644
--- a/Install On Rebuild/4. Registry Tweaks.reg
+++ b/Install On Rebuild/4. Registry Tweaks.reg
@@ -1,49 +1,48 @@
-<FF><FE>W^@i^@n^@d^@o^@w^@s^@ ^@R^@e^@g^@i^@s^@t^@r^@y^@ ^@E^@d^@i^@t^@o^@r^@
-^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;^@;
-^@^M^@
...
Any ideas?
UPDATE 2: Thanks to all who helped. Here's what I did in the end: I created a .gitattributes file with the content *.reg text diff, and then I converted the files to UTF-8, since UTF-16 is weird with diffs. I'm not using any foreign characters, so UTF-8 works for me.
To tell Git to explicitly diff a file type, put a line such as *.reg diff in a .gitattributes file in your repository's root directory.
Quick Answer
As others have pointed out, this issue is caused by an encoding mix-up. You have two options:
1. Change the file encoding to UTF-8 by re-saving the file accordingly.
2. Create a .gitattributes file and include the following line: *.reg working-tree-encoding=UTF-16LE-BOM eol=CRLF
Cause
By default, registry exports from the Windows Registry Editor are saved in a particular UTF-16 encoding (little-endian with a byte order mark).
Under the hood, Git only recognizes ASCII and its supersets (such as UTF-8) as text, so when Git sees a UTF-16 encoded file, it sees a lot of unexpected NUL bytes and treats the file as binary.
Asking Git to treat the file as text by setting a *.reg diff attribute doesn't help, because Git still assumes the wrong encoding. That's why you saw all of those ^@ characters.
Solutions
One solution that others have suggested is to save the UTF-16 files as UTF-8, and that totally works! It does have one big disadvantage, though: if you have a lot of .reg files, or you want to re-export a key from the Registry Editor, you'll have to re-save it with the correct encoding every time.
Alternatively, you can tell Git what encoding you plan to use with the working-tree-encoding attribute. When this is specified, Git converts a text file to UTF-8 as it is committed to the repository, and converts it back to the original encoding as it gets checked out. That way, the file always has the original encoding when it appears in your working directory. If you're familiar with end-of-line normalization, the behavior is similar.
If you take this route, there are a few pitfalls to be aware of, as listed in the gitattributes documentation:
1. Older Git versions and alternative Git implementations (such as JGit or libgit2) may not support the attribute.
2. Re-encoding content to non-UTF encodings can cause errors, because the conversion might not be round-trip safe.
3. Re-encoding takes resources and can slow down operations such as git add and git checkout.
For these reasons, the documentation recommends using this attribute only if the file cannot be stored usefully as UTF-8, but depending on your use case these pitfalls may not concern you.
Finally, when using this attribute it's important to also specify which end-of-line characters are in use, to avoid ambiguity. That's done with the eol attribute.
Putting it all together, I recommend you try creating a .gitattributes file in your repository's root and including the following line:
*.reg working-tree-encoding=UTF-16LE-BOM eol=CRLF
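To see why a UTF-16 export trips Git's binary detection, here is a small Python sketch. The looks_binary helper is hypothetical and only approximates Git's heuristic, which (to my knowledge) looks for a NUL byte near the start of the file; the header string is just sample content.

```python
import codecs


def looks_binary(data: bytes, first_few_bytes: int = 8000) -> bool:
    """Rough approximation of Git's check: any NUL byte near the start."""
    return b"\x00" in data[:first_few_bytes]


header = "Windows Registry Editor Version 5.00\r\n"

# UTF-16LE with a byte order mark, matching the <FF><FE> seen in the question's diff.
utf16_bytes = codecs.BOM_UTF16_LE + header.encode("utf-16-le")
# The same text as UTF-8, which for pure ASCII content contains no NUL bytes.
utf8_bytes = header.encode("utf-8")

print(utf16_bytes[:12])           # BOM, then each ASCII character followed by a NUL
print(looks_binary(utf16_bytes))  # True
print(looks_binary(utf8_bytes))   # False
```

The NUL after every ASCII character is what shows up as ^@ in the diff output once Git is forced to treat the file as text.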
Git is treating your registry export files as binary files because they contain NULs. There is no good way to diff or merge general binary files. A change of one byte can change the interpretation of the rest of the file.
There are two general approaches to handling binary files:
1. Accept that they're binary. Diffs aren't going to be meaningful, so don't ask for them. Don't ever merge them, which means only allowing changes on one branch. In this case, this can be made easier by putting each tweak (or set of related tweaks) in a separate file, so there are fewer ways for differences to occur in one file.
2. Store the changes as text, and convert to and from these binary forms.
Even though these are "text" files, the UTF-16 encoding contains NULs. There appear to be no non-ASCII characters, however. Can you convert them to ASCII (or UTF-8, which will be identical to ASCII if there are no extended characters)?
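The second approach (store as text, convert back when needed) could be sketched in Python roughly as follows. The function names, file names, and sample content are all hypothetical; the sketch assumes the exports are UTF-16 with a BOM and should go back to UTF-16LE with a BOM.

```python
import tempfile
from pathlib import Path

BOM_UTF16_LE = b"\xff\xfe"


def reg_to_utf8(src: Path, dst: Path) -> None:
    """Decode a UTF-16 export (BOM-aware) and store it as UTF-8 for Git."""
    text = src.read_bytes().decode("utf-16")
    dst.write_bytes(text.encode("utf-8"))


def utf8_to_reg(src: Path, dst: Path) -> None:
    """Re-encode the stored UTF-8 text as UTF-16LE with a BOM."""
    text = src.read_bytes().decode("utf-8")
    dst.write_bytes(BOM_UTF16_LE + text.encode("utf-16-le"))


# Round trip on a throwaway sample file.
sample = "Windows Registry Editor Version 5.00\r\n"
with tempfile.TemporaryDirectory() as tmp:
    exported = Path(tmp) / "tweak.reg"
    stored = Path(tmp) / "tweak.utf8.reg"
    restored = Path(tmp) / "tweak.restored.reg"
    exported.write_bytes(BOM_UTF16_LE + sample.encode("utf-16-le"))
    reg_to_utf8(exported, stored)
    utf8_to_reg(stored, restored)
    stored_bytes = stored.read_bytes()
    round_trip_ok = restored.read_bytes() == exported.read_bytes()

print(b"\x00" in stored_bytes)  # False: the UTF-8 copy has no NULs
print(round_trip_ok)            # True: converting back reproduces the export
```

Working on bytes rather than text-mode files keeps the CRLF line endings untouched in both directions.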
Create a script named utf16toascii.py:
Then, in bash, run:
After that, you can diff registry files, as well as Xcode .strings files or any other UTF-16 file.
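A minimal sketch of what a script like utf16toascii.py might contain, assuming it is meant to be used as a Git textconv filter (Git passes the file path as an argument and reads the converted text from stdout). This is an illustration under that assumption, not the script from this answer.

```python
#!/usr/bin/env python3
"""Hypothetical utf16toascii.py: print a UTF-16 file as ASCII text."""
import sys


def utf16_to_ascii(data: bytes) -> bytes:
    # Decode using the BOM; anything outside ASCII is replaced with "?".
    return data.decode("utf-16").encode("ascii", "replace")


def main(argv) -> int:
    # The file to convert is expected as the last command-line argument.
    with open(argv[-1], "rb") as handle:
        sys.stdout.buffer.write(utf16_to_ascii(handle.read()))
    return 0


# Only run when a file path was actually passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

One way to wire it up, assuming a driver name of utf16 (hypothetical): point the diff.utf16.textconv Git config setting at the script, and add *.reg diff=utf16 to .gitattributes.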
Convert .reg files from UTF-16 to UTF-8 by opening each .reg file in Notepad and saving it with the encoding set to UTF-8.
Manually assign iconv for diffing
Another answer suggested trying
*.reg working-tree-encoding=UTF-16LE-BOM eol=CRLF
This did not work for me. I'm on Windows 10, and TortoiseGit didn't realize that the file was actually unchanged.
I have a bat file that dumps the Tomcat registry settings to disk, and after running it the TortoiseGit icon would always be red. But it would immediately turn green if I ran git status on the command line. I'm not sure what's going on there.
So I wound up with something else. I don't touch the internal encoding; I just manually define a diffing program to use for UTF-16 files, and then I manually assign .reg files to use it. This works for me on git-bash-for-windows, and I don't have the problem in Windows Explorer with the red "this-has-changed" TortoiseGit icon overlay.
There are two steps:
Define diffing program:
Assign .reg files to use that:
More details below.
I have a batch file which does this:
Now I ran this export batch file, then I changed the "timeout" value from 60 seconds to 66 seconds via Tomcat9 GUI and then I ran the export batch file again.
Before: you get no textual diff. You just get "Binary files ... differ"
After: you get an actual diff.
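The before/after difference can be illustrated without Git at all. The Python sketch below stands in for the "decode first, then diff" idea behind an iconv-based diff program, using difflib instead of git diff and iconv. The registry content, value name, and file names are made up for the example.

```python
import difflib


def encode_reg(text: str) -> bytes:
    """Encode text the way a registry export is stored: UTF-16LE with a BOM."""
    return b"\xff\xfe" + text.encode("utf-16-le")


def text_diff(old: bytes, new: bytes) -> list:
    """Decode both UTF-16 blobs, then produce a line-based unified diff."""
    old_lines = old.decode("utf-16").splitlines()
    new_lines = new.decode("utf-16").splitlines()
    return list(
        difflib.unified_diff(
            old_lines, new_lines, "a/export.reg", "b/export.reg", lineterm=""
        )
    )


# Hypothetical export before and after changing a timeout from 60 to 66 seconds.
before = encode_reg(
    'Windows Registry Editor Version 5.00\r\n\r\n"Timeout"=dword:0000003c\r\n'
)
after = encode_reg(
    'Windows Registry Editor Version 5.00\r\n\r\n"Timeout"=dword:00000042\r\n'
)

diff = text_diff(before, after)
for line in diff:
    print(line)
```

In Git itself, the equivalent setup would presumably be a diff.<driver>.textconv setting that runs iconv from UTF-16 to UTF-8, plus a matching diff=<driver> attribute for *.reg in .gitattributes.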