Checking in changes to the UTF-8 BOM with git

Posted 2024-11-16 17:01:20, 161 words, 3 views, 0 comments

I accidentally checked in a UTF-8 encoded text file from Windows without first removing the BOM. I have now removed the BOM in a later version and tried to check in that change, but git seems to ignore the change to the BOM bytes. Is there a setting that makes git check in the file exactly as it is? (I know there is a similar issue with line endings, and that there is a setting for that one.)

Comments (3)

巷雨优美回忆 2024-11-23 17:01:20

If you can make this reproducible, by all means report a bug.

Here are my two cents, using a UTF-32 file with and without a BOM:

xxd -r > raw <<< "0000000: 4865 c582 c397 c3b8 0a                   He......."
cat raw # shows "Heł×ø" in UTF8 terminals

git init .
iconv -t UTF32BE raw  > test
git add test
git commit -m nobom
iconv -t UTF32 raw  > test
git diff # reports: "Binary files a/test and b/test differ"
git commit -am bom

Verify that different objects are present:

git rev-list --objects --all
1d0cf0c1871a8743f947bd4582198db4fc1e72b1
c52c2a8c211a0031e01eef5d5121d5d0b4aabc40
4740254f8f52094afc131040afc80bb68265e78c 
fd3c513224525b3ab94a2512cbbfa918793640eb test
2d9da153c5febf0425437395227381d3a4784154 
2e54d36463fee81e89423d7d80ccc5d7003aba21 test

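Or, slightly more directly:

for h in $(git rev-list --all -- test); do git ls-tree "$h"; done

This prints one "100644 blob <id>    test" line per commit. With the loop variable used consistently, the ids are the two blob ids for test from the listing above: fd3c513224525b3ab94a2512cbbfa918793640eb and 2e54d36463fee81e89423d7d80ccc5d7003aba21.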

This is with git 1.7.4.1 on 64-bit Ubuntu.


xxd test # no bom:
0000000: 0000 0048 0000 0065 0000 0142 0000 00d7  ...H...e...B....
0000010: 0000 00f8 0000 000a                      ........

xxd test # with bom
0000000: fffe 0000 4800 0000 6500 0000 4201 0000  ....H...e...B...
0000010: d700 0000 f800 0000 0a00 0000            ............
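
The reproduction above uses UTF-32, while the question is about the UTF-8 BOM (bytes EF BB BF). Here is a minimal sketch of the same check for UTF-8. It assumes a scratch repository created with mktemp and a placeholder committer identity, neither of which comes from the original post:

```shell
set -eu

# Scratch repository (hypothetical location, created only for this check).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"

# Version 1: UTF-8 BOM (EF BB BF, written as octal escapes) followed by text.
printf '\357\273\277hello\n' > test
git add test
git commit -q -m "with bom"
with_bom=$(git rev-parse HEAD:test)

# Version 2: same text, BOM removed. Nothing else changes.
printf 'hello\n' > test
git status --porcelain        # should list test as modified
git commit -q -am "without bom"
without_bom=$(git rev-parse HEAD:test)

# Two different blob ids mean git stored the BOM-only change.
echo "with bom:    $with_bom"
echo "without bom: $without_bom"
```

If the two ids differ on your setup too, git itself is not dropping the BOM, and the cause is more likely the editor or a conversion filter configured for the repository.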

没有你我更好 2024-11-23 17:01:20

git does not ignore the byte order mark (BOM) sequence, and it is possible to commit a change that only removes the BOM. Tested with a UTF-8 XML file.

I removed the BOM on Windows in Visual Studio 2017 through File -> Save As -> Save with Encoding -> Unicode (UTF-8 without signature). git sees the change, and it can be committed.
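
For anyone without Visual Studio, the same removal can be sketched on the command line. The helper names has_utf8_bom and strip_utf8_bom are made up for this example, and the sketch assumes a head that accepts -c (GNU and BSD both do):

```shell
# Succeeds if the file starts with the UTF-8 BOM bytes EF BB BF.
has_utf8_bom() {
    [ "$(head -c 3 < "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Rewrites the file in place without its leading BOM; does nothing if there is none.
strip_utf8_bom() {
    if has_utf8_bom "$1"; then
        tmp=$(mktemp)
        # tail -c +4 starts output at the 4th byte, skipping the 3-byte BOM.
        tail -c +4 < "$1" > "$tmp" && cat "$tmp" > "$1"
        rm -f "$tmp"
    fi
}

# Usage, with config.xml standing in for a tracked file:
#   strip_utf8_bom config.xml
#   git diff --stat
#   git commit -am "Remove UTF-8 BOM"
```

Writing the result back with cat rather than mv keeps the file's existing permissions.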

情仇皆在手 2024-11-23 17:01:20

If you can't find a proper solution, you can always add a character to the file, commit, remove both the BOM and the added character, and amend the commit.
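
The workaround above could be sketched as follows. The function name remove_bom_via_amend is made up for this example, and the sketch assumes the file is tracked, starts with a 3-byte UTF-8 BOM, and has no other uncommitted changes:

```shell
# Sketch of the add-a-character / amend workaround for one tracked file.
# Assumes "$1" begins with the UTF-8 BOM (EF BB BF).
remove_bom_via_amend() {
    file=$1
    backup=$(mktemp)
    cp "$file" "$backup"                 # keep the original bytes, BOM included

    printf 'x' >> "$file"                # 1. add a throwaway character
    git add -- "$file"
    git commit -q -m "Temporary change"  # 2. commit it

    # 3. restore the original content minus the BOM (this also drops the 'x')
    tail -c +4 < "$backup" > "$file"
    git add -- "$file"
    git commit -q --amend -m "Remove UTF-8 BOM"   # 4. amend the commit

    rm -f "$backup"
}

# Usage, with test standing in for the tracked file:
#   remove_bom_via_amend test
```

After the amend, the history holds a single new commit whose only change to the file is the BOM removal.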
