To hook or not to hook - git
Our bespoke IDE outputs XML files with an encoding that makes them look like binary files. Diffs and merges of these files fail.
We can create ASCII versions of these files with the tr command. I would like to get to a state where these files are always automatically converted to ASCII before they are committed.
I picked up my copy of Version Control with Git and it wholeheartedly warns me away from using hooks unless I really need to.
Should I be using a hook for this purpose? Or can I do something else to ensure the files are always converted before commit?
Windows XP with msysgit 1.7.4
--= update =--
Thanks everyone for your help and patience. Looking to this question I tried the following, but it does not work:
echo "*.xrp filter=xrp" > .git/info/attributes
git config --global filter.xrp.clean 'tr -cd '\''\11\12\15\40-\176'\'''
git config --global filter.xrp.smudge cat
git checkout --force
The files remain unchanged after this config change. Even when I delete and re-checkout.
The tr command configured as the clean task does work in isolation. Proof:
$ head -n 1 cashflow/repo/C_GMM_CashflowRepo.xrp
ÿþ< ! - - X M L R e p o s i t o r y f i l e 1 . 0 - - >
$ tr -cd '\''\11\12\15\40-\176'\' < cashflow/repo/C_GMM_CashflowRepo.xrp | head -n 1
<!-- XML Repository file 1.0 -->
Can anyone see what is wrong with my config?
Comments (3)
One issue with hooks is that they aren't distributed.
.gitattributes has some directives to manage the diff and content of a file, but another option would be an attribute filter (still in .gitattributes), which could automatically convert those files on commit. (That is, if the clean script is able to detect those files based on their content alone.)
Per this chat discussion, the OP Synesso reports a success:
Note that, for any modification which doesn't concern just one user, but potentially any user cloning that repo, I prefer adding (and committing) an extra .gitattributes file in which the filter is declared, rather than modifying the .git/info/attributes file (which isn't cloned around).
From the gitattributes man page: http://git-scm.com/docs/gitattributes
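As an illustration of the committed .gitattributes approach (a sketch only, not the configuration the OP ended up with), the script below builds a throwaway repository, declares the filter in a tracked .gitattributes, and stages a made-up UTF-16 sample. The temporary repo, the sample.xrp file and its contents are assumptions for the demo; the tr expression is the one from the question. Note that the clean filter runs when a file is staged (git add), not on checkout, so the working-tree copy stays as the IDE wrote it and only the stored blob is converted.

```shell
#!/bin/sh
# Sketch: clean filter declared in a tracked .gitattributes.
# The temp repo and sample.xrp are hypothetical, for demonstration only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Tracked attributes file: this part travels with every clone.
printf '*.xrp filter=xrp\n' > .gitattributes

# The filter commands live in git config, which is NOT cloned;
# every clone still has to define them once.
git config filter.xrp.clean "tr -cd '\11\12\15\40-\176'"
git config filter.xrp.smudge cat

# Fake UTF-16LE file: BOM (octal 377 376), then "<a>\n" padded with NULs.
printf '\377\376<\000a\000>\000\n\000' > sample.xrp

# Staging runs the clean filter on sample.xrp.
git add .gitattributes sample.xrp

# The staged blob is plain ASCII; the working-tree file is untouched.
staged=$(git show :sample.xrp)
echo "$staged"
```

Since the filter definition itself is not part of the repository, each collaborator would need to run the two git config lines (or an equivalent setup script) after cloning.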
phyatt adds in the comments:
Does diff stand a chance of working on them as is (i.e. they just contain a handful of strange bytes but are otherwise text) or not? If it does, you can just force git to treat them as text with .gitattributes. If not, it still might be better to create custom diff and merge scripts (that will use tr as needed to convert) and tell git to use them, again with .gitattributes.
In either case you will not be using hooks (those are for running on particular operations), but .gitattributes, which is file-specific.
If your preferred editing format were ASCII and only your builds required the binary files, I would recommend using build rules to generate the binary version from the preferred source, which you would commit to the repository.
Given that your IDE makes the files in the binary format already, I think the best thing is to store them in the repository in that format.
Rather than hooks, look at git help attributes, especially diff and textconv, which allow you to configure files matching certain patterns to use alternate means of diffing. You should be able to produce working ASCII diffs without having to compromise how you store the files or edit them.
EDIT: Based on your comment elsewhere that "every other byte is 0", that suggests the file is UTF-16 or UCS-2. See this answer for a diff which can handle Unicode: Can I make git recognize a UTF-16 file as text?
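A minimal sketch of the textconv route, assuming the files really are UTF-16 with a byte-order mark and that iconv is on the PATH. The driver name utf16, the temporary repo and the sample file are made up for the demo. The stored file is left exactly as the IDE wrote it; only the diff output is converted.

```shell
#!/bin/sh
# Sketch: diff UTF-16 files as text via a textconv driver.
# "utf16" is an arbitrary driver name; sample.xrp is a hypothetical file.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Route *.xrp through the "utf16" diff driver.
printf '*.xrp diff=utf16\n' > .gitattributes

# git appends the path of the file to convert to this command.
git config diff.utf16.textconv 'iconv -f UTF-16 -t UTF-8'

# Stage a fake UTF-16LE file containing "<a>\n".
printf '\377\376<\000a\000>\000\n\000' > sample.xrp
git add .gitattributes sample.xrp

# Modify the working copy so that it contains "<b>\n" instead.
printf '\377\376<\000b\000>\000\n\000' > sample.xrp

# Without the driver git reports a binary difference;
# with it, a normal line-based diff is shown.
diff_out=$(git diff -- sample.xrp)
echo "$diff_out"
```

textconv only affects how diffs are displayed, so merges still see the original bytes; the filter or custom merge script ideas from the other answers would be needed for that part.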