Gettext .po files under version control
Currently using Gettext on a project and the .po files are nicely kept
under version control.
PO files of course contain translations, but in addition to that they
also contain some metadata - information about the exact files and
line numbers where the translatable strings are located.
The problem is that each time you update the PO files the metadata
changes a whole lot more than the actual translations. This makes it
really hard to later see from version control diff what actually was
changed - you just see a myriad of changes to file names and line
numbers. Like that:
- #: somefile.js:43
- #: somefile.js:45
- #: somefile.js:118
+ #: somefile.js:203
+ #: somefile.js:215
msgid "Translate me please"
msgstr "Tõlgi mind palun"
- #: somefile.js:23
- #: somefile.js:135
+ #: otherfile.js:23
+ #: otherfile.js:135
msgid "Note"
msgstr "Märkus"
- #: andThatFile.js:18
#: orThisFile.js:131
- msgid "Before I was like this"
- msgstr "Selline olin ma enne"
+ msgid "I happen to be changed"
+ msgstr "Paistab, et mind muudeti"
Of course, a simple fix would be to just disable the generation of
filename/linenumber comments in xgettext output. But I actually find
those file names to be quite useful hints when translating.
I surely cannot be the only one who doesn't like the diffs of his PO files.
Suggestions?
Comments (4)
A simple fix would be to apply a grep filter that removes the comment metadata from the diff you view. You can either do this to the output of the version control diff utility, or you may be able to instruct the diff utility to ignore these lines before it makes the comparison, which will likely give more reliable and prettier output.

I don't know what version control system you use, but git (for example) allows you to preprocess the input to diff and remove the comment lines for certain file types (thanks VonC); see man gitattributes and search for "Performing text diffs of binary files". The idea is to save a small script as /usr/local/bin/strippocomments that prints a PO file without those lines, then tell git to use it to preprocess po files by adding an entry to .git/info/attributes and another to .git/config in your repository. git diff should then not include any lines starting with #:.

Note that the diffs generated from git diff using this approach should not be used for patching, but git format-patch will still use the default diff, so patches generated for emailing will still be OK.
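The snippets that accompanied this answer did not survive on this page, so the following is only a minimal sketch of the two ideas described above. The function names and the driver name podiff are assumptions made for illustration; only the script path /usr/local/bin/strippocomments and the #: prefix come from the answer itself.

```shell
#!/bin/sh
# Sketch only: function names and the "podiff" driver name are hypothetical.

# Option 1: filter the *output* of a diff, dropping added/removed "#:" lines.
# Example: git diff -- '*.po' | filter_diff_output
filter_diff_output() {
    grep -v '^[+-]#:'
}

# Option 2: possible body of a textconv filter such as
# /usr/local/bin/strippocomments. Prints the PO file named by $1
# without its "#:" location comments.
strip_po_locations() {
    grep -v '^#:' "$1"
}

# Wiring for option 2, run once inside the repository:
#   echo '*.po diff=podiff' >> .git/info/attributes
#   git config diff.podiff.textconv /usr/local/bin/strippocomments
```

Option 1 only hides lines after the comparison has been made, so hunk headers and context can still look noisy. Option 2 removes the lines before git compares anything, which is why the answer expects it to be the more reliable of the two.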
The gitattributes/textconv approach is the right way to go. I'd like to offer a better tool for the preprocessing step, set up through .gitattributes and .gitconfig.

msgcat from the gettext package is a useful tool there. It has a number of options you can play with. The option --no-location is especially what you want, since it filters out the line number differences. The other options might be useful if xgettext and/or msgmerge and/or your editor keep reformatting the strings in annoying ways. (In that case, it would also be good to pass those same options to those tools, and reconfigure your editor.)
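The exact entries from this answer were lost on this page, so here is a hedged sketch of what such a setup could look like. The helper name configure_msgcat_diff and the driver name msgcat are assumptions; msgcat and its --no-location option are the parts taken from the answer. The sketch writes to the repository's own config instead of ~/.gitconfig purely so it stays self-contained.

```shell
#!/bin/sh
# Sketch only: configure_msgcat_diff and the driver name "msgcat" are hypothetical.
# Usage: configure_msgcat_diff /path/to/repo
configure_msgcat_diff() {
    repo=$1
    # Route *.po files through the "msgcat" diff driver.
    echo '*.po diff=msgcat' >> "$repo/.gitattributes"
    # Have git diff the output of "msgcat --no-location FILE" instead of the raw file.
    git -C "$repo" config diff.msgcat.textconv 'msgcat --no-location'
}
```

Because git appends the file path to the textconv command, msgcat reads the PO file and writes a normalized copy without location comments to standard output, and that copy is what gets compared.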
The GNU gettext package has numerous useful utilities for performing various tasks with PO files. There is msgcmp to compare two PO files, msgcomm to select common/unique messages, and msgattrib to select/filter/transform existing PO files. Depending on what you actually need from a diff of PO files, I think you need to use either msgattrib or msgcomm.

If you just need to compare two PO files without the file/line comments, then a simple script that greps them out of your old and new PO files and saves the results in a temp dir would be sufficient.
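The grep-and-temp-dir idea from the last paragraph could be sketched roughly as follows. The function name po_diff is hypothetical, and the sketch assumes mktemp -d is available, which is common but not guaranteed by POSIX.

```shell
#!/bin/sh
# Sketch only: po_diff is a hypothetical helper name.
# Usage: po_diff old.po new.po
po_diff() {
    tmpdir=$(mktemp -d) || return 2
    # Save copies of both files with the "#:" location comments removed.
    grep -v '^#:' "$1" > "$tmpdir/old.po"
    grep -v '^#:' "$2" > "$tmpdir/new.po"
    # Compare the stripped copies; remember diff's exit status (0 same, 1 differ).
    status=0
    diff -u "$tmpdir/old.po" "$tmpdir/new.po" || status=$?
    rm -rf "$tmpdir"
    return "$status"
}
```

Two files that differ only in their location comments produce no output and exit status 0, so the remaining hunks are the translation changes themselves.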
You could look at the different options offered by a custom diff in a .gitattributes file, like specifying a special diff for po files, with mypodiff being a script that calls any diff tool able to filter out the lines that you want.
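As one possible reading of this suggestion, the sketch below uses GNU diff's -I option as the filtering diff tool. That choice is an assumption, as is everything about the script body; only the name mypodiff comes from the answer. git calls an external diff driver with seven arguments (path, old-file, old-hex, old-mode, new-file, new-hex, new-mode), and the sketch uses the second and fifth.

```shell
#!/bin/sh
# Sketch only: assumes GNU diff, whose -I REGEX option ignores changes
# in which every changed line matches REGEX.
# git passes: path old-file old-hex old-mode new-file new-hex new-mode
mypodiff() {
    status=0
    diff -u -I '^#:' "$2" "$5" || status=$?
    # diff exits 1 when the files differ; report only real errors (2 or more)
    # so that git does not treat an ordinary difference as a failed driver.
    [ "$status" -le 1 ]
}

# Wiring, run once inside the repository (script path is hypothetical):
#   echo '*.po diff=mypodiff' >> .gitattributes
#   git config diff.mypodiff.command /path/to/mypodiff
```

Unlike a textconv filter, an external diff command replaces git's own diff output for the matching files, so the result depends entirely on what the script prints.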