Gettext: Is it a good idea for the message ID to be the English text?

Posted on 2024-07-07 04:29:43

We're getting ready to translate our PHP website into various languages, and the gettext support in PHP looks like the way to go.

All the tutorials I see recommend using the English text as the message ID, i.e.

gettext("Hi there!")

But is that really a good idea? Let's say someone in marketing wants to change the text to "Hi there, y'all!". Then don't you have to update all the language files because that string -- which is actually the message ID -- has changed?

Is it better to have some kind of generic ID, like "hello.message", and an English translations file?

Comments (12)

作业与我同在 2024-07-14 04:29:44

In a word: don't do this.

The same word or phrase in English can often have more than one meaning, and each meaning may need a different translation.

Define mnemonic IDs for your strings, and treat English as just another language.

I agree with the other posters that numeric IDs in code are a nightmare for readability.

Ex-localisation engineer
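
A minimal sketch of the mnemonic-ID idea, using plain PHP arrays instead of real gettext catalogs. The IDs `file.open` and `shop.open`, the catalog contents, and the `translate()` helper are all made up for illustration; the point is only that one English word ("Open") can need two different translations.

```php
<?php
// Hypothetical catalogs keyed by mnemonic ID; English is just one more entry.
$catalogs = [
    'en' => ['file.open' => 'Open', 'shop.open' => 'Open'],
    'de' => ['file.open' => 'Öffnen', 'shop.open' => 'Geöffnet'],
];

// Look up an ID in the given locale; return the ID itself if it is missing.
function translate(array $catalogs, string $locale, string $id): string
{
    return $catalogs[$locale][$id] ?? $id;
}

echo translate($catalogs, 'en', 'file.open'), "\n";
echo translate($catalogs, 'de', 'file.open'), "\n";
echo translate($catalogs, 'de', 'shop.open'), "\n";
```

With the English text as the key, both entries would collapse into the single key "Open" and the German catalog could hold only one of the two translations.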

喜你已久 2024-07-14 04:29:44

Haven't you already answered your own question? :)

Clearly, if you intend to support i18n of your application, you should treat all the language implementations the same. If someone decides a string needs to change, you make a similar change in all the language files. The metadata with the checkin should group all the language files together in the same change. If your "default" language is handled differently, that makes it more difficult to maintain.

别念他 2024-07-14 04:29:44

At the end of the day, a translator should be able to sit down and change the texts for every language (so that they match in meaning) without having to involve the programmer, who has already done their job.

This makes me feel that the proper answer is to use a modified version of gettext where you write strings like this:

_(id, backup_text, context)

_('ABOUT_ME', 'About Me', 'HOMEPAGE')

with the context being optional.

Why like this? Because you need to identify text in the system using unique IDs, not English text that could be repeated elsewhere.

You should also keep the backup text, ID and context in the same place in your code to reduce discrepancies.

The IDs also have to be readable, which brings in the problem of synonyms and duplicate use (even as IDs). We could prefix the IDs, like "HOMEPAGE_ABOUT_ME" or "MAIL_LETTER", but

  1. people forget to do this at the start, and changing it later is a problem;
  2. it's more flexible for the system to be able to group by both ID and context,

which is why I also added the context variable at the end.

The backup text can be pretty much anything; it could even be "[ABOUT_ME@HOMEPAGE text failed to load, please contact [email protected]]".

It won't work with current gettext editing programs such as "poedit", but I think you can define custom function names for translations, such as just "t()" without the underscore at the start.

I know that gettext also has support for contexts, but it's not very well documented or widely used.

P.S. I'm not sure about the best argument order to enforce good and extendable code, so suggestions are welcome.
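
A rough sketch of how the proposed call could behave, assuming an in-memory array as the catalog rather than real gettext files. The function is named `t()` here and takes the catalog as an extra first argument so the example is self-contained; the catalog layout and the German strings are hypothetical.

```php
<?php
// Hypothetical catalog for one locale, grouped by context and then by ID.
// The empty-string context holds entries that apply everywhere.
$catalog = [
    'HOMEPAGE' => ['ABOUT_ME' => 'Über mich'],
    ''         => ['ABOUT_ME' => 'Über mich (allgemein)'],
];

// t(id, backup_text, context): try the context-specific entry first,
// then the context-free entry, and finally the backup text from the call site.
function t(array $catalog, string $id, string $backup, string $context = ''): string
{
    return $catalog[$context][$id]
        ?? $catalog[''][$id]
        ?? $backup;
}

echo t($catalog, 'ABOUT_ME', 'About Me', 'HOMEPAGE'), "\n";
echo t($catalog, 'CONTACT', 'Contact', 'HOMEPAGE'), "\n";
```

Keeping the backup text at the call site means a missing entry still renders something readable, at the cost of the text living in two places once an English catalog exists.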

不必在意 2024-07-14 04:29:44

I'd go so far as to say that you never (for most values of never) want to use free text as a key to anything. Imagine, for instance, if SO used the question title as the key to this page. If someone links to it, and then the title is edited, the link is no longer valid.

Your problem is similar, except you would also be responsible for updating all links...

Like Douglas Leeder mentions, what you probably want to do is use English as the default (backup) language, although an interface that uses English and another language intermixed is highly confusing (but mildly amusing, too).

递刀给你 2024-07-14 04:29:44

In addition to the considerations above, there are many cases where you'd want the "key" (msgid) to be different from the source text (English). For example, in the HTML view, I might want to say [yyyy] where the destination and label of that anchor tag depend on the locale of the user. E.g. it might be a link to a social network, and in the US it would be Facebook, but in China it would be Weibo. So the MsgIds might be something like socialSiteUrl and socialSiteLabel.

I use a mix.

For basic strings that I don't think will have conflicts/changes/weird meanings, I'll make the key be the same as the English.
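
The locale-dependent link described above could be sketched like this. The keys `socialSiteUrl` and `socialSiteLabel` come from the answer; the array-based catalog, the concrete URLs, and the `social_link()` helper are assumptions for illustration only.

```php
<?php
// Hypothetical per-locale entries; URLs and labels are illustrative values.
$messages = [
    'en_US' => [
        'socialSiteUrl'   => 'https://www.facebook.com/',
        'socialSiteLabel' => 'Facebook',
    ],
    'zh_CN' => [
        'socialSiteUrl'   => 'https://weibo.com/',
        'socialSiteLabel' => 'Weibo',
    ],
];

// Build the anchor tag for a locale, escaping both values for HTML output.
function social_link(array $messages, string $locale): string
{
    $m = $messages[$locale];
    return sprintf(
        '<a href="%s">%s</a>',
        htmlspecialchars($m['socialSiteUrl'], ENT_QUOTES),
        htmlspecialchars($m['socialSiteLabel'], ENT_QUOTES)
    );
}

echo social_link($messages, 'zh_CN'), "\n";
```

Here no English sentence could serve as the key, because the "translation" is a different site rather than a rewording of the same text.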

南街女流氓 2024-07-14 04:29:44

We use Dutch. The strings should be written in the native language of the writer; this makes communication with translators less prone to errors, since the writer(s) can communicate with them in their native language.

相思故 2024-07-14 04:29:44

But what about when you have a large text, not just a title? The key could be too big.

小傻瓜 2024-07-14 04:29:43

Wow, I'm surprised that no one is advocating using the English as a key. I used this style in a couple of software projects, and IMHO it worked out pretty well. The code readability is great, and if you change an English string it becomes obvious that the message needs to be considered for re-translation (which is a good thing).

In the case that you're only correcting spelling or making some other change that definitely doesn't require translation, it's a simple matter to update the IDs for that string in the resource files.

That said, I'm currently evaluating whether or not to carry this way of doing I18N forward to a new project, so it's good to hear some thoughts on why it might not be a good idea.
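
A small sketch of the "just update the IDs" step for a spelling fix, assuming the catalogs are plain PHP arrays keyed by the English string. The `rename_key()` helper and the sample strings are hypothetical; with real PO files the same rename would be done in the files themselves or with the gettext tooling.

```php
<?php
// Hypothetical catalogs keyed by the (misspelled) English source string.
$catalogs = [
    'de' => ['Wellcome back!' => 'Willkommen zurück!'],
    'fr' => ['Wellcome back!' => 'Bon retour !'],
];

// Move every translation from the old key to the new key, keeping the text.
function rename_key(array $catalogs, string $old, string $new): array
{
    if ($old === $new) {
        return $catalogs;
    }
    foreach ($catalogs as $locale => $entries) {
        if (array_key_exists($old, $entries)) {
            $entries[$new] = $entries[$old];
            unset($entries[$old]);
            $catalogs[$locale] = $entries;
        }
    }
    return $catalogs;
}

$catalogs = rename_key($catalogs, 'Wellcome back!', 'Welcome back!');
```

The existing translations survive the rename untouched, which is only appropriate when the change does not alter the meaning.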

烂人 2024-07-14 04:29:43

I use meaningful IDs such as "welcome_back_1", which would be "welcome back, %1", etc. I always have English as my "base" language, so in the worst-case scenario, when a specific language doesn't have a message ID, I fall back on English.

I don't like to use actual English phrases as message IDs because if the English changes, so does the ID. This might not affect you much if you use some automated tools, but it bothers me. I don't like to use simple codes (like msg3975) because they don't mean anything, so reading the code is more difficult unless you litter comments everywhere.
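
A minimal sketch of this scheme, assuming array-based catalogs and a hypothetical `msg()` helper. The ID `welcome_back_1` and the `%1` placeholder style are taken from the answer; everything else is illustrative.

```php
<?php
// Hypothetical catalogs; English is the base language and is assumed complete.
$catalogs = [
    'en' => ['welcome_back_1' => 'Welcome back, %1', 'goodbye' => 'Goodbye'],
    'fr' => ['welcome_back_1' => 'Bon retour, %1'],
];

// Look up $id in $locale, fall back to English, then fill %1, %2, ... from $args.
function msg(array $catalogs, string $locale, string $id, string ...$args): string
{
    $text = $catalogs[$locale][$id] ?? $catalogs['en'][$id] ?? $id;
    $pairs = [];
    foreach ($args as $i => $value) {
        $pairs['%' . ($i + 1)] = $value;
    }
    return $pairs ? strtr($text, $pairs) : $text;
}

echo msg($catalogs, 'fr', 'welcome_back_1', 'Ana'), "\n";
echo msg($catalogs, 'fr', 'goodbye'), "\n";
```

The trailing number in the ID is a cheap reminder at the call site of how many arguments the message expects.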

筱武穆 2024-07-14 04:29:43

I strongly disagree with Richard Harrison's answer, in which he states it is "the only way". Dear asker, do not trust an answer that states it is the only way, because the "only way" doesn't exist.

Here is another way, which IMHO has a few advantages over Richard's approach:

  • Start by using a proto-version of the English string as the original.
  • Don't display these proto-strings, but create a translation file for English nonetheless.
  • Copy the proto-strings into the English translation to begin with.

Advantages:

  • readable code
  • text in your code is very close if not identical to what your view displays
  • if you want to change the English text, you don't change the proto-string but the translation
  • if you want to translate the same thing twice, just write a slightly different proto-string or just add 'version for this and that', and you still have perfectly readable code
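
A sketch of the proto-string approach with array-based catalogs. The `tr()` helper and the sample entries are hypothetical; the English catalog starts out as a copy of the proto-strings and is edited later without touching the code.

```php
<?php
// Hypothetical catalogs keyed by the proto-string that appears in the code.
// Marketing has since changed the English entry; the key stays the same.
$catalogs = [
    'en' => ['Hi there!' => "Hi there, y'all!"],
    'de' => ['Hi there!' => 'Hallo zusammen!'],
];

// Return the translation for the proto-string, or the proto-string itself
// if the locale has no entry for it.
function tr(array $catalogs, string $locale, string $proto): string
{
    return $catalogs[$locale][$proto] ?? $proto;
}

echo tr($catalogs, 'en', 'Hi there!'), "\n";
echo tr($catalogs, 'de', 'Hi there!'), "\n";
```

Because the proto-string never changes, the German entry stays attached to its key when the English wording is revised.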

伪装你 2024-07-14 04:29:43

The reason for the IDs being English is so that the ID is returned if the translation fails for whatever reason: the translation for the current language and token not being available, or other errors.
That of course assumes the developer is writing the original English text, not some documentation person.

Also, if the English text changes, then the other translations probably need to be updated too?

In practice we also use pure IDs rather than the English text, but it does mean we have to do lots of extra work to default to English.

你是我的挚爱i 2024-07-14 04:29:43

There is a lot to consider, and the answer is not so easy.

Using plain English

Pros

  • Easy to write and READ code
  • In most cases, it works even without running translation functions in code

Cons

  • The programmers involved must also be good copywriters :)
  • You need to write correct, precise texts fully in English, even if the first language you need to ship is something else (e.g. we start lots of projects in Czech and localize them to English later).
  • In a lot of cases, you need to use contexts. If you fail to do so from the beginning, it's a lot of work to add them later. To explain: in English, one word can have many different meanings, and you need to use contexts to differentiate them, which is not always easy ("order" can be a sort order or a purchase order).
  • It can be very hard to correct the English later in the process. Corrections to the source strings will very often lead to the loss of already translated phrases. It's very frustrating to lose translations in 3 different languages just because you corrected the English.
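
To make the context point concrete, here is a sketch that mimics how GNU gettext keys a message that has a context: the context and the msgid are joined with an EOT character (`\x04`). The array catalog, the context names `sorting` and `purchase`, and the `p_lookup()` helper are assumptions for illustration, not part of PHP's gettext extension.

```php
<?php
// Hypothetical German catalog. Each key is "context\x04msgid", imitating the
// way compiled gettext catalogs store messages that carry a msgctxt.
$catalog = [
    "sorting\x04Order"  => 'Reihenfolge',
    "purchase\x04Order" => 'Bestellung',
];

// pgettext-style lookup: context plus msgid form the key;
// fall back to the plain msgid when no entry exists.
function p_lookup(array $catalog, string $context, string $msgid): string
{
    return $catalog["{$context}\x04{$msgid}"] ?? $msgid;
}

echo p_lookup($catalog, 'sorting', 'Order'), "\n";
echo p_lookup($catalog, 'purchase', 'Order'), "\n";
```

Adding the context later means touching every call site that uses the ambiguous word, which is the extra work the list above warns about.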

Using keys

Pros

  • You can use localization platform functions even for the English language. For example, we're using the lovely Crowdin platform. There are a lot of handy tools, or rather a complete workflow, for translation management: voting on different translations, translation history, glossaries (which help keep the translation/language coherent), proofing, approval, etc. Using keys makes this process much smoother.

  • It's much easier to send English texts for proofreading etc. Usually, it's not a good idea to let copywriters modify your code directly :)

Cons

  • More complicated project setup.
  • Harder to use %d, %s etc.
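
On the placeholder point: with keys, the format specifiers are no longer visible at the call site, so it helps to use numbered placeholders and keep the argument order fixed. A minimal sketch, assuming array-based catalogs, a hypothetical key `inbox.summary`, and a hypothetical `format_msg()` helper built on PHP's `vsprintf()`:

```php
<?php
// Hypothetical keyed catalogs. Numbered placeholders (%1$s, %2$d) let each
// translation place the arguments in whatever order its grammar needs.
$catalogs = [
    'en' => ['inbox.summary' => '%1$s has %2$d new messages'],
    'de' => ['inbox.summary' => '%2$d neue Nachrichten für %1$s'],
];

// Look up the pattern (falling back to English, then to the key itself)
// and fill in the arguments by position.
function format_msg(array $catalogs, string $locale, string $id, ...$args): string
{
    $pattern = $catalogs[$locale][$id] ?? $catalogs['en'][$id] ?? $id;
    return vsprintf($pattern, $args);
}

echo format_msg($catalogs, 'en', 'inbox.summary', 'Ana', 3), "\n";
echo format_msg($catalogs, 'de', 'inbox.summary', 'Ana', 3), "\n";
```

The call site always passes the name first and the count second; only the catalog entries decide where each one appears.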