If your application is localized for pt-br and pt-pt, which language should you choose when the system reports only the pt code?

Posted on 2024-08-25 19:50:24

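What language should you choose if your application is localized for pt-br and pt-pt and the system reports only the pt code (generic Portuguese)?

This question is independent of the kind of application (desktop, mobile, or browser-based). Assume that you cannot get regional information from any other source and that you must pick one language as the default.

The question also applies to other cases, including:

pt-pt and pt-br
en-us and en-gb
fr-fr and fr-CA
zh-cn, zh-tw, .... - In fact, in this case I know that zh can be used as the primary language for Simplified Chinese, where the full code is zh-hans. For Traditional Chinese, with codes such as zh-tw, zh-hant-tw, zh-hk and zh-mo, the proper (canonical) code should be zh-hant.

Question 1: How do I determine the dominant language for a given meta-language?

I need a solution that covers at least Portuguese, English, and French.

Question 2: If the system reports Simplified Chinese (PRC) (zh-cn) as the user's preferred language and I only have English and Traditional Chinese (en, zh-tw), which of the two options should I choose: en or zh-tw?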

吐个泡泡 2024-09-01 19:50:24

In general, you should separate the "guess the missing parameters" problem from the "match the list of locales I want against the list of locales I have" problem. They are different problems.

Guessing the missing parts

These are all tricky areas, and even (potentially) politically charged.

But with very few exceptions, the rule is to select the country where the language originated.
The exceptions are mostly based on population.
So fr maps to fr-FR, es to es-ES, and so on.
Some exceptions: pt maps to pt-BR rather than pt-PT, and en maps to en-US rather than en-GB.

It is also commonly accepted (and required by the Chinese standards) that zh maps to zh-CN.

You might also have to look at the country to determine the script, or the other way around.
For instance, az => az-AZ, but az-Arab => az-Arab-IR and az_IR => az_Arab_IR.
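
The guessing rules above can be sketched as a small lookup table. This is only an illustration: the table holds just the examples mentioned in this answer, and the names LIKELY_SUBTAGS, normalize and guess_missing_parts are made up for the sketch. A real application would more likely rely on the CLDR likely-subtags data.

```python
# Hypothetical, partial table built only from the examples in this answer.
# Real data would come from the CLDR "likely subtags" list.
LIKELY_SUBTAGS = {
    "fr": "fr-FR",
    "es": "es-ES",
    "pt": "pt-BR",
    "en": "en-US",
    "zh": "zh-CN",
    "az": "az-AZ",
    "az-Arab": "az-Arab-IR",
    "az-IR": "az-Arab-IR",
}


def normalize(tag: str) -> str:
    """Use '-' separators and conventional casing (language, Script, REGION)."""
    parts = tag.replace("_", "-").split("-")
    out = [parts[0].lower()]
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            out.append(part.title())   # script subtag, e.g. Arab
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())   # region subtag, e.g. IR
        else:
            out.append(part)           # anything else is left untouched
    return "-".join(out)


def guess_missing_parts(tag: str) -> str:
    """Return the tag with guessed subtags, or the normalized tag if no rule applies."""
    tag = normalize(tag)
    return LIKELY_SUBTAGS.get(tag, tag)
```

With this table, guess_missing_parts("pt") gives pt-BR and guess_missing_parts("az_IR") gives az-Arab-IR, while a tag that is already specific, such as pt-PT, is returned unchanged.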

Matching 'want' vs. 'have'

This involves matching a list of wanted languages against a list of available ones.
Dealing with lists makes it harder. The result should also be sorted in a smart way, if possible. For instance, if want = [ fr ro ] and have = [ en fr_CA fr_FR ro_RO ], then you probably want [ fr_FR fr_CA ro_RO ] as the result.

There should be no match between languages with different scripts. So zh-TW should not fall back to zh-CN, and mn-Mong should not fall back to mn-Cyrl.
Tricky areas: in theory sr-Cyrl should not fall back to sr-Latn, but users might understand it anyway. ro-Cyrl might fall back to ro-Latn, but not the other way around.
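
A minimal sketch of such a matcher follows, under stated assumptions: DEFAULT_REGION and IMPLIED_SCRIPT are tiny hypothetical tables, the ranking (exact region, then no region, then default region, then anything else) is one reasonable choice rather than a standard, and the tricky sr and ro cases are not handled.

```python
# Hypothetical, partial tables for illustration only.
DEFAULT_REGION = {"fr": "FR", "ro": "RO", "pt": "BR", "en": "US", "zh": "CN"}
IMPLIED_SCRIPT = {"zh": "Hans", "zh-CN": "Hans", "zh-TW": "Hant"}


def parse(tag):
    """Split a tag into (language, script, region); missing parts are None."""
    parts = tag.replace("_", "-").split("-")
    language, script, region = parts[0].lower(), None, None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()
        elif len(part) == 2 and part.isalpha():
            region = part.upper()
    return language, script, region


def script_of(tag):
    """Explicit script if present, else the implied one, else None (unknown)."""
    language, script, region = parse(tag)
    if script:
        return script
    key = language if region is None else f"{language}-{region}"
    return IMPLIED_SCRIPT.get(key)


def match(want, have):
    """Return the entries of `have` acceptable for `want`, best matches first."""
    result = []
    for wanted in want:
        w_lang, _, w_region = parse(wanted)
        w_script = script_of(wanted)
        candidates = []
        for position, available in enumerate(have):
            h_lang, _, h_region = parse(available)
            if h_lang != w_lang or available in result:
                continue
            h_script = script_of(available)
            if w_script and h_script and w_script != h_script:
                continue  # never match across different scripts
            if w_region is not None and h_region == w_region:
                rank = 0  # exact region
            elif h_region is None:
                rank = 1  # generic entry
            elif h_region == DEFAULT_REGION.get(w_lang):
                rank = 2  # default region for the language
            else:
                rank = 3  # some other region
            candidates.append((rank, position, available))
        result.extend(tag for _, _, tag in sorted(candidates))
    return result
```

For the example above, match(["fr", "ro"], ["en", "fr_CA", "fr_FR", "ro_RO"]) returns ["fr_FR", "fr_CA", "ro_RO"]. For Question 2, match(["zh-CN"], ["en", "zh-TW"]) returns an empty list because the scripts differ, so under this sketch the application would use its default language (en) rather than zh-TW.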

Some references

RFC 4647 deals with language fallback (but is not very useful in this case, because it follows the "cut from the right" rule).
ICU 4.2 and newer (draft in 4.0, I think) has uloc_addLikelySubtags (and uloc_minimizeSubtags) in uloc.h. These implement http://www.unicode.org/reports/tr35/#Likely_Subtags
Also in ICU's uloc.h there are uloc_acceptLanguageFromHTTP and uloc_acceptLanguage, which deal with want vs. have. But they are of limited use as they stand, because they take a UEnumeration* as input and, at the time of writing, there is no public API to build a UEnumeration.
There is some work on language matching going beyond the simple RFC 4647. See http://cldr.unicode.org/development/design-proposals/languagedistance
Locale matching in ActionScript at http://code.google.com/p/as3localelib/
The APIs in the new Flash Player 10.1 flash.globalization namespace do both tag guessing and language matching (http://help.adobe.com/en_US/FlashPlatform/beta/reference/actionscript/3/flash/globalization/package-detail.html). They follow TR-35 and can look beyond the @ (the keyword part of the locale ID) and take the operation into account. For instance, if have = [ ja ja@collation=radical ja@calendar=japanese ] and want = [ ja@calendar=japanese;collation=radical ], then the best match depends on the operation you want. For date formatting, ja@calendar=japanese is the better match, but for collation you want ja@collation=radical.
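
To illustrate why the "cut from the right" rule of RFC 4647 mentioned in the references does not help with the original question, here is a simplified sketch of its lookup scheme. The function name rfc4647_lookup and the default parameter are made up for the example, and wildcard ranges are not handled.

```python
def rfc4647_lookup(want, have, default=None):
    """Simplified RFC 4647 lookup: truncate each wanted tag from the right."""
    # Compare case-insensitively, but return the tag as spelled in `have`.
    available = {tag.replace("_", "-").lower(): tag for tag in have}
    for wanted in want:
        subtags = wanted.replace("_", "-").lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in available:
                return available[candidate]
            subtags.pop()  # cut the rightmost subtag
            # A trailing single-character subtag is dropped as well.
            if subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default
```

Truncation only ever makes a tag shorter, so rfc4647_lookup(["zh-Hant-TW"], ["zh-Hant", "en"]) finds zh-Hant, but rfc4647_lookup(["pt"], ["pt-BR", "pt-PT"], default="en") falls through to the default: nothing in the scheme adds a region to a bare pt. That is the gap the likely-subtags approach is meant to fill.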