If you have an application localized in pt-br and pt-pt, which language should you choose if the system reports only the pt code (generic Portuguese)?
This question is independent of the nature of the application (desktop, mobile, or browser-based). Let's assume you cannot get region information from another source and have to choose one language as the default.
The question also applies to other cases, including:
pt-pt and pt-br
en-us and en-gb
fr-fr and fr-CA
zh-cn, zh-tw, ... In fact, in this case I know that zh can be used as the predominant language for Simplified Chinese, where the full code is zh-hans. For Traditional Chinese, with codes like zh-tw, zh-hant-tw, zh-hk and zh-mo, the proper (canonical) code should be zh-hant.
Q1: How do I determine the predominant language for a specified meta-language?
I need a solution that covers at least Portuguese, English and French.
Q2: If the system reports Simplified Chinese (PRC) (zh-cn) as the user's preferred language and I have translations only for English and Traditional Chinese (en, zh-tw), which of the two options should I choose: en or zh-tw?
In general you should separate the "guess the missing parameters" problem from the "matching a list of locales I want vs. a list of locales I have" problem. They are different.
Guessing the missing parts
These are all tricky areas, and even (potentially) politically charged.
But with very few exceptions the rule is to select the "original country" of the language.
The exceptions are mostly based on population.
So fr-FR for fr, es-ES for es, etc.
Some exceptions: pt-BR instead of pt-PT, en-US instead of en-GB.
It is also commonly accepted (and required by the Chinese standards) that zh maps to zh-CN.
You might also have to look at the country to determine the script, or the other way around.
For instance az => az-AZ, but az-Arab => az-Arab-IR, and az-IR => az-Arab-IR.
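The guessing step can be sketched as a table lookup. The Python below is a minimal illustration, not a definitive implementation: the LIKELY_SUBTAGS table is a small hand-copied excerpt in the style of the CLDR likely-subtags data, and the helper names parse_tag and add_likely_subtags are made up for this example. A real application would load the full CLDR data or call ICU instead.

```python
# Illustrative excerpt only (assumption): a real application would use the
# full CLDR likely-subtags data rather than this hand-written table.
LIKELY_SUBTAGS = {
    "pt": "pt-Latn-BR",
    "en": "en-Latn-US",
    "fr": "fr-Latn-FR",
    "es": "es-Latn-ES",
    "zh": "zh-Hans-CN",
    "zh-Hant": "zh-Hant-TW",
    "zh-TW": "zh-Hant-TW",
    "zh-HK": "zh-Hant-HK",
    "zh-MO": "zh-Hant-MO",
    "az": "az-Latn-AZ",
    "az-Arab": "az-Arab-IR",
    "az-IR": "az-Arab-IR",
}


def parse_tag(tag):
    """Split a tag such as 'az_IR' or 'zh-hant-tw' into (language, script, region).

    Missing parts are returned as None; variants and extensions are ignored.
    """
    parts = tag.replace("_", "-").split("-")
    language, script, region = parts[0].lower(), None, None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()      # four letters: script, e.g. 'Hant'
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()      # two letters or three digits: region
    return language, script, region


def add_likely_subtags(tag):
    """Fill in the missing script and/or region, keeping whatever the tag already has."""
    language, script, region = parse_tag(tag)
    # Try the most specific key first, then progressively more general ones.
    candidates = [
        (language, script, region),
        (language, None, region),
        (language, script, None),
        (language, None, None),
    ]
    for candidate in candidates:
        key = "-".join(p for p in candidate if p)
        if key in LIKELY_SUBTAGS:
            _, likely_script, likely_region = parse_tag(LIKELY_SUBTAGS[key])
            return "-".join([language, script or likely_script, region or likely_region])
    # Unknown language: return the normalized tag unchanged.
    return "-".join(p for p in (language, script, region) if p)
```

With this table, add_likely_subtags("pt") gives "pt-Latn-BR", while add_likely_subtags("pt-PT") keeps the explicit region and gives "pt-Latn-PT".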
Matching 'want' vs. 'have'
This involves matching a list of want vs. a list of have languages.
Dealing with lists makes it harder. The result should also be sorted in a smart way, if possible. For instance, if want = [ fr ro ] and have = [ en fr_CA fr_FR ro_RO ], then you probably want [ fr_FR fr_CA ro_RO ] as the result.
There should be no match between languages with different scripts. So zh-TW should not fall back to zh-CN, and mn-Mong should not fall back to mn-Cyrl.
Tricky areas: sr-Cyrl should not fall back to sr-Latn in theory, but it might be understood by users. ro-Cyrl might fall back to ro-Latn, but not the other way around.
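The matching rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tags are limited to the forms "language" or "language-region", the LIKELY table is a tiny hand-written excerpt rather than real CLDR data, and the names maximize and match are invented for this example. It does not model the tricky cross-script cases (sr, ro) mentioned above.

```python
# Illustrative excerpt (assumption): likely (script, region) per language,
# with region-specific overrides where the script differs.
LIKELY = {
    "en": ("Latn", "US"),
    "fr": ("Latn", "FR"),
    "ro": ("Latn", "RO"),
    "pt": ("Latn", "BR"),
    "zh": ("Hans", "CN"),
    "zh-TW": ("Hant", "TW"),
    "zh-HK": ("Hant", "HK"),
    "zh-MO": ("Hant", "MO"),
}


def maximize(tag):
    """Expand 'fr', 'fr_CA' or 'zh-TW' into a (language, script, region) tuple."""
    language, _, region = tag.replace("_", "-").partition("-")
    language, region = language.lower(), region.upper()
    # Prefer a language-region entry; otherwise use the language default.
    # 'Zzzz' / 'ZZ' are the standard codes for unknown script / unknown region.
    script, likely_region = (
        LIKELY.get(f"{language}-{region}") or LIKELY.get(language, ("Zzzz", "ZZ"))
    )
    return language, script, region or likely_region


def match(want, have):
    """Return the entries of 'have' that satisfy 'want', best matches first."""
    result = []
    for wanted in want:
        w_language, w_script, w_region = maximize(wanted)
        # Language AND script must agree, so zh-TW never serves zh-CN.
        candidates = [h for h in have if maximize(h)[:2] == (w_language, w_script)]
        # Exact-region matches first; the stable sort keeps 'have' order otherwise.
        candidates.sort(key=lambda h: maximize(h)[2] != w_region)
        result.extend(h for h in candidates if h not in result)
    return result
```

For want = [ fr ro ] and have = [ en fr_CA fr_FR ro_RO ] this returns [ fr_FR fr_CA ro_RO ]. For want = [ zh-CN ] and have = [ en zh-TW ] it returns an empty list, so under this rule the caller would fall back to its default language rather than to zh-TW.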
Some references
ICU has uloc_addLikelySubtags (and uloc_minimizeSubtags) in uloc.h. That implements http://www.unicode.org/reports/tr35/#Likely_Subtags
Also in uloc.h, there are uloc_acceptLanguageFromHTTP and uloc_acceptLanguage, which deal with want vs. have. But they are kind of useless as they are, because they take a UEnumeration* as input, and there is no public API to build a UEnumeration.
The APIs in the flash.globalization namespace do both tag guessing and language matching (http://help.adobe.com/en_US/FlashPlatform/beta/reference/actionscript/3/flash/globalization/package-detail.html). They work on TR-35 and can look beyond the @ and consider the operation. For instance, if have = [ ja ja@collation=radical ja@calendar=japanese ] and want = [ ja@calendar=japanese;collation=radical ], then the best match depends on the operation you want. For date formatting, ja@calendar=japanese is the better match, but for collation you want ja@collation=radical.
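The operation-dependent choice in the ja example can be illustrated with a short Python sketch. It is not the flash.globalization or ICU implementation; the function names parse_locale and best_match and the idea of passing the relevant keyword name ("calendar", "collation") are assumptions made for this example.

```python
def parse_locale(locale):
    """Split 'ja@calendar=japanese;collation=radical' into ('ja', {keyword: value})."""
    base, _, keyword_part = locale.partition("@")
    keywords = dict(item.split("=", 1) for item in keyword_part.split(";") if item)
    return base, keywords


def best_match(want, have, operation_keyword):
    """Pick the entry of 'have' that best serves one operation.

    operation_keyword names the keyword that matters for the operation,
    e.g. 'calendar' for date formatting or 'collation' for sorting.
    """
    wanted_base, wanted_keywords = parse_locale(want)
    wanted_value = wanted_keywords.get(operation_keyword)
    fallback = None
    for candidate in have:
        base, keywords = parse_locale(candidate)
        if base != wanted_base:
            continue
        # An entry carrying the wanted value for this operation wins outright.
        if wanted_value is not None and keywords.get(operation_keyword) == wanted_value:
            return candidate
        # Otherwise remember the first keyword-free entry as a fallback.
        if not keywords and fallback is None:
            fallback = candidate
    return fallback
```

With the lists from the example, best_match(want, have, "calendar") selects ja@calendar=japanese and best_match(want, have, "collation") selects ja@collation=radical; for a keyword that want does not mention, the plain ja entry is returned.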
Do you expect to have more users in Portugal or in Brazil? Pick accordingly.
For a general solution, you can find the answer by reading up on Ethnologue.