Which ISO format should I use to store a user's language code?
Should I use ISO 639-1 (2-letter codes) or ISO 639-2 (3-letter codes) to store a user's language code? Both are official standards, but which is the de facto standard in the development community? I think ISO 639-1 would be easier to remember, and is probably more popular for that reason, but that's just a guess.
The site I'm building will have separate sites for the US, Brazil, Russia, China, and the UK.
Comments (5)
You should use IETF language tags because they are already used in HTTP, HTML, XML and many other technologies. They are based on several standards, including the ISO 639 collection (yes, language, region and culture selection are not so simple to define).
I wrote a more detailed article on proper language code selection and usage. The idea is to use the simplest, shortest ISO 639-1 codes and to specify more only for special cases. The article lists codes for the ~30 most used languages, with the reasons why I consider one alternative better than another.
In case you want to skip reading the entire article, here is a short list of language codes (not to be confused with country codes):
ar, cs, da, de, el, en, en-gb, es, fr, fi, he, hu, it, ja, ko, nb, nl, pl, pt, pt-pt, ro, ru, sv, tr, uk, zh, zh-hant
The following points may not be obvious but should be borne in mind:
- en is used for en-us (American English); British English uses en-gb
- pt is used for pt-br, not pt-pt, which has far fewer speakers
- zh is used instead of zh-hans, zh-CN, ...
- zh-hant (Traditional Chinese) is used instead of more specific codes like zh-hant-TW or zh-TW
You can find more explanations inside the article.
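The convention above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names SUPPORTED, ALIASES and normalize_tag are hypothetical, the alias table covers just the cases mentioned in the list, and falling back to "en" for unknown tags is an arbitrary choice.

```python
# Short list of language codes taken from the answer above.
SUPPORTED = {
    "ar", "cs", "da", "de", "el", "en", "en-gb", "es", "fr", "fi", "he",
    "hu", "it", "ja", "ko", "nb", "nl", "pl", "pt", "pt-pt", "ro", "ru",
    "sv", "tr", "uk", "zh", "zh-hant",
}

# Hypothetical alias table: more specific tags that collapse to a short code.
ALIASES = {
    "en-us": "en",
    "pt-br": "pt",
    "zh-hans": "zh",
    "zh-cn": "zh",
    "zh-tw": "zh-hant",
    "zh-hant-tw": "zh-hant",
}


def normalize_tag(tag, default="en"):
    """Reduce a language tag to the shortest supported code (assumed policy)."""
    key = tag.strip().replace("_", "-").lower()
    key = ALIASES.get(key, key)
    # Drop trailing subtags one at a time until a supported code is found.
    while key:
        if key in SUPPORTED:
            return key
        if "-" not in key:
            break
        key = key.rsplit("-", 1)[0]
    return default


print(normalize_tag("en-US"))   # American English collapses to the bare code
print(normalize_tag("pt-PT"))   # European Portuguese keeps its region
print(normalize_tag("zh-TW"))   # Taiwan maps to Traditional Chinese
```

Storing only the normalized value keeps the database column small while still distinguishing the few regional variants the list treats as separate.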
I would go with a derivative of ISO 639. Specifically I like to use this: http://en.wikipedia.org/wiki/IETF_language_tag
I'm no expert, but every site I've ever seen uses ISO 639-1, including the current site I'm working on.
It works for us!
I've only ever seen 2-character language codes in use - so I'd recommend going with them unless your work involves delving into linguistics in some way. If all you're doing is customizing the browsing experience for the world at large, you won't need the extra repertoire offered by 3-character codes.
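If a 3-character code is ever needed later, a stored 2-character code can be converted with a lookup table. The sketch below covers only the languages of the five sites in the question; the names ISO_639_1_TO_639_2 and to_alpha3 are hypothetical, and a real table would be loaded from the published ISO 639 code list.

```python
# Partial mapping, limited to the languages mentioned in the question.
# Chinese also has the bibliographic ISO 639-2 code "chi"; "zho" is the
# terminological one.
ISO_639_1_TO_639_2 = {
    "en": "eng",
    "pt": "por",
    "ru": "rus",
    "zh": "zho",
}


def to_alpha3(code):
    """Return the ISO 639-2 code for a stored tag such as 'pt' or 'zh-hant'."""
    primary = code.lower().split("-", 1)[0]
    return ISO_639_1_TO_639_2[primary]  # raises KeyError for unlisted codes


print(to_alpha3("pt"))
print(to_alpha3("zh-hant"))
```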
ISO 639-1 Alpha-2 codes are used pretty much universally.
They are used for example in HTTP content negotiation. If you ever wondered how an international website can automatically show you their homepage in your native language, that's how it works. (Although it's sometimes kinda annoying. I, for example, often get shown the default Apache homepage in German, because the webmaster turned on content negotiation, but only put content for English in.)
Most web browsers use them directly in their settings dialog box.
Most operating systems use them in their settings dialog boxes or configuration files.
Wikipedia uses them in their server names for the different language versions.
In other words: if your users aren't native English speakers, they will probably already have encountered them when configuring their software, because otherwise they wouldn't be able to use their computers.
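As a rough illustration of the content negotiation mentioned above, here is a minimal Python sketch that reads an Accept-Language header and picks one of the languages a site offers. It is a simplification, not what Apache or any particular server does: the function names are hypothetical, wildcard ranges are ignored, and the default language is an assumption.

```python
def parse_accept_language(header):
    """Return (tag, quality) pairs from the header, highest quality first."""
    ranges = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a range without a q parameter has quality 1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q value as "not acceptable"
        ranges.append((tag.strip().lower(), quality))
    # The sort is stable, so equal qualities keep their header order.
    ranges.sort(key=lambda item: item[1], reverse=True)
    return ranges


def choose_language(header, available, default="en"):
    """Pick the first acceptable language that the site actually offers."""
    for tag, quality in parse_accept_language(header):
        if quality <= 0 or tag == "*":
            continue
        if tag in available:
            return tag
        primary = tag.split("-", 1)[0]  # e.g. "pt-br" falls back to "pt"
        if primary in available:
            return primary
    return default


site_languages = {"en", "pt", "ru", "zh"}
print(choose_language("pt-BR,pt;q=0.9,en;q=0.8", site_languages))
print(choose_language("de-DE,de;q=0.9,en;q=0.8", site_languages))
```

The second call shows the sensible outcome for a German-speaking visitor on a site without German content: the negotiation falls through to English, which the visitor also listed.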
The other members of the ISO 639 family are mostly of interest to linguists. Unless you expect Jesus Christ himself (ISO 639-2 Alpha-3 code arc, Aramaic) to visit your website, or maybe Klingons (tlh), ISO 639-1 has more languages than you can ever hope to support.