Is there a standard algorithm for locale resolution?
In support of software internationalization, many programming languages and platforms support a means of obtaining localized resources to be used in the UI that is shown to the user (e.g. Java's java.util.ResourceBundle class). Often, if resources for the user's preferred locale are not available, then there is a fallback mechanism, or locale resolution process, that will attempt to locate the nearest-matching resources from the sets of available resources. For example, if resources for en-US are not available, then commonly the system attempts to find resources for en.
The locale resolution process seems nearly the same for many languages' and platforms' resource bundle solutions. Are they following some standard locale resolution algorithm, or, if not, does such a standard exist?
Comments (3)
There is apparently RFC 4647, "Matching of Language Tags", which describes the syntax of "language ranges" for specifying the list of a user's preferred languages, as well as the "filtering" and "lookup" mechanisms for comparing and matching language ranges to RFC 4646 language tags. In RFC 4647's terms, filtering selects every language tag that matches the user's language priority list, while lookup selects the single tag that best matches it.
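For a concrete feel of the two mechanisms, Java 8 and later expose them through java.util.Locale as Locale.filter and Locale.lookup, with Locale.LanguageRange.parse turning a range string into a priority list. The following is a minimal sketch, not a reference implementation: the class name Rfc4647Demo, the helper names bestMatch and allMatches, and the set of available locales are assumptions made up for illustration.

```java
import java.util.List;
import java.util.Locale;

class Rfc4647Demo {

    /** Lookup: returns the single best match for the ranges, or null if nothing matches. */
    static Locale bestMatch(String ranges, List<Locale> available) {
        List<Locale.LanguageRange> priorityList = Locale.LanguageRange.parse(ranges);
        return Locale.lookup(priorityList, available);
    }

    /** Filtering: returns every available locale that the ranges match. */
    static List<Locale> allMatches(String ranges, List<Locale> available) {
        List<Locale.LanguageRange> priorityList = Locale.LanguageRange.parse(ranges);
        return Locale.filter(priorityList, available);
    }

    public static void main(String[] args) {
        // Hypothetical set of locales for which resources exist.
        List<Locale> available = List.of(
                Locale.forLanguageTag("en"),
                Locale.forLanguageTag("en-GB"),
                Locale.forLanguageTag("fr"));

        // Lookup truncates the range "en-US" to "en" when no exact match exists.
        System.out.println(bestMatch("en-US", available));

        // Filtering with the range "en" keeps "en" and everything beneath it.
        System.out.println(allMatches("en", available));
    }
}
```

Lookup is the mechanism that corresponds to the en-US to en fallback described in the question; filtering is more useful when several matching resources are wanted at once.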
I'm not aware of a standard per se.
However, the algorithm being used is a trivial consequence of the fact that locales are hierarchical. There is a (notional) root locale with no name. Beneath this are language-only locales (en, fr, etc). Beneath those are national locales (en_GB, en_US, etc). Beneath those are, optionally, variant locales (en_GB_Yorkshire, en_GB_cockney, etc - for realistic examples, look at Norway).
The natural way to find an appropriate resource is to start with the lowest, most specific, locale you can, and walk up the tree until you find something. So, starting with en_US_TX, you step up to en_US, then en, then the root.
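The walk up the tree can be sketched in a few lines of Java. This is an illustration under stated assumptions rather than how any particular platform implements it: locale names are treated as plain underscore-separated strings, the root locale is represented by the empty string, and the names LocaleFallback, candidates, and resolve are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class LocaleFallback {

    /** Lists a locale name followed by each of its ancestors, ending with the unnamed root (""). */
    static List<String> candidates(String locale) {
        List<String> chain = new ArrayList<>();
        String current = locale;
        while (!current.isEmpty()) {
            chain.add(current);
            // Drop the last, most specific component to step up one level.
            int cut = current.lastIndexOf('_');
            current = (cut < 0) ? "" : current.substring(0, cut);
        }
        chain.add(""); // the root locale
        return chain;
    }

    /** Returns the most specific candidate that has resources, if any. */
    static Optional<String> resolve(String requested, Set<String> available) {
        for (String candidate : candidates(requested)) {
            if (available.contains(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical set of locales for which resources exist.
        Set<String> available = Set.of("", "en", "fr");

        System.out.println(candidates("en_US_TX"));
        System.out.println(resolve("en_US_TX", available).orElse("none")); // prints "en"
    }
}
```

Because the root is always the last candidate, a resource set that includes root resources never fails to resolve; only a set without them can yield no match.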
The CLDR (Unicode Common Locale Data Repository) has a proposed (as of 2015) algorithm based on language distance. Without the distance data it is not yet a usable solution, but it is worth watching as a possible solution in the future.