Which languages does ICU collation support?
I was browsing through the ICU source code (http://icu-project.org/), and I couldn't find what languages it supports out of the box for collation. Could someone help me?
Edit: Note that this list was written a couple of years ago. Follow the links for updated lists. CLDR no longer advertises which sublocales are claimed to be supported implicitly.
colfiles.mk lists the tailorings and aliases.
Besides root (UCA), there are tailorings (COLLATION_SOURCE) for: af ar as az be bg bn bs ca cs cy da de el eo es et fa fa_AF fi fil fo fr gu ha haw he hi hr hu hy ig is ja kk kl km kn ko kok lt lv mk ml mr mt nb nn om or pa pl ps ro ru si sk sl sq sr sr_Latn sv ta te th to tr uk ur vi yo zh zh_Hant
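If you would rather pull this list out of the makefile yourself than read it by eye, something like the following might do. It is only a rough sketch: the sample text is a made-up excerpt, and the layout it assumes (a `NAME = a.txt b.txt\` assignment with backslash line continuations and `.txt` file names) should be checked against the colfiles.mk in your own ICU checkout. The helper name `read_make_variable` is hypothetical.

```python
import re

# Hypothetical excerpt written in the style of colfiles.mk.
# It is NOT copied from a real ICU release; the layout is an assumption.
SAMPLE_MK = """\
# generated list (sample)
COLLATION_SOURCE = af.txt ar.txt de.txt\\
 sr_Latn.txt zh.txt zh_Hant.txt
"""


def read_make_variable(text, name):
    """Return the values assigned to `name`, with a trailing .txt removed."""
    # Fold backslash line continuations into a single logical line.
    joined = text.replace("\\\n", " ")
    pattern = rf"^{re.escape(name)}[ \t]*=[ \t]*(.*)$"
    match = re.search(pattern, joined, flags=re.MULTILINE)
    if match is None:
        return []
    return [
        item[:-4] if item.endswith(".txt") else item
        for item in match.group(1).split()
    ]


if __name__ == "__main__":
    print(read_make_variable(SAMPLE_MK, "COLLATION_SOURCE"))
```

To try it on a real file, read the file's contents into a string and pass that in place of `SAMPLE_MK`.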
However, many languages (such as English and Italian) are not listed, because the root (UCA, fallback) behavior is already correct for them.
COLLATION_EMPTY_SOURCE has the list of additional locales which are considered to be valid: af_NA af_ZA ar_AE ar_BH ar_DZ ar_EG ar_IQ ar_JO ar_KW ar_LB ar_LY ar_MA ar_OM ar_QA ar_SA ar_SD ar_SY ar_TN ar_YE as_IN az_Latn az_Latn_AZ be_BY bg_BG bn_BD bn_IN bs_BA ca_ES chr chr_US cs_CZ cy_GB da_DK de_AT de_BE de_CH de_DE de_LI de_LU el_CY el_GR en en_AS en_AU en_BE en_BW en_BZ en_CA en_GB en_GU en_HK en_IE en_IN en_JM en_MH en_MP en_MT en_MU en_NA en_NZ en_PH en_PK en_SG en_TT en_UM en_US en_US_POSIX en_VI en_ZA en_ZW es_419 es_AR es_BO es_CL es_CO es_CR es_DO es_EC es_ES es_GQ es_GT es_HN es_MX es_NI es_PA es_PE es_PR es_PY es_SV es_US es_UY es_VE et_EE fa_IR fi_FI fil_PH fo_FO fr_BE fr_BF fr_BI fr_BJ fr_BL fr_CA fr_CD fr_CF fr_CG fr_CH fr_CI fr_CM fr_DJ fr_FR fr_GA fr_GN fr_GP fr_GQ fr_KM fr_LU fr_MC fr_MF fr_MG fr_ML fr_MQ fr_NE fr_RE fr_RW fr_SN fr_TD fr_TG ga ga_IE gu_IN ha_Latn ha_Latn_GH ha_Latn_NE ha_Latn_NG he_IL hi_IN hr_HR hu_HU hy_AM id id_ID ig_NG is_IS it it_CH it_IT ja_JP ka ka_GE kk_KZ kl_GL kn_IN ko_KR kok_IN lt_LT lv_LV mk_MK ml_IN mr_IN ms ms_BN ms_MY mt_MT nb_NO nl nl_BE nl_NL nn_NO om_ET om_KE or_IN pa_Arab pa_Arab_PK pa_Guru pa_Guru_IN pl_PL ps_AF pt pt_BR pt_PT ro_MD ro_RO ru_MD ru_RU ru_UA si_LK sk_SK sl_SI sq_AL sr_Cyrl sr_Cyrl_BA sr_Cyrl_ME sr_Cyrl_RS sr_Latn_BA sr_Latn_ME sr_Latn_RS sv_FI sv_SE sw sw_KE sw_TZ ta_IN ta_LK te_IN th_TH tr_TR uk_UA ur_IN ur_PK vi_VN yo_NG zh_Hans zh_Hans_CN zh_Hans_SG zh_Hant_HK zh_Hant_MO zh_Hant_TW zu zu_ZA
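To make the relationship between the two lists concrete, here is a small sketch of how a requested locale could end up at a tailoring or at root. It assumes the simplest possible fallback, dropping the last subtag each time until something in the COLLATION_SOURCE list matches. The function names are made up for illustration, and real ICU resolution also takes the aliases in colfiles.mk into account, which this ignores.

```python
# Tailored locales, copied from the COLLATION_SOURCE list quoted above.
TAILORED = set(
    "af ar as az be bg bn bs ca cs cy da de el eo es et fa fa_AF fi fil fo fr "
    "gu ha haw he hi hr hu hy ig is ja kk kl km kn ko kok lt lv mk ml mr mt nb "
    "nn om or pa pl ps ro ru si sk sl sq sr sr_Latn sv ta te th to tr uk ur vi "
    "yo zh zh_Hant".split()
)


def fallback_chain(locale_id):
    """Yield the locale, then each parent made by dropping the last subtag, then 'root'."""
    parts = locale_id.split("_")
    while parts:
        yield "_".join(parts)
        parts.pop()
    yield "root"


def nearest_tailoring(locale_id):
    """Return the first entry of the fallback chain that has its own tailoring."""
    for candidate in fallback_chain(locale_id):
        if candidate in TAILORED:
            return candidate
    return "root"  # nothing more specific exists, so plain UCA order applies


if __name__ == "__main__":
    for requested in ("de_AT", "en_US", "zh_Hant_TW", "sr_Cyrl_RS"):
        print(requested, "->", nearest_tailoring(requested))
```

Under these assumptions de_AT resolves to de, zh_Hant_TW to zh_Hant, sr_Cyrl_RS to sr, and en_US falls all the way through to root.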
Hope this helps.
All of this data comes from Unicode CLDR.