Locale-specific lookup table
I'm using a lookup table to optimize an algorithm that works on single characters. Currently I'm adding a..z, A..Z, 0..9 to the lookup table. This works fine in European countries, but in Asian countries it doesn't make much sense.
My idea was that I could perhaps use the characters in the Windows default code page as the alphabet for the lookup table.
Pseudocode:
  for Ch in DefaultCodePage.Characters do
    LookupTable.Add (Ch, ComputeValue (Ch));
What do you think and how could this be achieved? Any alternative suggestions?
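For a single-byte code page, the pseudocode above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the code page is hard-coded to cp1252 rather than queried from Windows, and compute_value is a hypothetical stand-in for the real per-character computation.

```python
def code_page_characters(encoding="cp1252"):
    """Return the characters that a single-byte code page can represent."""
    chars = []
    for byte in range(256):
        try:
            chars.append(bytes([byte]).decode(encoding))
        except UnicodeDecodeError:
            # This byte value is unassigned in the code page; skip it.
            continue
    return chars


def compute_value(ch):
    # Hypothetical placeholder for the expensive per-character computation.
    return ord(ch) * 2


# Precompute one table entry per character in the code page.
lookup_table = {ch: compute_value(ch) for ch in code_page_characters()}
```

With a multi-byte code page (as used for Chinese, Japanese, or Korean), decoding single bytes only finds the single-byte characters, so this approach would miss most of the script. That is the same limitation the question already suspects.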
Comments (1)
As you mentioned, this does not make much sense across different scripts. It may only make sense for alphabet-based languages.
By the way, A-Z is not enough for most European languages.
I don't quite know what you are doing or what you need this lookup table for, but it seems that what you are looking for are index characters. You can find this information in CLDR: look for indexCharacters. The resources for various languages are available here.
The only problem you'll face is that for some languages the index characters tend to be Latin-based, simply because these languages do not actually have any of their own. In that case you might want to use the so-called exemplar characters instead, but be warned that this might not be enough for some use cases.
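Since any fixed character set may turn out to be incomplete, one option is to seed the table from such a set and compute missing entries on first use. The sketch below assumes this design. CharLookup and GERMAN_SAMPLE are hypothetical names, and the sample string is typed by hand for illustration, not copied from CLDR data.

```python
class CharLookup:
    """Lookup table seeded with a character set, filled lazily on a miss."""

    def __init__(self, compute, seed_chars=""):
        self._compute = compute
        # Precompute values for the characters expected to be common.
        self._table = {ch: compute(ch) for ch in seed_chars}

    def __getitem__(self, ch):
        try:
            return self._table[ch]
        except KeyError:
            # Not in the seed set: compute once and cache for later lookups.
            value = self._table[ch] = self._compute(ch)
            return value


# Hand-typed sample set (an assumption for this example, not CLDR data).
GERMAN_SAMPLE = "abcdefghijklmnopqrstuvwxyzäöüß"

table = CharLookup(str.upper, GERMAN_SAMPLE)
upper_a_umlaut = table["ä"]  # served from the precomputed entries
upper_e_acute = table["é"]   # computed on first access, then cached
```

The seed set then only affects how much work is done up front, so an incomplete index or exemplar list costs some speed on the first access instead of producing wrong results.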