Examples of Unicode string comparison versus the .NET framework defaults
I'm looking for some examples of how and when smart people do a Unicode comparison of strings, versus the framework default.
Since many people don't work with strings from other cultures, here are a few interesting comparison examples I found.
- .ToUpper()
- The lowercase Turkish 'i' converts to an uppercase İ (U+0130)
- Equals
- The uppercase version of the Turkish example above
- Comparing equality for the last word in a sentence
- Hebrew treats the last letter in a sentence differently than how it would be represented in Arabic
- ... other examples ...
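The Turkish 'i' case from the list above can be reproduced in a few lines. This is a minimal sketch, assuming a runtime with full culture data (not invariant-globalization mode); the variable names are illustrative only.

```csharp
using System;
using System.Globalization;

// Assumption: culture data for tr-TR is available on this runtime.
CultureInfo turkish = CultureInfo.GetCultureInfo("tr-TR");

string lower = "i";

// Culture-sensitive uppercasing: Turkish maps 'i' to dotted capital I (U+0130).
string upperTurkish = lower.ToUpper(turkish);

// Culture-insensitive uppercasing: always the ASCII 'I' (U+0049).
string upperInvariant = lower.ToUpperInvariant();

Console.WriteLine($"tr-TR:     U+{(int)upperTurkish[0]:X4}");
Console.WriteLine($"invariant: U+{(int)upperInvariant[0]:X4}");

// Equals: the same pair of strings under two sets of rules.
// Ordinal ignore-case uses invariant casing, so "FILE" matches "file".
bool ordinalIgnoreCase =
    string.Equals("FILE", "file", StringComparison.OrdinalIgnoreCase);

// Under Turkish rules 'I' and 'i' are different letters, so this
// comparison is expected to report a mismatch.
bool turkishIgnoreCase =
    string.Compare("FILE", "file", turkish, CompareOptions.IgnoreCase) == 0;

Console.WriteLine($"OrdinalIgnoreCase: {ordinalIgnoreCase}");
Console.WriteLine($"tr-TR IgnoreCase:  {turkishIgnoreCase}");
```

Passing the culture explicitly, rather than changing the thread's current culture, keeps the example independent of the machine's regional settings.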
Question
What comparisons are common in the Unicode world? (feel free to expand on the language examples)
In what situations should I use (or not use) culture-insensitive comparisons? This seems to boil down to whether the operation is "linguistic" or "non-linguistic" (binary).
- How does this relate to security and checking usernames/passwords?
- How and when does one choose between linguistic operations and non-linguistic operations?
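To make the linguistic versus non-linguistic split concrete, here is a small sketch. It assumes a runtime with culture data available; `IsSameUserName` is a hypothetical helper written for this illustration, not a framework API.

```csharp
using System;
using System.Collections.Generic;

// Two spellings of the same word: precomposed U+00E9 versus
// 'e' followed by the combining acute accent U+0301.
string precomposed = "caf\u00E9";
string decomposed = "cafe\u0301";

// Non-linguistic (ordinal): compares UTF-16 code units, so these differ.
bool ordinalEqual =
    string.Equals(precomposed, decomposed, StringComparison.Ordinal);

// Linguistic: culture-aware rules treat canonically equivalent sequences
// as equal (assumes the runtime is not in invariant-globalization mode).
bool linguisticEqual =
    string.Equals(precomposed, decomposed, StringComparison.InvariantCulture);

Console.WriteLine($"Ordinal:          {ordinalEqual}");
Console.WriteLine($"InvariantCulture: {linguisticEqual}");

// Hypothetical helper: identifiers such as usernames get a non-linguistic
// comparison so the result does not depend on the server's culture.
static bool IsSameUserName(string a, string b) =>
    string.Equals(a, b, StringComparison.OrdinalIgnoreCase);

Console.WriteLine($"Same user: {IsSameUserName("Alice", "ALICE")}");

// Text shown to a person is sorted with that person's culture instead;
// the resulting order depends on the current culture.
var names = new List<string> { "zebra", "\u00C4rger", "apple" };
names.Sort(StringComparer.CurrentCulture);
Console.WriteLine(string.Join(", ", names));
```

The rule of thumb this illustrates: use `StringComparison.Ordinal` or `OrdinalIgnoreCase` for anything a machine interprets (usernames, keys, paths), and a culture-sensitive comparison only for text that is sorted or matched on behalf of a human reader.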
I'm particularly interested in how this would affect Chinese and other Eastern languages.
References
While researching this question, I came across these sites:
Joel on Software: What every developer should know about Unicode
Comments (1)
Best practices doc: http://msdn.microsoft.com/en-us/library/dd465121.aspx