How do I count the columns required for a mixed Japanese/English string?
My string contains a mix of Japanese (double-width) and English (single-width) characters:
string str = "女性love";
In C#, my method has to count Japanese characters as two columns and English characters as one.
So the above string should give me 8 columns:
2 + 2 + 1 + 1 + 1 + 1 = 8
You probably want something like this. It is very rough, but with a little work you can make it much nicer:
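The answer's original snippet is not shown, so the following is only a minimal sketch of the approach it describes, not the author's code. It assumes that every ASCII character takes one column and every other character takes two; the class name `ColumnCounter` and the method `CountColumns` are hypothetical.

```csharp
using System;

public static class ColumnCounter
{
    // Hypothetical helper: ASCII characters count as one column,
    // every other UTF-16 code unit counts as two.
    public static int CountColumns(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        int columns = 0;
        foreach (char c in text)
        {
            // Code units up to 0x7F are ASCII and treated as single width.
            // Note: a surrogate pair is two code units, so it would add 4 here.
            columns += c <= 0x7F ? 1 : 2;
        }
        return columns;
    }
}
```

Calling `ColumnCounter.CountColumns("女性love")` adds 2 + 2 for the two kanji and 1 for each of the four letters.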
As for why
System.Text.Encoding.UTF8.GetBytes(str).Length
returns 10: this simply follows from the UTF-8 encoding specification. Each of the two kanji is encoded in 3 bytes and each ASCII letter in 1 byte, so 3 + 3 + 4 = 10. Follow this link, Joel on Unicode, and read the entire article. The most important point for this question: check the code points of your Japanese characters and you will see why it returns 10.
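To see where the 10 comes from, the byte count can be broken down per character. This is a small sketch, assuming the string contains only characters from the Basic Multilingual Plane (no surrogate pairs); `Utf8Breakdown` and `BytesPerChar` are hypothetical names.

```csharp
using System;
using System.Text;

public static class Utf8Breakdown
{
    // Hypothetical helper: UTF-8 byte count of each char in the string.
    // Assumes no surrogate pairs; a lone surrogate would be encoded
    // as a replacement character rather than as part of its pair.
    public static int[] BytesPerChar(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        var counts = new int[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            counts[i] = Encoding.UTF8.GetByteCount(text[i].ToString());
        }
        return counts;
    }
}
```

For "女性love" the code points U+5973 and U+6027 fall in the range U+0800 to U+FFFF, which UTF-8 encodes in 3 bytes, while the four ASCII letters take 1 byte each. The byte count therefore reflects the encoding, not the display width.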
EDIT
Note that this code actually separates English letters from all "others", not only from Japanese ones. If you need to single out Japanese characters only, because you may also have to deal with Arabic, Hebrew, Russian or other scripts, you need to know the code point ranges of the Japanese characters.
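A range-based check along these lines might look like the sketch below. The ranges are an approximation chosen for illustration (CJK punctuation, hiragana, katakana, the main CJK ideograph block and the fullwidth forms), not the complete Unicode East Asian Width data, and the kanji block is shared with Chinese, so code points alone cannot tell the two languages apart. `JapaneseWidth`, `IsDoubleWidth` and `CountColumns` are hypothetical names.

```csharp
using System;

public static class JapaneseWidth
{
    // Approximate test for double-width Japanese characters.
    public static bool IsDoubleWidth(char c)
    {
        return (c >= '\u3000' && c <= '\u303F')   // CJK symbols and punctuation
            || (c >= '\u3040' && c <= '\u309F')   // hiragana
            || (c >= '\u30A0' && c <= '\u30FF')   // katakana
            || (c >= '\u4E00' && c <= '\u9FFF')   // CJK unified ideographs (kanji)
            || (c >= '\uFF01' && c <= '\uFF60');  // fullwidth forms
    }

    // Counts two columns for the ranges above and one for everything else,
    // so Russian, Arabic or Hebrew letters stay single width.
    public static int CountColumns(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        int columns = 0;
        foreach (char c in text)
        {
            columns += IsDoubleWidth(c) ? 2 : 1;
        }
        return columns;
    }
}
```

With this variant, half-width katakana (U+FF61 to U+FF9F) fall outside the listed ranges and count as one column each.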
Regards.
Try something like this:
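The snippet for this answer is not shown either. One technique that fits the question, offered here as an assumption rather than as the answer's actual code, is to count bytes in Shift-JIS (code page 932), where ASCII and half-width katakana take one byte and kana and kanji take two. The sketch assumes .NET Core 3.0 or later, where legacy code pages must be registered through `CodePagesEncodingProvider`; `ShiftJisWidth` is a hypothetical name.

```csharp
using System;
using System.Text;

public static class ShiftJisWidth
{
    // Hypothetical helper: uses the Shift-JIS byte count as the column count.
    public static int CountColumns(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        // Assumption: running on .NET Core / .NET 5+, where code page 932
        // is only available after registering the code pages provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding shiftJis = Encoding.GetEncoding(932);

        // Characters that code page 932 cannot represent are replaced by a
        // fallback character, so their width is not reliable.
        return shiftJis.GetByteCount(text);
    }
}
```

This only works for text that Shift-JIS can represent, so it is a poor fit if the input may contain other scripts.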