Ruby: Checking East Asian Width (Unicode)
Using Ruby, I have to output strings in a columnar format to the terminal. Something like this:
| row 1 | a string here | etc
| row 2 | another string | etc
I can do this fine with Latin UTF-8 characters using String#ljust and %s.
But a problem arises when the characters are Korean, Chinese, etc. The columns simply won't align when there are rows of English interspersed with rows containing Korean, etc.
How can I get column alignment here? Is there a way to output Asian characters in the equivalent of a fixed-width font? How about for documents that are meant to be displayed and edited in Vim?
Comments (3)
Your problem happens with CJK (Chinese/Japanese/Korean) full-width and wide characters (also scroll down for diagrams); those characters occupy two fixed-width cells. String#ljust and friends don't take this into account.
There is unicodedata.east_asian_width in Python, which would allow you to write your own width-aware ljust, but it doesn't seem to exist in Ruby. The best I've been able to find is this blog post: http://d.hatena.ne.jp/hush_puppy/20090227/1235740342 (machine translation). If you look at the output at the bottom of the original, it seems to do what you want, so maybe you can reuse some of the Ruby code.
Or if you're only printing full-width characters (i.e. you're not mixing half-width and full-width), you can be lazy and just use full-width forms of everything, including the spacing and the box drawing. Here are a couple of characters you can copy and paste:
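For the width-aware ljust idea mentioned earlier in this answer, here is a minimal sketch in plain Ruby. It is not taken from the linked blog post. The names WIDE_RANGES, display_width and ljust_display are made up for illustration, and the range list is a rough approximation of the wide and full-width blocks, not the complete Unicode East Asian Width table. It also ignores combining marks and ambiguous-width characters.

```ruby
# Approximate list of code point ranges that terminals usually draw two
# cells wide. ASSUMPTION: this is a hand-picked subset, not the full
# East Asian Width data from unicode.org.
WIDE_RANGES = [
  0x1100..0x115F, # Hangul Jamo (leading consonants)
  0x3000..0x303E, # CJK symbols and punctuation, incl. ideographic space
  0x3041..0x30FF, # Hiragana and Katakana
  0x3400..0x4DBF, # CJK Unified Ideographs Extension A
  0x4E00..0x9FFF, # CJK Unified Ideographs
  0xAC00..0xD7A3, # Hangul syllables
  0xF900..0xFAFF, # CJK Compatibility Ideographs
  0xFF01..0xFF60, # Fullwidth ASCII variants
  0xFFE0..0xFFE6  # Fullwidth currency and other signs
].freeze

# Number of terminal cells the string is assumed to occupy:
# 2 for characters in WIDE_RANGES, 1 for everything else.
def display_width(str)
  str.each_char.sum do |ch|
    cp = ch.ord
    WIDE_RANGES.any? { |range| range.cover?(cp) } ? 2 : 1
  end
end

# Like String#ljust, but pads according to display width instead of
# character count. Strings that are already wide enough are returned as is.
def ljust_display(str, width, pad = " ")
  missing = width - display_width(str)
  missing > 0 ? str + pad * missing : str
end

rows = [
  ["row 1", "a string here"],
  ["row 2", "한국어 문자열"],
  ["row 3", "中文字符串"]
]

rows.each do |label, text|
  puts "| #{ljust_display(label, 6)} | #{ljust_display(text, 16)} | etc"
end
```

Whether the columns actually line up still depends on the terminal and font drawing wide characters at exactly two cells.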
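Late to the party, but hopefully still helpful: in Ruby, you can use the unicode-display_width gem to check a string's East Asian width; its Unicode::DisplayWidth.of(string) method returns the number of terminal columns the string occupies.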
Late to the party, but you can try east_asian_width_simple.
It aims to be fast and flexible.
Fast
east_asian_width_simple is faster than other pure Ruby implementations. Below is the comparison table of time cost:
Flexible
east_asian_width_simple is flexible in that it decouples the East Asian Width property data file from the gem itself.
Unlike with other gems, you update the data by downloading the latest property file from unicode.org instead of upgrading the gem.
For example, at the time of writing the latest draft of the data file is v15.0.0d5, and no other gem can apply it without releasing a new gem version.
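To illustrate what decoupling the data file can look like, here is a small sketch that builds a width lookup from text in the style of unicode.org's EastAsianWidth.txt. This is not the gem's actual implementation or API. The names SAMPLE_DATA, parse_east_asian_width, width_property and width_from_table are hypothetical, and the sample is a short illustrative excerpt; a real program would read the downloaded file, for example with File.read.

```ruby
# A few lines in the style of EastAsianWidth.txt (illustrative excerpt only).
SAMPLE_DATA = <<~DATA
  # EastAsianWidth sample
  0020;Na          # Zs         SPACE
  0041..005A;Na    # Lu    [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
  3000;F           # Zs         IDEOGRAPHIC SPACE
  4E00..9FFF;W     # Lo [20992] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FFF
  AC00..D7A3;W     # Lo [11172] HANGUL SYLLABLE GA..HANGUL SYLLABLE HIH
DATA

# Turn the property text into a list of [code point range, property] pairs.
def parse_east_asian_width(text)
  table = []
  text.each_line do |line|
    data = line.sub(/#.*/, "").strip # drop comments and blank lines
    next if data.empty?

    range, prop = data.split(";").map(&:strip)
    first, last = range.split("..")
    table << [(first.hex..(last || first).hex), prop]
  end
  table
end

# Look up the property of a single character. ASSUMPTION: code points
# missing from the table are treated as "N" (neutral) in this sketch.
def width_property(table, char)
  cp = char.ord
  entry = table.find { |range, _prop| range.cover?(cp) }
  entry ? entry[1] : "N"
end

# Count F (fullwidth) and W (wide) characters as two cells, others as one.
def width_from_table(table, str)
  str.each_char.sum { |ch| %w[F W].include?(width_property(table, ch)) ? 2 : 1 }
end

table = parse_east_asian_width(SAMPLE_DATA)
puts width_property(table, "中")
puts width_from_table(table, "A中")
```

With this layout, picking up a newer Unicode version means swapping the data file while the lookup code stays the same. A linear scan is enough for a sketch; a sorted table with binary search would be the obvious next step for the full file.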