Ruby: Checking East Asian Width (Unicode)

Posted on 2024-10-11 13:18:08

Using Ruby, I have to output strings in a columnar format to the terminal. Something like this:

| row 1     | a string here     | etc
| row 2     | another string    | etc

I can do this fine with Latin UTF-8 characters using String#ljust and %s.

But a problem arises when the characters are Korean, Chinese, etc. The columns simply won't align when there are rows of English interspersed with rows containing Korean, etc.

How can I get column alignment here? Is there a way to output Asian characters in the equivalent of a fixed-width font? How about for documents that are meant to be displayed and edited in Vim?


Comments (3)

终陌 2024-10-18 13:18:08

Your problem happens with CJK (Chinese/Japanese/Korean) full-width and wide characters; each of those characters occupies two fixed-width cells. String#ljust and friends don't take this into account.

There is unicodedata.east_asian_width in Python, which would allow you to write your own width-aware ljust, but it doesn't seem to exist in Ruby. The best I've been able to find is this blog post: http://d.hatena.ne.jp/hush_puppy/20090227/1235740342 (machine translation). If you look at the output at the bottom of the original, it seems to do what you want, so maybe you can reuse some of the Ruby code.

Or, if you're only printing full-width characters (i.e. you're not mixing half-width and full-width), you can be lazy and just use full-width forms of everything, including the spacing and the box drawing. Here are a few characters you can copy and paste:

  • | (full-width vertical bar, U+FF5C)
  • 　 (full-width space, U+3000)
  • - (full-width dash, U+FF0D; does not get rendered nicely in my terminal font)
  • ー (another full-width dash, U+30FC)
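
A width-aware ljust of the kind this answer describes could be sketched in plain Ruby roughly as follows. This is only an illustration: the WIDE ranges are a hand-picked approximation of the wide and full-width blocks, not the full East Asian Width table, and approx_display_width and ljust_cells are hypothetical names that do not come from any library.

```ruby
# Assumption: these ranges roughly cover Hangul, kana, CJK ideographs and
# full-width forms. They are NOT the complete East Asian Width table.
WIDE = /[\u1100-\u115F\u2E80-\u303E\u3041-\u33FF\u3400-\u4DBF\u4E00-\u9FFF\uAC00-\uD7A3\uF900-\uFAFF\uFF01-\uFF60\uFFE0-\uFFE6]/

# Hypothetical helper: approximate number of terminal cells a string occupies.
def approx_display_width(str)
  str.each_char.sum { |ch| WIDE.match?(ch) ? 2 : 1 }
end

# Hypothetical counterpart to String#ljust that pads by cells, not characters.
def ljust_cells(str, cells, pad = ' ')
  missing = cells - approx_display_width(str)
  missing.positive? ? str + pad * missing : str
end

# Print a small table mixing English and Korean rows.
rows = [['row 1', 'a string here'], ['row 2', '한국어 문자열']]
rows.each do |label, text|
  puts "| #{ljust_cells(label, 9)} | #{ljust_cells(text, 17)} | etc"
end
```

Whether the columns actually line up still depends on the terminal and font rendering wide characters as exactly two cells.
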
秋叶绚丽 2024-10-18 13:18:08

Late to the party, but hopefully still helpful: In Ruby, you can use the unicode-display_width gem to check a string's East Asian width:

require 'unicode/display_width'
"⚀".display_width #=> 1
'一'.display_width #=> 2
暮光沉寂 2024-10-18 13:18:08

Late to the party, but you can try east_asian_width_simple.

require 'east_asian_width_simple'
eaw = EastAsianWidthSimple.new(File.open('EastAsianWidth.txt'))
eaw.string_width('台灣 No.1') # => 9
eaw.string_width('No code, no ????') # => 14

It aims to be fast and flexible.

Fast

east_asian_width_simple is faster than other pure Ruby implementations. The table below compares relative time cost, with east_asian_width_simple as the 1x baseline:

Gem                          | Width Calculation | Property Lookup
east_asian_width_simple      | 1x                | 1x
east_asian_width v0.0.2      | 8.78x             | 4.57x
reline v0.3.1                | 10.25x            | -
unicode-display_width v2.1.0 | 4.45x             | -
unicode-eaw v2.2.0           | -                 | 10.60x
visual_width v0.0.6          | 2.03x             | -

Flexible

east_asian_width_simple is flexible in that it is decoupled from the East Asian Width property data file.

Unlike with other gems, you update by downloading the latest property file from unicode.org instead of upgrading the gem.

For example, the latest draft version of the data file is v15.0.0d5, but no other gem can apply it without releasing a new gem version.
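
To illustrate what keeping the data file separate from the code can look like, here is a minimal sketch that builds a lookup table from lines in the EastAsianWidth.txt layout and uses it to measure a string. It is not how east_asian_width_simple is implemented: parse_eaw and table_width are hypothetical helpers, the SAMPLE lines are a tiny illustrative excerpt rather than the real file, and treating every property other than W and F as one cell is an assumption.

```ruby
require 'stringio'

# Hypothetical helper: parse "XXXX;P" / "XXXX..YYYY;P" lines (with "#"
# comments) into [codepoint_range, property] pairs.
def parse_eaw(io)
  io.each_line.map do |line|
    data = line.sub(/#.*/, '').strip
    next if data.empty?
    codes, prop = data.split(';').map(&:strip)
    first, last = codes.split('..').map { |hex| hex.to_i(16) }
    [(first..(last || first)), prop]
  end.compact
end

# Hypothetical helper: W (wide) and F (full-width) count as two cells;
# everything else, including code points missing from the table, as one.
def table_width(table, str)
  str.each_codepoint.sum do |cp|
    _, prop = table.find { |range, _| range.cover?(cp) }
    %w[W F].include?(prop) ? 2 : 1
  end
end

# Illustrative sample in the data file's layout (not the full file).
SAMPLE = StringIO.new(<<~DATA)
  # code point(s);property
  0020..007E;Na
  3000;F
  4E00..9FFF;W
  AC00..D7A3;W
DATA

TABLE = parse_eaw(SAMPLE)
puts table_width(TABLE, '台灣 No.1') # => 9
```

Swapping SAMPLE for File.open('EastAsianWidth.txt') would load a downloaded copy of the real data file; the linear table.find lookup is kept for brevity and would be slow on the full table.
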
