Ruby 1.9.2: irb throws ArgumentError: invalid byte sequence in UTF-8 when entering German umlauts

Posted on 2024-10-17 03:01:43

I want to enter German umlauts in irb but get a weird error. I can enter any of the characters äöü without problems, but each of ÄÖÜß leads to the following error:

$ irb
ruby-1.9.2-p136 :001 > ? # here I entered Ü but it displays only ?
/Users/lorenz/.rvm/rubies/ruby-1.9.2-p136/lib/ruby/1.9.1/irb/ruby-lex.rb:728:in `block in lex_int2': invalid byte sequence in UTF-8 (ArgumentError)

I have looked at a lot of SO questions regarding Ruby, rvm, and UTF-8, but none helped. Most are tied to Rails or database configuration. I specifically checked the following:

The locale is set correctly:

$ locale
LANG="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_CTYPE="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_ALL="de_DE.UTF-8"

Terminal.app is set to Unicode (UTF-8) and Encoding.default_external is set correctly:

$ irb
ruby-1.9.2-p136 :001 > Encoding.default_external
 => #<Encoding:UTF-8>

Why is this still so difficult in Ruby?
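
One way to narrow this down is to look at the bytes involved. The sketch below is an illustration, not a diagnosis: it prints the UTF-8 bytes of the characters from the question and shows that a lone continuation byte tagged as UTF-8 produces an ArgumentError when matched against a regex. The constant and method names are made up for this example, and the idea that irb received a truncated sequence is only an assumption.

```ruby
# encoding: utf-8

# Characters from the question: the first group works in irb, the second fails.
WORKING = %w[ä ö ü].freeze
FAILING = %w[Ä Ö Ü ß].freeze

# Format the UTF-8 bytes of a string as hex, separated by spaces.
def hex_bytes(str)
  str.bytes.map { |b| format("0x%02X", b) }.join(" ")
end

# Build a string holding only the last byte of ch, tagged as UTF-8.
# Hypothetical stand-in for input that lost its lead byte on the way to irb.
def truncated(ch)
  [ch.bytes.last].pack("C").force_encoding("UTF-8")
end

(WORKING + FAILING).each { |ch| puts "#{ch} -> #{hex_bytes(ch)}" }

broken = truncated("Ü")
puts broken.valid_encoding?

begin
  broken =~ /x/ # regex matching rejects strings whose encoding is broken
rescue ArgumentError => e
  puts e.message
end
```

In this output every failing character has a second byte in the range 0x80..0x9F, while the working ones start at 0xA4. That may hint that something between the terminal and irb treats those bytes specially, but it does not prove where the problem lies.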

夏末 2024-10-24 03:01:43

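Normally you can set the encoding for a file with # coding: UTF-8.

For irb, you may need to set it explicitly up front:

irb -E UTF-8:UTF-8

This sets both the external and the internal encoding to UTF-8 in irb.

Or alternatively try

irb -U

which sets the internal encoding to UTF-8.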
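
For reference, the encodings that the -E option configures can also be inspected or set from inside a running session. This is a minimal sketch, assuming it is run in irb or placed in ~/.irbrc. It uses only Encoding.default_external and Encoding.default_internal from core Ruby, and it may well not help if the input bytes are already damaged before irb decodes them.

```ruby
# Show the defaults before changing anything.
# default_internal is nil unless it has been set explicitly.
puts "external before: #{Encoding.default_external.inspect}"
puts "internal before: #{Encoding.default_internal.inspect}"

# Set both defaults to UTF-8, similar in intent to `irb -E UTF-8:UTF-8`.
Encoding.default_external = Encoding::UTF_8
Encoding.default_internal = Encoding::UTF_8

puts "external after:  #{Encoding.default_external.inspect}"
puts "internal after:  #{Encoding.default_internal.inspect}"
```

Setting only Encoding.default_internal corresponds to what the -U switch is described as doing above.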

杯别 2024-10-24 03:01:43

I don't know how to solve the problem, but I am fairly sure it is specific to irb. I have noticed many times that irb has its own way of dealing with user input (it may even be a limitation in readline), and it only works well with some characters.

You can do a simple test to check that, create a new rb file with: