Make Ruby 1.9 treat all source files as UTF-8 (even if that means recompiling the interpreter)
I want to port a Rails app from Ruby 1.8.7 to 1.9.2. Some of the files contain umlauts such as ä/ö/ü, both within strings and in comments.
The files were saved as UTF-8 but without a BOM (byte order mark) at the beginning.
As you might know, Ruby 1.9 refuses to parse these files and fails with: invalid multibyte char (US-ASCII)
I have searched and read a lot, but the only solutions seem to be to
- insert a BOM or
- insert
# coding: utf-8
at the beginning of each file.
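For anyone who does go the magic-comment route, adding it by hand is not the only option. Below is a minimal sketch of a helper that does it in bulk. The names add_magic_comment and add_magic_comment_to_tree are hypothetical (not part of Ruby or Rails), and it assumes every .rb file under the given directory really is UTF-8. A shebang line, if present, is kept on line 1 with the comment placed on line 2.

```ruby
# Hypothetical helpers, not part of Ruby or Rails.
MAGIC_COMMENT  = "# coding: utf-8\n"
CODING_PATTERN = /\A#.*coding[:=]\s*[\w.-]+/

# Returns the source text with a magic comment added, unless the line where
# Ruby looks for one (line 1, or line 2 after a shebang) already has it.
def add_magic_comment(source)
  lines = source.lines.to_a
  has_shebang = !lines.empty? && lines[0].start_with?("#!")
  candidate = has_shebang ? lines[1] : lines[0]
  return source if candidate && candidate =~ CODING_PATTERN

  if has_shebang
    # Keep the shebang first so the script stays executable.
    lines[0] += "\n" unless lines[0].end_with?("\n")
    lines.insert(1, MAGIC_COMMENT)
  else
    lines.unshift(MAGIC_COMMENT)
  end
  lines.join
end

# Rewrites every .rb file below root in place. Assumes the files are UTF-8;
# they are read and written as raw bytes so no transcoding happens.
def add_magic_comment_to_tree(root)
  Dir.glob(File.join(root, "**", "*.rb")).each do |path|
    original = File.binread(path)
    updated  = add_magic_comment(original)
    File.open(path, "wb") { |f| f.write(updated) } unless updated == original
  end
end
```

Calling add_magic_comment_to_tree("app") from the project root would touch only files that lack the comment, so it should be safe to run more than once. Try it on a clean working copy first so the changes are easy to review or revert.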
My editor of choice (gEdit) doesn't seem to insert a BOM. I also read that having a BOM is bad practice because it may break some editors, and it breaks shell scripts that rely on the shebang notation.
EDIT: The BOM also breaks the Ruby 1.8.7 parser, which fails on the file with: syntax error, unexpected kEND, expecting $end (SyntaxError)
I tried forcing the external encoding with ruby -Eutf-8:utf-8 but this seems to be ignored when calling rake (I tried: /home/malte/.rvm/gems/ruby-1.9.2-p180/bin/rake test).
So my question is:
As RVM is building Ruby 1.9 from source anyway, is there a build option or a patch to change the default encoding from US-ASCII to UTF-8?
I took a quick look at the source code but couldn't find the line where the default is set (I'm no C expert, though).
I found a workaround:
Set the RUBYOPT environment variable, for example by executing
export RUBYOPT=-Ku
in your shell.
This sets -Ku as a default option whenever ruby is invoked, so you can call any other tool that runs ruby without worrying about parameters.
rails server
or
rake
then works and treats all files as UTF-8. No BOM or magic comments necessary!
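To check whether the option is actually being picked up, a small script like the following sketch can help. The method name encoding_report is made up for illustration. Save it without a magic comment and run it with and without RUBYOPT set. On 1.9, I would expect the source encoding to show US-ASCII without the variable and UTF-8 with it, but treat that as an assumption to verify on your own setup.

```ruby
# Hypothetical helper: collects the encodings Ruby is using right now.
def encoding_report
  {
    :source   => __ENCODING__.name,               # encoding of this source file
    :external => Encoding.default_external.name,  # default for data read via IO
    :rubyopt  => ENV["RUBYOPT"]                   # nil if the variable is unset
  }
end

encoding_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

Running the same script through rake or the Rails console (for example by requiring it) shows whether tools that spawn ruby inherit the setting as well.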