Handling string encoding with the same code in Ruby 1.8 and 1.9
I've got a gem that's used by a bunch of people running a bunch of different Ruby interpreters, and it includes what boils down to this code:
res = RestClient.post(...)
doc = REXML::Document.new(res).root
The content of res is always UTF-8, and this works fine in Ruby 1.8, but it blows up under Ruby 1.9 if the response is not pure ASCII and the user's default encoding is not UTF-8.
Now, if I wanted to make this work on Ruby 1.9 alone, I'd just stick res.force_encoding('utf-8') in there and be done with it, but that method is 1.9-only, so the same line then breaks under Ruby 1.8:
NoMethodError: undefined method `force_encoding' for #<String:0x101318178>
The best solution I can come up with is this, which forces the systemwide default encoding to UTF-8:
Encoding.default_external = 'UTF-8' if defined? Encoding
Better ideas, or is this as good as it gets? Will there be any negative impact on library users who are trying to use different encodings?
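On the question of side effects: Encoding.default_external is a process-wide setting, so a gem that assigns it also changes how the host application's own reads are tagged. A minimal sketch of that effect, assuming Ruby 1.9 or later (the method name default_external_effect and the ISO-8859-1 starting value are made up for illustration):

```ruby
require 'tempfile'

# Hypothetical demonstration, Ruby 1.9+ only.
# Returns the encodings File.read reports before and after the default
# external encoding is switched to UTF-8, then restores the original value.
def default_external_effect
  original = Encoding.default_external
  before = after = nil

  Tempfile.create('encoding-demo') do |file|
    file.write('plain ascii')
    file.flush

    # Assumption: the application had picked a non-UTF-8 default.
    Encoding.default_external = 'ISO-8859-1'
    before = File.read(file.path).encoding

    # What the one-liner in the question would do from inside the gem.
    Encoding.default_external = 'UTF-8'
    after = File.read(file.path).encoding
  end

  [before, after]
ensure
  Encoding.default_external = original
end
```

Calling default_external_effect returns the pair of encodings. The second read comes back tagged as UTF-8 even though the application had asked for something else, which is the kind of surprise a library user with a different encoding could run into.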
Comments (4)
I'd do something like that for backwards compatibility.
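The code that went with this answer is not shown here. Going by the later replies, which describe it as a respond_to? check around force_encoding, a minimal sketch might look like the following. The helper name force_utf8 is hypothetical, not something from the gem:

```ruby
# Hypothetical helper: tags a string as UTF-8 on Ruby 1.9+ and returns it
# unchanged on Ruby 1.8, where String#force_encoding does not exist.
# force_encoding only relabels the bytes; it does not transcode them, which
# fits the case in the question where the content is already UTF-8.
def force_utf8(str)
  str.force_encoding('UTF-8') if str.respond_to?(:force_encoding)
  str
end
```

With a helper like this, the call in the question could become res = force_utf8(RestClient.post(...)), and the same line would run on both interpreters.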
I'm with Mike Lewis in using respond_to?, but don't do it on the variable res everywhere throughout your code. I took a look at your code in gateway.rb, and it looks like everywhere you use res, it gets set by a call to make_api_request, so you could add the check just before the return statement in that method.
Even if res is set in other places too, it's not literally every string you encounter, so I'm sure you can find a way to refactor the code that makes sense and solves the problem in one place instead of everywhere you run into it.
Are you having a problem in other places?
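To illustrate fixing the encoding in one place, here is a rough sketch. Only the method name make_api_request comes from the discussion; the Gateway class body and the post_to_api placeholder are assumptions standing in for the gem's real code:

```ruby
# Hypothetical stand-in for the class in gateway.rb.
class Gateway
  def make_api_request(params)
    res = post_to_api(params)
    # Single place where the response is tagged as UTF-8 (Ruby 1.9+ only).
    res.force_encoding('UTF-8') if res.respond_to?(:force_encoding)
    res
  end

  private

  # Placeholder for the real RestClient.post call. It returns the bytes of
  # "<r>café</r>"; Array#pack('C*') yields a binary-tagged string on 1.9+.
  def post_to_api(_params)
    [60, 114, 62, 99, 97, 102, 195, 169, 60, 47, 114, 62].pack('C*')
  end
end
```

Every caller of make_api_request then receives a string already tagged as UTF-8, so no other call site needs to know about the 1.8/1.9 difference.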
As far as I can see from the snippet, the cause of the problem is RestClient, which doesn't return the string in the proper encoding (the one specified in the HTTP response), so I'd first try to get that problem fixed. If that can't be done, then you could wrap the RestClient calls with your own code that forces the encoding (the way Mike Lewis suggested). Or are you experiencing the problem in places other than RestClient calls as well?
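For the idea of tagging the body with the encoding named in the HTTP response, a wrapper could look roughly like this. Both helper names are hypothetical, the UTF-8 fallback is an assumption, and content_type is whatever Content-Type header value the HTTP client exposes:

```ruby
# Hypothetical helper: extracts the charset label from a Content-Type value
# such as "text/xml; charset=ISO-8859-1". Falls back to UTF-8 (an assumption)
# when no charset parameter is present.
def charset_from(content_type)
  match = /charset=\s*"?([^\s;"]+)/i.match(content_type.to_s)
  match ? match[1] : 'UTF-8'
end

# Hypothetical helper: tags the body with the declared charset on Ruby 1.9+
# and leaves it untouched on Ruby 1.8.
def tag_with_response_charset(body, content_type)
  return body unless body.respond_to?(:force_encoding)

  begin
    body.force_encoding(charset_from(content_type))
  rescue ArgumentError
    # Raised for charset labels Ruby does not recognize.
    body.force_encoding('UTF-8')
  end
end
```

This keeps the decision tied to what the server declared instead of hard-coding UTF-8, while still degrading to the hard-coded value when the header is missing or unrecognized.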
Does it work if you include an #encoding: utf-8 header in the particular file that uses this method? Ruby 1.9 supports different encodings throughout the application and should work fine if this content is UTF-8 encoded.
Ruby 1.8 would simply ignore the #encoding header and keep on working nicely. It's a very simple approach, but I believe it deserves a try!
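One caveat worth checking, as far as I understand Ruby 1.9's encoding model: the magic comment sets the encoding of string literals in that source file, and does not by itself retag strings that arrive at runtime, such as an HTTP response body. A small sketch of the difference (the variable names are illustrative, and Array#pack stands in for bytes read from the network):

```ruby
# encoding: utf-8

literal  = "café"                               # tagged with the source encoding
received = [99, 97, 102, 195, 169].pack('C*')   # same bytes, built at runtime

same_before = same_after = nil
if literal.respond_to?(:encoding)               # Ruby 1.9+
  # Same bytes but different encoding tags, so the strings compare as unequal.
  same_before = (literal == received)

  # Retagging the runtime string makes the two strings equal.
  received.force_encoding('UTF-8')
  same_after = (literal == received)
end
```

If the response body behaves like received here, the header alone would not be enough and the string would still need force_encoding somewhere.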