File encoding with Ruby

Posted on 2024-09-02 23:39:02

I'm having some problems with file encodings.

I receive a URL-encoded string such as "sometext%C3%B3+more+%26+andmore", unescape it, process the data, and save it with Windows-1252 encoding.

The conversions are these:

irb(main) >> value
=> "sometext%C3%B3+more+%26+andmore"
irb(main) >> CGI::unescape(value)
=> "sometext\303\263 more & andmore"
irb(main) >> #Some code and saved into a file using open(filename, "w:WINDOWS-1252")
irb(main) >> # result in the file:
=> sometextĂ³ more & andmore

The result should be sometextó more & andmore
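For reference, the \303\263 in the irb output above are octal escapes for the two bytes 0xC3 0xB3, which is how UTF-8 encodes ó; a Windows-1252 file needs the single byte 0xF3 instead. A minimal sketch to inspect this, assuming Ruby 1.9 or later (the helper name unescaped_hex_bytes is made up for illustration):

```ruby
require 'cgi'

# Hypothetical helper: unescape a URL-encoded string and list its bytes as hex.
def unescaped_hex_bytes(url_encoded)
  CGI.unescape(url_encoded).bytes.map { |b| format('%02X', b) }
end

p unescaped_hex_bytes("%C3%B3")            # prints ["C3", "B3"]

# Octal and hex escapes describe the same two bytes.
p "\303\263".bytes == "\xC3\xB3".bytes     # prints true

# The same character in Windows-1252 is a single byte.
p "\u00F3".encode('Windows-1252').bytes    # prints [243], i.e. 0xF3
```

So the unescaped string still has to be transcoded from UTF-8 before its bytes are valid Windows-1252.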

Comments (1)

赠意 2024-09-09 23:39:02

Encoding support was added in Ruby 1.9, so the following code is for Ruby 1.9.1:

require 'cgi'
#=> true
s = "sometext%C3%B3+more+%26+andmore"
#=> "sometext%C3%B3+more+%26+andmore"
t = CGI::unescape s
#=> "sometext\xC3\xB3 more & andmore"
t.force_encoding 'utf-8' # telling Ruby that the string is UTF-8 encoded
#=> "sometextó more & andmore"
t.encode! 'windows-1252' # changing encoding to windows-1252
#=> "sometext? more & andmore"
# the ? stands for the single byte \xF3 (ó in windows-1252), which a UTF-8 terminal cannot display
# here you do whatever you want to do with windows-1252 encoded string
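As a rough end-to-end sketch of the steps above (assuming a Ruby version with encoding support; the helper name save_as_windows_1252 and the temporary file are made up for illustration): once the string is tagged as UTF-8, opening the file with "w:Windows-1252" lets Ruby transcode on write, so the explicit encode! call becomes optional.

```ruby
require 'cgi'
require 'tmpdir'

# Hypothetical helper: unescape a URL-encoded string and save it as Windows-1252.
def save_as_windows_1252(path, url_encoded)
  # Tag the unescaped bytes as UTF-8 so Ruby knows what it is converting from.
  text = CGI.unescape(url_encoded).force_encoding('UTF-8')
  # The external encoding in the mode string makes Ruby transcode on write.
  File.open(path, 'w:Windows-1252') { |file| file.write(text) }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'out.txt')
  save_as_windows_1252(path, "sometext%C3%B3+more+%26+andmore")

  # The file holds the single byte 0xF3 for ó, not the UTF-8 pair C3 B3.
  p File.binread(path).bytes.include?(0xF3)            # prints true

  # Read it back, converting from Windows-1252 to UTF-8 for display.
  puts File.read(path, encoding: 'Windows-1252:UTF-8')
end
```

If the string contains characters that Windows-1252 cannot represent, the write raises Encoding::UndefinedConversionError, so real code would need to decide how to handle those.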

Here you have a lot of information on Ruby and encodings.

P.S. Ruby 1.8.7 doesn't have built-in support for string encodings, so you have to use a separate library for the conversion, for example iconv:

require 'iconv'
#=> true
require 'cgi'
#=> true
s = "sometext%C3%B3+more+%26+andmore"
#=> "sometext%C3%B3+more+%26+andmore"
t = CGI::unescape s
#=> "sometext\303\263 more & andmore"
Iconv.conv 'windows-1252', 'utf-8', t
#=> "sometext\363 more & andmore"
# \363 is ó in windows-1252 encoding
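Note that iconv was removed from the standard library in Ruby 2.0, so the snippet above only runs on older versions. On newer Rubies the two-argument form of String#encode does the same job. A small sketch (the wrapper name iconv_style_conv is made up here, it just mirrors the argument order of Iconv.conv):

```ruby
# Hypothetical wrapper with the same argument order as Iconv.conv(to, from, str).
def iconv_style_conv(to, from, str)
  # String#encode(dst, src) treats str as src-encoded, whatever its current tag.
  str.encode(to, from)
end

# Raw, untagged bytes, similar to what Ruby 1.8 would hand you after unescaping.
raw = "sometext\xC3\xB3 more & andmore".b

converted = iconv_style_conv('Windows-1252', 'UTF-8', raw)
p converted.bytes.include?(0xF3)   # prints true
puts converted.encoding.name       # prints Windows-1252
```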