How to URL encode a string in Ruby
How do I URI::encode a string like:
\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a
to get it in a format like:
%124Vx%9A%BC%DE%F1%23Eg%89%AB%CD%EF%124Vx%9A
as per RFC 1738?
Here's what I tried:
irb(main):123:0> URI::encode "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a"
ArgumentError: invalid byte sequence in UTF-8
from /usr/local/lib/ruby/1.9.1/uri/common.rb:219:in `gsub'
from /usr/local/lib/ruby/1.9.1/uri/common.rb:219:in `escape'
from /usr/local/lib/ruby/1.9.1/uri/common.rb:505:in `escape'
from (irb):123
from /usr/local/bin/irb:12:in `<main>'
Also:
irb(main):126:0> CGI::escape "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a"
ArgumentError: invalid byte sequence in UTF-8
from /usr/local/lib/ruby/1.9.1/cgi/util.rb:7:in `gsub'
from /usr/local/lib/ruby/1.9.1/cgi/util.rb:7:in `escape'
from (irb):126
from /usr/local/bin/irb:12:in `<main>'
I looked all over the internet and haven't found a way to do this, although I am almost certain that I did it the other day without any trouble.
Nowadays, you should use ERB::Util.url_encode or CGI.escape. The primary difference between them is their handling of spaces: CGI.escape follows the CGI/HTML forms spec and gives you an application/x-www-form-urlencoded string, which requires spaces to be escaped to +, whereas ERB::Util.url_encode follows RFC 3986, which requires them to be encoded as %20.
See "What's the difference between URI.escape and CGI.escape?" for more discussion.
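To make the difference in space handling concrete, here is a minimal sketch using only the standard library. The sample text is an illustrative assumption, not something from the original answer.

```ruby
require 'erb'
require 'cgi'

# Illustrative input (an assumption): contains spaces and a reserved character.
text = "Hello world & more"

# Form-style encoding: spaces become "+".
form_encoded = CGI.escape(text)

# RFC 3986-style encoding: spaces become "%20".
uri_encoded = ERB::Util.url_encode(text)

puts form_encoded  # Hello+world+%26+more
puts uri_encoded   # Hello%20world%20%26%20more
```

Which one is appropriate depends on where the value ends up: form bodies and query strings commonly accept +, while path segments generally need %20.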
Taken from @J-Rou's comment
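A minimal sketch for the byte string in the question, assuming the fix is to tag the string as binary (ASCII-8BIT) so that escaping works on raw bytes rather than on UTF-8 characters. String#b requires Ruby 2.0 or later; on Ruby 1.9, force_encoding('ASCII-8BIT') should play the same role.

```ruby
require 'cgi'

# The byte string from the question. It is not valid UTF-8.
raw = "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a"

# String#b returns a copy tagged as ASCII-8BIT (binary).
binary = raw.b

# Each byte outside the unreserved set is emitted as %XX.
escaped = CGI.escape(binary)

puts escaped  # %124Vx%9A%BC%DE%F1%23Eg%89%AB%CD%EF%124Vx%9A
```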
I was originally trying to escape special characters in a file name only, not in the path, from a full URL string. ERB::Util.url_encode didn't work for my use:
Based on two answers in "https://stackoverflow.com/questions/34274838/why-is-uri-escape-marked-as-obsolete-and-where-is-this-regexpunsafe-constant", it looks like URI::RFC2396_Parser#escape is better than using URI::Escape#escape. However, they both behave the same for me:
UPDATE: I think that as of Ruby 3.0, URI.escape no longer works. I have not yet found a replacement other than URI::Parser.new.escape.
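As a rough sketch of the file-name-only case described in this answer: applying ERB::Util.url_encode to a whole URL also escapes the delimiters, so one option is to encode only the last path segment. The helper name escape_file_name and the example URLs are hypothetical, and the sketch assumes the URL has no query string or fragment after the file name.

```ruby
require 'erb'

# Encoding the whole URL escapes ":" and "/" too, which is rarely wanted.
whole = ERB::Util.url_encode("http://example.com/a b")
puts whole  # http%3A%2F%2Fexample.com%2Fa%20b

# Hypothetical helper: percent-encode only the final path segment.
# Assumes everything after the last "/" is the file name.
def escape_file_name(url)
  head, separator, file_name = url.rpartition('/')
  head + separator + ERB::Util.url_encode(file_name)
end

puts escape_file_name("http://example.com/docs/my file #1.txt")
# http://example.com/docs/my%20file%20%231.txt
```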
You can use the Addressable::URI gem for that:
It uses a more modern format than CGI.escape; for example, it properly encodes a space as %20 and not as a + sign. You can read more in "The application/x-www-form-urlencoded type" on Wikipedia.
Code:
Result:
I created a gem to make URI encoding cleaner to use in your code. It takes care of binary encoding for you.
Run gem install uri-handler, then use:
It adds the URI conversion functionality to the String class. You can also pass it an argument with the optional encoding string you would like to use. By default, it falls back to the 'binary' encoding if straight UTF-8 encoding fails.
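The fallback behaviour described above can be approximated with the standard library alone. The sketch below is not the uri-handler API; escape_with_binary_fallback is a hypothetical helper that only illustrates the idea of treating a string as raw bytes when it is not valid in its declared encoding.

```ruby
require 'cgi'

# Hypothetical helper (not part of uri-handler): escape the string as is
# when its encoding is valid, otherwise escape a binary-tagged copy.
def escape_with_binary_fallback(str)
  source = str.valid_encoding? ? str : str.b
  CGI.escape(source)
end

puts escape_with_binary_fallback("a b")       # a+b
puts escape_with_binary_fallback("\x9a\xbc")  # %9A%BC
```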
If you want to "encode" a full URL without having to think about manually splitting it into its different parts, I found the following worked in the same way that I used to use URI.encode:
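One possible sketch of whole-URL escaping, assuming URI::RFC2396_Parser (mentioned earlier in this thread) is an acceptable stand-in for the removed URI.encode. The example URL is an illustrative assumption.

```ruby
require 'uri'

# Illustrative URL (an assumption): spaces in both the path and the query.
url = "https://example.com/my docs/file name.pdf?q=ruby url"

# Escapes unsafe characters such as spaces, but leaves delimiters
# like ":", "/", "?" and "=" untouched.
escaped = URI::RFC2396_Parser.new.escape(url)

puts escaped
# https://example.com/my%20docs/file%20name.pdf?q=ruby%20url

# The escaped string can then be parsed as a URI.
puts URI.parse(escaped).host  # example.com
```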