Ruby: unescaping a Unicode string
Unicode string:
string = "CEO Frye \u2013 response to Capitalism discussion in Davos: Vote aggressively with your wallet against firms without social conscience."
I tried the following (from "Is this the best way to unescape unicode escape sequences in Ruby?"):
def unescape_unicode(s)
  s.gsub(/\\u([\da-fA-F]{4})/) {|m| [$1].pack("H*").unpack("n*").pack("U*")}
end
unescape_unicode(string) #=> CEO Frye \u2013 response to Capitalism discussion in Davos: Vote aggressively with your wallet against firms without social conscience.
But the output (written to a file) is still identical to the input! Any help would be appreciated.
Edit:
I am not using IRB; I am using RubyMine, and the input is parsed from Twitter, hence the single "\u" rather than "\\u".
Edit 2:
Comments (1)
Are you trying it from irb, or outputting the string with p? String#inspect (called from irb and p str) can transform Unicode characters into \uxxxx format so that the string can be printed anywhere. Also, when you type "CEO Frye \u2013 response to...", the \u2013 is an escape sequence resolved by the Ruby parser. It is already a Unicode character in the final string, so there is no literal backslash left for the regex to match.
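To make the parser point concrete, here is a minimal sketch. It assumes a UTF-8 source file; the shortened sample strings and the simplified helper body are illustrative and are not the code from the question. It only handles four-digit \uXXXX sequences and does not combine surrogate pairs.

```ruby
# Double-quoted literal: the Ruby parser resolves \u2013 into one en dash character.
parsed = "Frye \u2013 response"

# Single-quoted literal: the backslash, the "u" and the four hex digits
# stay in the string as six separate characters.
literal = 'Frye \u2013 response'

# Illustrative helper (an assumption, not the original post's version):
# replaces each literal \uXXXX sequence with the code point it names.
def unescape_unicode(s)
  s.gsub(/\\u([\da-fA-F]{4})/) { [$1.hex].pack("U") }
end

puts parsed.length                        # 15
puts literal.length                       # 20
puts unescape_unicode(literal) == parsed  # true: the literal sequence was converted
puts unescape_unicode(parsed) == parsed   # true: nothing to unescape, string unchanged
```

If the string read from Twitter really contains a backslash followed by "u", string.include?("\\u") returns true and a helper like this will change it. If it returns false, the character is already an en dash and the \u2013 you see comes from how the string is being displayed.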