JSON encoding wrongly escaped (Rails 3, Ruby 1.9.2)

Posted on 2024-10-19 19:50:40

In my controller, the following works (prints "oké")

puts obj.inspect

But this doesn't (renders "ok\u00e9")

render :json => obj

Apparently the to_json method escapes unicode characters. Is there an option to prevent this?

青春有你 2024-10-26 19:50:41

To set the \uXXXX codes back to utf-8:

json_string.gsub!(/\\u([0-9a-fA-F]{4})/) {|s| [$1.to_i(16)].pack("U")}
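
The one-liner above does not cover characters outside the Basic Multilingual Plane, which JSON writes as a pair of \uXXXX surrogate escapes, and it would also rewrite a literal backslash that happens to be followed by "u". Below is a minimal sketch that tries to cover those cases, assuming the input is valid JSON text. The method name unescape_unicode and the constant UNICODE_ESCAPE are hypothetical and not part of Rails or any library.

```ruby
# Hypothetical helper, not part of Rails or the json library.
# Matches, in order: an escaped backslash, a surrogate pair, a single \uXXXX.
UNICODE_ESCAPE = /\\\\|\\u([dD][89abAB]\h\h)\\u([dD][c-fC-F]\h\h)|\\u(\h{4})/

def unescape_unicode(json_string)
  json_string.gsub(UNICODE_ESCAPE) do |match|
    if $3
      code = $3.hex
      # Keep ASCII escapes (e.g. \u0022, the double quote) and lone
      # surrogates as they are, so the result is still valid JSON.
      if code < 0x80 || (0xD800..0xDFFF).cover?(code)
        match
      else
        [code].pack("U")
      end
    elsif $1
      # Surrogate pair: combine the high and low halves into one code point.
      code_point = 0x10000 + (($1.hex - 0xD800) << 10) + ($2.hex - 0xDC00)
      [code_point].pack("U")
    else
      # Escaped backslash: leave it untouched.
      match
    end
  end
end

puts unescape_unicode('{"status":"ok\u00e9"}')  # prints {"status":"oké"}
```

Unlike gsub!, this version returns a new string and leaves its argument unchanged.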

一抹苦笑 2024-10-26 19:50:41

You can prevent it by monkey-patching the escape method mentioned in the answer by the user "mu is too short". Put the following into config/initializers/patches.rb (or a similar file used for patches) and restart your Rails process for the change to take effect.

module ActiveSupport::JSON::Encoding
  class << self
    def escape(string)
      if string.respond_to?(:force_encoding)
        string = string.encode(::Encoding::UTF_8, :undef => :replace).force_encoding(::Encoding::BINARY)
      end
      json = string.gsub(escape_regex) { |s| ESCAPED_CHARS[s] }
      json = %("#{json}")
      json.force_encoding(::Encoding::UTF_8) if json.respond_to?(:force_encoding)
      json
    end
  end
end

Be advised that there's no guarantee that the patch will work with future versions of ActiveSupport. The version used when writing this post is 3.1.3.

浮生未歇 2024-10-26 19:50:41

If you dig through the source you'll eventually come to ActiveSupport::JSON::Encoding and the escape method:

def escape(string)
  if string.respond_to?(:force_encoding)
    string = string.encode(::Encoding::UTF_8, :undef => :replace).force_encoding(::Encoding::BINARY)
  end
  json = string.
    gsub(escape_regex) { |s| ESCAPED_CHARS[s] }.
    gsub(/([\xC0-\xDF][\x80-\xBF]|
           [\xE0-\xEF][\x80-\xBF]{2}|
           [\xF0-\xF7][\x80-\xBF]{3})+/nx) { |s|
    s.unpack("U*").pack("n*").unpack("H*")[0].gsub(/.{4}/n, '\\\\u\&')
  }
  json = %("#{json}")
  json.force_encoding(::Encoding::UTF_8) if json.respond_to?(:force_encoding)
  json
end

The various gsub calls are forcing non-ASCII UTF-8 to the \uXXXX notation that you're seeing. The \uXXXX escapes should be acceptable to anything that processes JSON, but you could always post-process the JSON (or monkey patch in a modified JSON escaper) to convert the \uXXXX notation back to raw UTF-8 if necessary.
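
To see what the second gsub does in isolation, here is a minimal standalone sketch using only core Ruby. The method name escape_non_ascii is hypothetical, and the regex is simplified to match any run of non-ASCII characters in a UTF-8 string instead of raw byte patterns; the unpack/pack chain is the one from the source quoted above.

```ruby
# Hypothetical, simplified re-creation of the escaping step.
# This is not ActiveSupport's code, only an illustration of the idea.
def escape_non_ascii(string)
  string.gsub(/[^\x00-\x7F]+/) do |run|
    # "U*" decodes UTF-8 into code points, "n*" writes each one as a
    # 16-bit big-endian integer, "H*" renders those bytes as hex digits.
    hex = run.unpack("U*").pack("n*").unpack("H*")[0]
    # Prefix every group of four hex digits with a literal \u.
    hex.gsub(/.{4}/, '\\\\u\&')
  end
end

puts escape_non_ascii("ok\u00e9")  # prints ok\u00e9
```

Because pack("n*") keeps only 16 bits per value, this sketch does not produce correct surrogate pairs for code points above U+FFFF.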

I'd agree that forcing JSON to be 7bit-clean is a bit bogus but there you go.

Short answer: no.

帅冕 2024-10-26 19:50:41

With Rails 2.3.11 / Ruby 1.8, characters were not escaped to \uXXXX by the other methods, so I used the following:

render :json => JSON::dump(obj)

手长情犹 2024-10-26 19:50:41

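That is the correct encoding. JSON does not require Unicode characters to be escaped, but JSON libraries commonly produce output that contains only 7-bit ASCII characters, to avoid any potential encoding problems in transit.

Any JSON interpreter will be able to consume that string and reproduce the original. To see this in action, just type javascript:alert("ok\u00e9") into your browser's location bar.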
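
The same check can be done in Ruby with the standard json library. A small sketch (the variable names are only for illustration): the escaped and the unescaped JSON texts parse to the same string.

```ruby
require "json"

escaped = '["ok\u00e9"]'    # JSON text containing a literal backslash-u escape
raw     = "[\"ok\u00e9\"]"  # JSON text containing the raw character U+00E9

decoded_escaped = JSON.parse(escaped).first
decoded_raw     = JSON.parse(raw).first

puts decoded_escaped                 # prints oké
puts decoded_escaped == decoded_raw  # prints true
```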

看透却不说透 2024-10-26 19:50:41

render :json will call .to_json on the object if it's not a string. You can avoid this problem by doing:

render :json => JSON.generate(obj)

This passes a string directly to render and therefore bypasses the call to ActiveSupport's to_json.
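
As a rough illustration of the difference, here is a sketch that uses only Ruby's standard json library outside Rails. By default JSON.generate leaves non-ASCII characters as they are, and its ascii_only option switches to \uXXXX escapes. The sample hash is made up for the example, and behaviour inside a Rails process may differ depending on the ActiveSupport version.

```ruby
require "json"

obj = { "status" => "ok\u00e9" }  # hypothetical payload; the value is "oké"

raw_json   = JSON.generate(obj)
ascii_json = JSON.generate(obj, :ascii_only => true)

puts raw_json                       # prints {"status":"oké"}
puts ascii_json.ascii_only?         # prints true
puts JSON.parse(ascii_json) == obj  # prints true
```

Both outputs decode to the same data, so the choice only matters when something downstream cannot handle one of the two forms.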

Another approach would be to override to_json on the object you are serializing, so in that case, you could do something like:

class Foo < ActiveRecord::Base
  def to_json(options = {})
    JSON.generate(as_json)
  end
end

And if you use ActiveModelSerializers, you can solve this problem by overriding to_json in your serializer:

# controller
respond_with foo, :serializer => MySerializer

# serializer
attributes :bar, :baz

def to_json(options = {})
  JSON.generate(serializable_hash)
end

长不大的小祸害 2024-10-26 19:50:41

I have a rather hacky way to solve this problem. If to_json does not give you the encoding you want, you can render the string directly:

render text: tags

render json: tags and render json: tags.to_json will always convert the encoding automatically, but if you use render text: tags, the string stays as it is. I think jQuery can still recognize the data.
