Why does ActiveRecord and/or MySQL have a problem with this character?
When I insert certain strings coming in from API calls into my db, they get cut off at certain characters. This is with ruby 1.8.7. I have everything set to utf8 app-wide and in MySQL. I typically don't have any problem entering utf8 content into the DB in other parts of the app.
It's supposed to be "El Soldado y La Muñeca". If I insert it into the db, only this makes it in: "11 El Soldado y La Mu".
>> name
=> "11 El Soldado y La Mu?eca(1).mp3"
>> name[20..20]
=> "u"
>> name[21..21]
=> "\361"
>> name[22..22]
=> "e"
- Is that a UTF-8 character?
- I know that Ruby 1.8 isn't encoding-aware, but to be honest I always forget how this should affect me. I always just set everything at all the other layers to utf8 and everything is fine. Why doesn't that work now?
Update
Correction: I was wrong, it's not coming from the API, it's coming from the file system.
the wrongly-encoded character is coming from inside the house!
New question: how can I get UTF-8 characters from File#path?
Comments (1)
You are somehow getting a Latin-1 (AKA ISO-8859-1) ñ rather than a UTF-8 ñ. In Latin-1 the ñ is 361 in octal (hence your single byte "\361"). In UTF-8 that lower case tilde-n should be \303\261 (i.e. bytes 0303 and 0261 in octal or 0xc3 and 0xb1 in hex). You might have to start playing with Iconv on the Ruby side to make sure you get everything in UTF-8.
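To make the conversion concrete, here is a minimal sketch of re-encoding such a file name. It is written for Ruby 1.9 or later, where strings carry an encoding; on Ruby 1.8.7 the Iconv call shown in the comment would play the same role. The helper name `to_utf8` is hypothetical, and the sketch assumes that any string which is not already valid UTF-8 is Latin-1, which may not hold for every file system.

```ruby
# Sketch for Ruby 1.9+. On Ruby 1.8.7 the equivalent would be roughly:
#   require 'iconv'
#   Iconv.conv('UTF-8', 'ISO-8859-1', name)

# Hypothetical helper: returns a UTF-8 copy of +raw+.
# Assumption: bytes that are not valid UTF-8 are treated as Latin-1.
def to_utf8(raw)
  as_utf8 = raw.dup.force_encoding(Encoding::UTF_8)
  return as_utf8 if as_utf8.valid_encoding?

  # Reinterpret the bytes as Latin-1, then transcode them to UTF-8.
  raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
end

# The file name from the question, with the single Latin-1 byte \361 (0xF1).
name  = "11 El Soldado y La Mu\xF1eca(1).mp3".b
fixed = to_utf8(name)

puts fixed
# Octal values of the two bytes that now encode the tilde-n.
puts fixed.bytes[21, 2].map { |b| "%o" % b }.join(" ")  # => 303 261
```

Strings that are already valid UTF-8 pass through unchanged, so the helper can be applied to every path before it is handed to ActiveRecord.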