MySQL import error: garbage characters now display in place of UTF-8 characters
We restored from a backup in a different format to a new MySQL structure (which is set up correctly for UTF-8 support). We now have weird characters showing in the browser, but we're not sure what they're called, and we need that name to find a master list of what they translate to.
I have noticed that they do, in fact, correlate to a specific character. For example:
â„¢ always translates to ™
â€” always translates to —
â€¢ always translates to ·
I referenced this post, which got me started, but this is far from a complete list. Either I'm not searching for the correct name, or the "master list" of these bad-to-good conversions as a reference doesn't exist.
Reference:
Detecting utf8 broken characters in MySQL
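The three pairs above fit one pattern: the bad sequence is what the good character's UTF-8 bytes look like when read as latin1 (which in MySQL corresponds closely to Windows-1252). If that holds for the rest of the data, the mapping can be generated on demand instead of looked up. The following is a minimal sketch, assuming a utf8mb4 connection and MySQL 5.5.3 or later; the sample characters are only illustrations.

```sql
-- Assumption: the client connection uses utf8mb4.
SET NAMES utf8mb4;

-- Good -> bad: keep the UTF-8 bytes of each character and relabel them as latin1.
SELECT good,
       CONVERT(CAST(good AS BINARY) USING latin1) AS bad
FROM (
    SELECT '™' AS good
    UNION ALL SELECT '—'
    UNION ALL SELECT '•'
) AS samples;

-- Bad -> good: encode the garbled text as latin1, then relabel those bytes as UTF-8.
SELECT CONVERT(CAST(CONVERT('â„¢' USING latin1) AS BINARY) USING utf8mb4) AS repaired;
```

If the second query returns ™ for the garbled input, the same expression can be pointed at a real column to preview repairs before changing any data.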
Also, when I search via a MySQL query for â, MySQL always treats it as an "a". Is there any way to tweak my MySQL queries so that they search more literally? We don't use internationalization much, so I can safely assume that any field containing the â character is a problematic entry that needs to be fixed by the "fixit" script we're building.
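The "â matches a" behavior usually comes from an accent-insensitive collation such as utf8_general_ci. One possible way to force an exact match is to compare with a binary collation, sketched below. The table `articles` and column `body` are hypothetical names, and the first query assumes both the column and the connection use utf8mb4.

```sql
-- Hypothetical table `articles` with a utf8mb4 text column `body`.
-- A _bin collation compares code points exactly, so 'â' no longer matches 'a'.
SELECT id, body
FROM articles
WHERE body LIKE '%â%' COLLATE utf8mb4_bin;

-- Alternative: compare the raw bytes instead of naming a collation.
-- This assumes the column and the connection share the same character set.
SELECT id, body
FROM articles
WHERE body LIKE BINARY '%â%';
```

If the column uses the older utf8 character set, utf8_bin would take the place of utf8mb4_bin. Recent MySQL 8.0 releases deprecate the BINARY operator in favor of CAST(... AS BINARY).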
Comments (1)
Instead of designing a "fixit" script to go through and replace this data, I think it would be better to fix the issue directly. It seems the data was originally stored in an encoding other than UTF-8, so the text was garbled when you brought it into a table set up for UTF-8. If you have the opportunity, go back to your original backup to determine which encoding the data was stored in. If you can't do that, you will probably need some trial and error to figure it out. Once you know that, however, conversion is easy. Read the Repairing section of the following article: http://www.istognosis.com/en/mysql/35-garbled-data-set-utf8-characters-to-mysql-
Basically, you set the column to BINARY and then set it to the original charset. That should make the text display properly (a good check that you are using the correct charset). Once that is done, set the column to UTF-8. This converts the data properly and corrects the problems you are currently experiencing.
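The three steps described above could look roughly like the following. This is a sketch under assumptions, not a tested recipe: the table `articles`, its TEXT column `body`, and latin1 as the original charset are all hypothetical, and utf8mb4 stands in for UTF-8. It works on a copy so the check in the middle can fail safely.

```sql
-- Hypothetical names throughout; run against a copy, never the live table.
CREATE TABLE articles_fix LIKE articles;
INSERT INTO articles_fix SELECT * FROM articles;

-- 1. Switch to a binary type: MySQL keeps the bytes and drops the charset label.
ALTER TABLE articles_fix MODIFY body BLOB;

-- 2. Relabel the same bytes with the guessed original charset (latin1 is an assumption).
ALTER TABLE articles_fix MODIFY body TEXT CHARACTER SET latin1;

-- Check: if the guess was right, the text should read correctly here.
SELECT id, body FROM articles_fix LIMIT 20;

-- 3. Text-to-text change: this time MySQL transcodes the data into UTF-8.
ALTER TABLE articles_fix MODIFY body TEXT CHARACTER SET utf8mb4;
```

MODIFY redefines the whole column, so attributes such as NOT NULL or a default have to be restated in each statement if the real column has them. If the check after step 2 still shows garbage, the guessed charset or the order of the steps probably does not match how the data was damaged, and the copy can be dropped and rebuilt for another attempt.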