Iterating over a MySQL DB and converting strings to a different encoding
I've received from a friend a DB backup of a WordPress blog that had been given to her. Whoever made the backup evidently did something wrong: all the accented characters in the blog posts are badly encoded and show up as "é" or "À".
Now, I thought about a "simple" script that would loop through the DB, look for given strings of badly encoded characters and convert them to what they should be. But I guess that's not really the best way to do it. I know that there are character encoding functions in PHP, but I'm not at all versed in these, as I don't really understand the mechanics of character encoding.
Can anyone help me with this?
Comments (2)
This is normal if you look at the dump file in a text editor that uses a single-byte encoding: each multibyte UTF-8 character will show up as a pair of single-byte characters, like those you show.
You should be able to specify the dump's character set when importing (e.g. using the appropriate drop-down in phpMyAdmin). Set the character set to UTF-8 and it should import properly.
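To see why a correct UTF-8 dump can look broken in an editor, here is a small Python sketch. It assumes the editor reads the file as Windows-1252 (an assumption, though it matches the "À" sample in the question); the helper name `view_as_single_byte` is made up for illustration.

```python
# Illustrative only: how correctly stored UTF-8 text looks when its bytes
# are read with a single-byte encoding (Windows-1252 is assumed here).
def view_as_single_byte(text: str, viewer_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then misread those bytes with viewer_encoding."""
    return text.encode("utf-8").decode(viewer_encoding)


if __name__ == "__main__":
    # Each accented letter is two bytes in UTF-8, so it appears as two characters.
    for ch in ("\u00e9", "\u00c0"):  # é and À
        print(ch, "->", view_as_single_byte(ch))
```

If the output matches what you see in the dump, the file itself is probably fine and only the import character set needs to be set correctly.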
You may have double-encoded UTF-8. Try re-dumping the database as latin1 and then re-importing it as UTF-8.
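If the data really is double-encoded, the same round trip can be reproduced on a single string. The following is a minimal Python sketch rather than a full migration script: `repair_double_encoded` is a hypothetical helper, and it assumes the wrong intermediate encoding was Windows-1252, which is what MySQL's `latin1` is based on.

```python
# A sketch, assuming the text was UTF-8 that got misread as Windows-1252
# and then stored again as UTF-8 ("double encoding").
def repair_double_encoded(text: str, wrong_encoding: str = "cp1252") -> str:
    """Undo one round of double encoding; return text unchanged if it does not apply."""
    try:
        # Recover the original UTF-8 bytes, then decode them properly.
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in wrong_encoding, or not valid UTF-8 afterwards:
        # the string was most likely not double-encoded, so leave it alone.
        return text


if __name__ == "__main__":
    print(repair_double_encoded("\u00c3\u00a9"))  # the mojibake form of é
```

Strings that were never double-encoded usually fail one of the two conversion steps and come back untouched, but that is not guaranteed, so test on a copy of the database before changing real data.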