What does WordPress do to its content encoding in its MySQL database?

Posted on 2024-08-27 02:09:03

For some convoluted reasons best left behind us, I need direct access to the contents of a WordPress database. For the record, I'm using MySQL 5.0.70-r1 on Gentoo with WordPress 2.6 and Perl 5.8.8.

Sometimes we get high-order characters in the blog, and we have quite a few authors contributing too. For the most part these characters end up in WordPress's database in wp_posts.post_content or wp_postmeta.meta_value. WordPress displays them correctly on its site, but the database stores them in what looks like a single-byte encoding, and I can't figure out how to convert that to the correct string. Today's example:

The blog shows this, and doesn't even seem to escape any chars in the HTML,

   Hãhãhães  

but the database, when viewed via the MySQL prompt, has,

   HÃ£hÃ£hÃ£es

So clearly this is some kind of double-byte encoding issue, but I don't know how I can correct it. I need to be able to pull that second string from the database (because that's what it gives me) and convert it to the first one, and I need to do so using Perl.

Also, just to help unmuddy the waters, I took these strings and printed out the character code for each character using Perl's ord() function.

Here is the output of the "wrong" string:

H = 72
Ã = 195
£ = 163
h = 104
Ã = 195
£ = 163
h = 104
Ã = 195
£ = 163
e = 101
s = 115

This is the correct string, which I need to produce in my script:

H = 72
ã = 227
h = 104
ã = 227
h = 104
ã = 227
e = 101
s = 115
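
The two listings above differ in a telling way: the byte pair 195, 163 (0xC3 0xA3) is the UTF-8 encoding of the single character 227 (U+00E3, ã). That suggests the database is handing back UTF-8 bytes that Perl treats as one character per byte. A minimal sketch of the conversion, assuming the fetched value really is a UTF-8 byte string; bytes_to_chars is a hypothetical helper name, and Encode is a core Perl module:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn a raw byte string fetched from the database
# into a Perl character string, assuming the bytes are UTF-8.
sub bytes_to_chars {
    my ($bytes) = @_;
    return decode('UTF-8', $bytes);
}

# The "wrong" string from the question, rebuilt from its byte values.
my $raw = join '', map { chr } (72, 195, 163, 104, 195, 163, 104, 195, 163, 101, 115);
my $fixed = bytes_to_chars($raw);

# Print one code per character, the same way the question does with ord().
# The output layer assumes a UTF-8 terminal.
binmode(STDOUT, ':encoding(UTF-8)');
printf "%s = %d\n", $_, ord($_) for split //, $fixed;
```

Decoding by hand like this is only a fallback for when the driver cannot be told to do it; it would mangle any value that is already a decoded character string.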


Comments (2)

思慕 2024-09-03 02:09:03

How about SET NAMES 'utf8'?

怂人 2024-09-03 02:09:03

I fixed it... Thanks to those who read and/or tried.

my $dbh = DBI->connect('mysql:etc:etc');
$dbh->{mysql_enable_utf8} = 1;  # <---- solution: have DBD::mysql return strings decoded as UTF-8

That's all, sigh...

I'm not sure about the MySQL prompt, because I don't really care about it, but I expect a similar fix applies there to make MySQL return multi-byte results to its prompt. As noted in my earlier comment, though, setting the "character_set_*" variables didn't seem to affect it.
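
For readers who prefer to set the flag at connect time, the same attribute can be passed in the attribute hash of DBI->connect. This is a sketch under assumptions, not the poster's exact code: utf8_connect_attrs and connect_utf8 are hypothetical helper names, RaiseError is an extra choice that the answer does not mention, and the DSN, user and password are placeholders supplied by the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: attributes to hand to DBI->connect.
# mysql_enable_utf8 is the DBD::mysql attribute used in the answer above;
# RaiseError is an added assumption so that failures die instead of passing silently.
sub utf8_connect_attrs {
    return { mysql_enable_utf8 => 1, RaiseError => 1 };
}

# Hypothetical wrapper around DBI->connect; all three arguments are placeholders.
sub connect_utf8 {
    my ($dsn, $user, $password) = @_;
    require DBI;    # loaded lazily so this file compiles without a database driver
    return DBI->connect($dsn, $user, $password, utf8_connect_attrs());
}
```

A call might look like connect_utf8('dbi:mysql:database=wordpress', 'user', 'secret'), where the database name and credentials are made up for illustration. Whether the flag is set at connect time or afterwards, it should not be combined with manual decoding of the same values.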
