MySQL encoding errors -> how can I convert it back to something else?

Posted on 2024-11-05 20:32:44

Some time ago I started a website using the wrong charset in my DB and site. The HTML was set to ISO..., the DB to Latin..., and the pages were saved in Western Latin... a big mess.

The site is in French, so I created a function that replaced all accented characters like "é" with HTML entities like "&eacute;". That solved the issue temporarily.

I have since learned a lot more about programming, and now my files are saved as Unicode UTF-8, the HTML is in UTF-8 and my MySQL table columns are set to ut8_encoding...

I tried to change the accents back to "é" instead of "&eacute;", but I get the usual charset issues, with (?) or weird characters like "â", both in MySQL and when the page is displayed.

I need to find a way to update my SQL data through a function that cleans the strings so that everything can finally go back to normal. At the moment my function looks like this, but it doesn't work:

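function stripAcc3($value){

    $ent = array(
        '&agrave;' => 'à',
        '&acirc;'  => 'â',
        '&ugrave;' => 'ù',
        '&ucirc;'  => 'û',
        '&eacute;' => 'é',
        '&egrave;' => 'è',
        '&ecirc;'  => 'ê',
        '&ccedil;' => 'ç',
        '&Ccedil;' => 'Ç',
        "&icirc;"  => 'î',
        "&Iuml;"   => 'ï',
        "&ouml;"   => 'ö',
        "&ocirc;"  => 'ô',
        "&euml;"   => 'ë',
        "&uuml;"   => 'ü',
        "&Auml;"   => 'ä',
        "&euro;"   => '€',
        "&prime;"  => "'",
        "&eacute;" => "é"
    );

    return strtr($value, $ent);
}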

Any help welcome. Thanks in advance. If you need code, please tell me which part.

UPDATE

If you want the bounty points, I need detailed instructions on how to do it. Thanks.

Comments (6)

身边 2024-11-12 20:32:44

Try using the following function instead; it should handle all the issues you described:

function makeStringUTF8($data)
{
    if (is_string($data) === true)
    {
        // has html entities?
        if (strpos($data, '&') !== false)
        {
            // if so, revert back to normal
            $data = html_entity_decode($data, ENT_QUOTES, 'UTF-8');
        }

        // make sure it's UTF-8
        if (function_exists('iconv') === true)
        {
            return @iconv('UTF-8', 'UTF-8//IGNORE', $data);
        }

        else if (function_exists('mb_convert_encoding') === true)
        {
            return mb_convert_encoding($data, 'UTF-8', 'UTF-8');
        }

        return utf8_encode(utf8_decode($data));
    }

    else if (is_array($data) === true)
    {
        $result = array();

        foreach ($data as $key => $value)
        {
            $result[makeStringUTF8($key)] = makeStringUTF8($value);
        }

        return $result;
    }

    return $data;
}

Regarding specific instructions on how to use this, I suggest the following:

  1. Export the contents of your old Latin database (I hope you still have it) as an SQL/CSV dump *
  2. Run the above function on the file contents and save the result to another file
  3. Import the file you generated in the previous step into the UTF-8 aware schema / database

* Example:

file_put_contents('utf8.sql', makeStringUTF8(file_get_contents('latin.sql')));

This should do it; if it doesn't, let me know.

故事未完 2024-11-12 20:32:44

You might want to look at what is used to fix WP database encoding issues:

http://codex.wordpress.org/Converting_Database_Character_Sets

To cut a long story short, most old WP sites were created with tables using the Swedish/Latin1 collation, which were nevertheless used to store UTF8 strings. To fix the tables properly, the approach is to change each column to a binary type and then change it to UTF8 text.

This prevents the text from being mangled, as it would be when converting from Latin1 to UTF8 directly.
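
A minimal sketch of that binary round trip, written as a PHP helper that only builds the two ALTER TABLE statements. The helper name, the table `articles`, the column `body`, the BLOB/TEXT type pair and the `utf8mb4` target are all assumptions for illustration; a VARCHAR column would go through VARBINARY of the same length instead. It only helps when the bytes stored in the Latin1 column are already UTF-8, and MODIFY resets column attributes such as NOT NULL unless you restate them, so try it on a backup copy first.

```php
<?php
// Build the two statements for the latin1 -> binary -> UTF-8 round trip.
// Hypothetical helper: identifiers are assumed to be trusted, not user input.
function buildBinaryRoundTrip(string $table, string $column): array
{
    // Quote identifiers with backticks, doubling any backtick inside them.
    $t = '`' . str_replace('`', '``', $table) . '`';
    $c = '`' . str_replace('`', '``', $column) . '`';

    return array(
        // Step 1: drop the character set so MySQL keeps the raw bytes untouched.
        "ALTER TABLE $t MODIFY $c BLOB",
        // Step 2: relabel those same bytes as UTF-8 text.
        "ALTER TABLE $t MODIFY $c TEXT CHARACTER SET utf8mb4",
    );
}

foreach (buildBinaryRoundTrip('articles', 'body') as $sql) {
    echo $sql, ";\n";
}
```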

女皇必胜 2024-11-12 20:32:44

You will need to convert the offending rows using, for example, iconv. The challenge will be knowing which rows are already UTF-8 and which are Latin-1.
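
One possible way to tell the two apart is to check whether a value is valid UTF-8 and convert only when it is not. This is a heuristic sketch, not a guaranteed detector: the function name `toUtf8` is made up, it needs the mbstring extension, and it assumes every non-UTF-8 row is ISO-8859-1. A Latin-1 string whose bytes happen to form valid UTF-8 would be left unchanged.

```php
<?php
// Return $value as UTF-8: valid UTF-8 is kept, anything else is assumed Latin-1.
function toUtf8(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8 (plain ASCII also lands here)
    }

    // Assumption: every row that is not valid UTF-8 is ISO-8859-1.
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "caf\xE9";      // "café" with é as the single Latin-1 byte 0xE9
$utf8   = "caf\xC3\xA9";  // "café" with é as the two UTF-8 bytes 0xC3 0xA9

var_dump(toUtf8($latin1) === $utf8); // the Latin-1 row gets converted
var_dump(toUtf8($utf8) === $utf8);   // the UTF-8 row is left alone
```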

郁金香雨 2024-11-12 20:32:44

I'm not completely sure I understand your question, but if you have

  • a UTF-8 database

  • all special characters in there stored as HTML entities

then a

html_entity_decode($string, ENT_QUOTES, "UTF-8");

should do the trick and turn all entities back into their UTF-8 native characters.
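
As a small illustration of that call, here is a sketch on a made-up sample string (the text in `$stored` is an assumption, not data from the question). It mixes named and numeric entities to show that both are decoded.

```php
<?php
// Decode named and numeric HTML entities into UTF-8 characters.
$stored  = 'Un &eacute;t&eacute; fran&ccedil;ais &#224; 5 &euro;';
$decoded = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

echo $decoded, "\n"; // prints: Un été français à 5 €
```

With ENT_QUOTES, `&quot;` and `&#039;` are also turned back into quote characters, which matters if the decoded text is later placed inside an SQL string.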

残疾 2024-11-12 20:32:44

Make sure that not just your tables but also your database connection uses UTF-8.

$this->db = mysql_connect(MYSQL_SERVER, DB_LOGIN, DB_PASS);
mysql_set_charset('utf8', $this->getConnection());
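
The mysql_* functions shown above were removed in PHP 7. A rough equivalent with PDO might look like the sketch below, where the `charset` part of the DSN sets the connection character set at connect time. The function names, host, database name and credentials are placeholders, and `utf8mb4` is assumed as the target character set.

```php
<?php
// Build a PDO MySQL DSN that asks for a UTF-8 connection.
function buildUtf8Dsn(string $host, string $dbName): string
{
    return "mysql:host={$host};dbname={$dbName};charset=utf8mb4";
}

// Open the connection; the credentials are placeholders, so this is not called here.
function connectUtf8(string $host, string $dbName, string $user, string $pass): PDO
{
    return new PDO(buildUtf8Dsn($host, $dbName), $user, $pass, array(
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ));
}

echo buildUtf8Dsn('localhost', 'mysite'), "\n";
```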

长梦不多时 2024-11-12 20:32:44

If you want to talk to your database in UTF-8, you have to tell the database that the connection is a UTF-8 stream. To do that, send the following statement right after connecting, before any other query:

"SET NAMES utf8";

Personally, I put it in the connect.inc.php file that creates the connection to the database. With this statement the database knows that you are sending UTF-8 encoded strings, and it works perfectly!

In my experience the mysql_set_charset function does not work well; I tried it in the past and it did not do the trick.

As for your full problem: if you want to convert latin1 strings to UTF-8, you first have to convert the latin1 strings to a binary string format, then convert the binary strings to UTF-8 strings. All of this can be done inside the database with database commands. See this article (in French): http://www.noidea.ca/2009/06/15/comment-convertir-une-db-de-latin1-a-utf8/

I can tell you that this method works because I used it to convert the data in a database I created.
