What you're running into is the result of the data being written in one encoding, and interpreted as being another. You need to make sure that you're requesting input to be in the same format that you're expecting it to be in. I recommend just sticking with UTF-8 the whole way through unless you need to avoid multibyte characters, in which case you might want to look at forcing ASCII.
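To make the mismatch concrete, here is a small sketch of what happens when UTF-8 bytes are read as if they were ISO-8859-1. It assumes the mbstring extension is loaded, and the variable names are just for illustration.

```php
<?php
// "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
$utf8 = "\xC3\xA9";

// Simulate a reader that wrongly assumes ISO-8859-1:
// each byte is treated as its own character and re-encoded as UTF-8.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";    // c3a9
echo bin2hex($misread), "\n"; // c383c2a9, which displays as "Ã©"

// Undoing the wrong step recovers the original bytes.
$repaired = mb_convert_encoding($misread, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $utf8); // bool(true)
```

The "Ã©" style garbage is the usual sign that one layer wrote UTF-8 and another read it as a single-byte encoding.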
Make sure you're telling PHP to use UTF-8 internally:
ini_set('default_charset', 'UTF-8');
And make sure that you are telling the browser to expect UTF-8 encoded text, both in headers…
header("Content-Type:text/html; charset=UTF-8");
…and in your meta tags (html5 below)…
<meta charset="utf-8">
Setting this will tell the browser to send UTF-8 encoded content to you when a form is submitted, and it'll interpret the results you send back as UTF-8 as well.
You should also make sure both your database storage and connection encoding are in UTF-8 as well. Usually as long as it is just a dumb data store (i.e. it won't be manipulating or interpreting the data in any way) it doesn't matter, but it's better to have it all right than run into problems with it later.
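If I may add to the points above: if you are saving the data to database tables, the tables and columns (and probably the database itself too) should use a UTF-8 character set with a collation such as utf8_general_ci so that they can handle multibyte characters. Note that in MySQL the utf8 character set stores at most three bytes per character, so four-byte characters such as emoji need utf8mb4 (for example utf8mb4_general_ci).
I also issue the query SET NAMES 'utf8' on the connection before running any other query.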
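If you connect through PDO, the connection charset can also go in the DSN, which does the same job as issuing SET NAMES yourself. The sketch below only builds the DSN string so it runs without a server. The helper name buildMysqlDsn, the host, the database name and the credentials in the comment are all made up for illustration, and utf8mb4 is my choice of default; pass 'utf8' if you want to match the query above.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that sets the connection charset.
function buildMysqlDsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = buildMysqlDsn('localhost', 'example_db');
echo $dsn, "\n"; // mysql:host=localhost;dbname=example_db;charset=utf8mb4

// With a real server you would then connect, for example:
// $pdo = new PDO($dsn, 'example_user', 'example_password');
```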
The iconv function is generally able to deal with this sort of encoding issue.
See this thread: PHP: regular expression to remove `â` or `â€`?
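As a rough illustration of the iconv approach, the sketch below converts a single ISO-8859-1 byte to UTF-8. The source encoding here is an assumption; you need to know (or find out) what your data is really encoded in.

```php
<?php
// "é" in ISO-8859-1 is the single byte 0xE9.
$latin1 = "\xE9";

// iconv(from, to, string) returns the converted string, or false on failure.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

if ($utf8 === false) {
    echo "conversion failed\n";
} else {
    echo bin2hex($utf8), "\n"; // c3a9
}
```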
What are your PHP settings?
You can configure PHP to encode the strings it outputs. In most cases UTF-8 is recommended, and you must also have a Content-Type meta tag in your HTML page.
Looks like the right solution is mb_convert_encoding():
string mb_convert_encoding ( string $str , string $to_encoding [, mixed $from_encoding ] )
It converts the character encoding of the string str to to_encoding, optionally from from_encoding.
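One way to use it defensively is to convert only when the input is not already valid UTF-8. This is a sketch, not a drop-in fix: the helper name toUtf8 is made up, the Windows-1252 fallback is an assumption about where the bad data came from, and it needs the mbstring extension.

```php
<?php
// Hypothetical helper: return $text as UTF-8, converting only when it is not
// already valid UTF-8. The fallback source encoding is an assumption; pick
// the one your data really uses.
function toUtf8(string $text, string $assumedSource = 'Windows-1252'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

echo bin2hex(toUtf8("\xC3\xA9")), "\n";    // c3a9 (valid UTF-8, unchanged)
echo bin2hex(toUtf8("\x93hi\x94")), "\n";  // e2809c6869e2809d (curly quotes converted)
```

Checking first matters because converting a string that is already UTF-8 a second time is exactly what produces the `â€` style garbage.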