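json_encode() non utf-8 strings?
So I have an array of strings, all of which use the system default ANSI encoding and were pulled from a SQL database. There are therefore 256 possible character byte values (single-byte encoding).
Is there a way to get json_encode() to work and display these characters, without having to call utf8_encode() on every string and ending up with things like "\u0082"?
Or is that just the standard for JSON?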
Comments (6)
If you have an ANSI-encoded string, utf8_encode() is the wrong function for the job. You need to properly convert the string from ANSI to UTF-8 first. That will certainly reduce the number of Unicode escape sequences such as \u0082 in the JSON output, but technically these sequences are valid JSON, so there is no need to fear them.

Converting ANSI to UTF-8 with PHP

json_encode works with UTF-8 encoded strings only. To create valid JSON from an ANSI-encoded string, you need to re-encode/convert it to UTF-8 first. Then json_encode will work as documented.

To convert from ANSI (more correctly, I assume you have a Windows-1252 encoded string, which is popular but wrongly referred to as ANSI) to UTF-8, you can use the mb_convert_encoding() function. Another PHP function that can convert the encoding/charset of a string is iconv, which is based on libiconv; you can use it as well.

Note on utf8_encode()

utf8_encode() only works for Latin-1, not for ANSI. So you will destroy some of the characters in that string if you run it through that function.

Related: What is ANSI format?
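The conversion described above can be sketched as follows. This is a minimal illustration, assuming the input really is Windows-1252 and that the mbstring and iconv extensions are available; the sample bytes and variable names are made up for the example. The last conversion deliberately treats byte 0x82 as Latin-1 to show where an escape like \u0082 comes from.

```php
<?php
// Assumption: the bytes are Windows-1252, which is what "ANSI" usually means here.
// 0xE9 = e-acute, 0x82 = single low-9 quotation mark, 0x80 = euro sign.
$ansi = "caf\xE9 \x82 \x80";

// Option 1: mbstring.
$utf8 = mb_convert_encoding($ansi, 'UTF-8', 'Windows-1252');

// Option 2: iconv (argument order is from-charset, to-charset, string).
$utf8ViaIconv = iconv('Windows-1252', 'UTF-8', $ansi);

// Treating the same 0x82 byte as Latin-1 maps it to the control character U+0082.
$asLatin1 = mb_convert_encoding("\x82", 'UTF-8', 'ISO-8859-1');

echo json_encode($utf8), "\n";     // "caf\u00e9 \u201a \u20ac"
echo json_encode($asLatin1), "\n"; // "\u0082"
```

Both options should yield the same UTF-8 bytes; which one to use mostly depends on which extension is installed.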
For more fine-grained control over what json_encode() returns, see the list of predefined constants (PHP version dependent, including PHP 5.4; some constants remain undocumented and are so far available in the source code only).

Changing the encoding of an array/iteratively (PDO comment)

As you wrote in a comment that you have trouble applying the function to an array, here is a code example. The encoding always has to be changed before calling json_encode. That is just a standard array operation; for the simpler case of pdo::fetch(), it is a foreach iteration:
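A minimal sketch of such a loop, not necessarily the code the answer originally showed. It assumes Windows-1252 input and the mbstring extension; convertRowToUtf8() is a hypothetical helper, and $fetchedRows is a stand-in for rows that would normally come from repeated fetch() calls with PDO::FETCH_ASSOC.

```php
<?php
// Hypothetical helper: convert every string column of one fetched row to UTF-8.
function convertRowToUtf8(array $row, string $from = 'Windows-1252'): array
{
    foreach ($row as $key => $value) {
        if (is_string($value)) {
            $row[$key] = mb_convert_encoding($value, 'UTF-8', $from);
        }
    }
    return $row;
}

// Stand-in for database rows; the names are Windows-1252 encoded.
$fetchedRows = [
    ['id' => 1, 'name' => "Andr\xE9"],
    ['id' => 2, 'name' => "Z\xFCrich"],
];

$rows = [];
foreach ($fetchedRows as $row) {
    // Change the encoding first, then collect the row for json_encode().
    $rows[] = convertRowToUtf8($row);
}

echo json_encode($rows, JSON_UNESCAPED_UNICODE), "\n";
```

With a real statement, the foreach over $fetchedRows would be replaced by a while loop over fetch().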
The JSON standard enforces Unicode encoding. From RFC 4627 (section 3, Encoding): "JSON text SHALL be encoded in Unicode. The default encoding is UTF-8."
Therefore, in the strictest sense, ANSI-encoded JSON would not be valid JSON; this is why PHP enforces Unicode encoding when using json_encode().
As for "default ANSI", I'm pretty sure that your strings are encoded in Windows-1252, which is incorrectly referred to as ANSI.
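To see that enforcement in practice, here is a small sketch, assuming a reasonably recent PHP (7.x or later); older versions behaved differently on invalid input. The sample byte is illustrative only.

```php
<?php
// A lone Windows-1252 byte; on its own it is not a valid UTF-8 sequence.
$ansi = "\x82";

$result = json_encode($ansi);

var_dump($result);                                // bool(false)
var_dump(json_last_error() === JSON_ERROR_UTF8);  // bool(true)
echo json_last_error_msg(), "\n";
```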
JSON_UNESCAPED_UNICODE (integer)
Encode multibyte Unicode characters literally (default is to escape as \uXXXX). Available since PHP 5.4.0.
http://php.net/manual/en/json.constants.php#constant.json-unescaped-unicode
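A short sketch of the difference the flag makes, assuming the string is already valid UTF-8 (the flag does not convert encodings):

```php
<?php
// "Zürich" written as UTF-8 bytes so the example does not depend on the file encoding.
$text = "Z\xC3\xBCrich";

$escaped = json_encode($text);                          // "Z\u00fcrich"
$literal = json_encode($text, JSON_UNESCAPED_UNICODE);  // the u-umlaut is kept as UTF-8 bytes

echo $escaped, "\n", $literal, "\n";
```

Both outputs are valid JSON and decode to the same string; the flag only changes how the character is written.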
I found the following answer to an analogous problem, where I had to JSON-encode a nested array that was not UTF-8 encoded:
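One way this could look, as a hedged sketch rather than the exact code from that answer: it assumes Windows-1252 input and the mbstring extension, and uses array_walk_recursive() to visit every leaf value. The sample data is invented for the example, and array keys are left unconverted.

```php
<?php
// Nested array with Windows-1252 encoded strings at several depths.
$data = [
    'city'  => "Z\xFCrich",
    'tags'  => ["caf\xE9", ['deep' => "\x80"]],
    'count' => 3,
];

// Convert every string leaf in place; non-strings are left alone.
array_walk_recursive($data, function (&$value) {
    if (is_string($value)) {
        $value = mb_convert_encoding($value, 'UTF-8', 'Windows-1252');
    }
});

echo json_encode($data, JSON_UNESCAPED_UNICODE), "\n";
```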
That converts Windows-based ANSI to UTF-8, and the error goes away.
Use this instead:
Copied from the comments on the json_encode page of the PHP manual. Always read the comments; they are useful.