jQuery AJAX call messes up character encoding

Posted on 2024-09-08 17:19:34

I have a servlet that outputs JSON. The output encoding for the servlet is ISO-8859-1. Pages in our webapp are also set to ISO-8859-1. I would use UTF-8, but this is outside my control; we have to use ISO-8859-1.

When I hit the servlet by itself, I can see JSON data that has been outputted. The character encoding is correct, and none of the characters look strange.

However, when I call the servlet via AJAX and use the data retrieved to populate a select box, I get � in place of (it seems) every accented character (for example, i with a grave or acute accent, a dieresis, or a circumflex). When I look at the response in the Net tab under Firebug, I can see that the text looks fine. However, when I use that data to populate the select box, I get the diamond with a question mark.

These characters are all valid ISO-8859-1 characters, and so I don't understand why they don't show up correctly.

EDIT

Some more information. I use GET in jQuery.ajax and I've set scriptCharset to ISO-8859-1. On the server-side, I've explicitly set the encoding to ISO-8859-1 using request.setCharacterEncoding("ISO-8859-1");

EDIT

Code samples:

This is what I have currently. I added scriptCharset: "ISO-8859-1" to no effect.

        jQuery.ajax({
            url: "/countryAndProvinceCodeServlet",
            data: data,
            dataType: "json",
            type: "GET",
            success: function(data) {
               ...
            },
        });

My servlet uses org.json.JSONObject and simply outputs the string by doing response.getWriter().print(jsonObject.toString());

UPDATE

Per the comments about JSON and how it should be UTF-8, I tried to see if I could grab the data as text (by setting dataType to text in jQuery.ajax) and then evaluate it as JSON myself (in JavaScript). That doesn't seem to work either! When I do console.log, I still get the funky diamonds. However, when I look at it under the Net tab in Firebug, everything shows up fine:

Net tab:

{"error":false,
 "provinces":{"DZ-01":"Adrar",
              "DZ-16":"Alger",
              "DZ-23":"Annaba",
              "DZ-44":"Aïn Defla",
              "DZ-46":"Aïn Témouchent",
              "DZ-05":"Batna",
              "DZ-07":"Biskra",
              "DZ-09":"Blida",
              "DZ-34":"Bordj Bou Arréridj",
              "DZ-10":"Bouira",
              "DZ-35":"Boumerdès",
              "DZ-08":"Béchar",
              "DZ-06":"Béjaïa",
              "DZ-02":"Chlef",
              "DZ-25":"Constantine",
              "DZ-17":"Djelfa",
              "DZ-32":"El Bayadh",
              "DZ-39":"El Oued",
              "DZ-36":"El Tarf",
              "DZ-47":"Ghardaïa",
              "DZ-24":"Guelma",
              "DZ-33":"Illizi",
              "DZ-18":"Jijel",
              "DZ-40":"Khenchela",
              "DZ-03":"Laghouat",
              "DZ-29":"Mascara",
              "DZ-43":"Mila",
              "DZ-27":"Mostaganem",
              "DZ-28":"Msila",
              "DZ-26":"Médéa",
              "DZ-45":"Naama",
              "DZ-31":"Oran",
              "DZ-30":"Ouargla",
              "DZ-04":"Oum el Bouaghi",
              "DZ-48":"Relizane",
              "DZ-20":"Saïda",
              "DZ-22":"Sidi Bel Abbès",
              "DZ-21":"Skikda",
              "DZ-41":"Souk Ahras",
              "DZ-19":"Sétif",
              "DZ-11":"Tamanghasset",
              "DZ-14":"Tiaret",
              "DZ-37":"Tindouf",
              "DZ-42":"Tipaza",
              "DZ-38":"Tissemsilt",
              "DZ-15":"Tizi Ouzou",
              "DZ-13":"Tlemcen",
              "DZ-12":"Tébessa"}}

But when I do console.log(text) with what I get from jQuery.ajax, I get the following:

{"error":false,
 "provinces":{"DZ-01":"Adrar",
              "DZ-16":"Alger",
              "DZ-23":"Annaba",
              "DZ-44":"A�n Defla",
              "DZ-46":"A�n T�mouchent",
              "DZ-05":"Batna",
              "DZ-07":"Biskra",
              "DZ-09":"Blida",
              "DZ-34":"Bordj Bou Arr�ridj",
              "DZ-10":"Bouira",
              "DZ-35":"Boumerd�s",
              "DZ-08":"B�char",
              "DZ-06":"B�ja�a",
              "DZ-02":"Chlef",
              "DZ-25":"Constantine",
              "DZ-17":"Djelfa",
              "DZ-32":"El Bayadh",
              "DZ-39":"El Oued",
              "DZ-36":"El Tarf",
              "DZ-47":"Gharda�a",
              "DZ-24":"Guelma",
              "DZ-33":"Illizi",
              "DZ-18":"Jijel",
              "DZ-40":"Khenchela",
              "DZ-03":"Laghouat",
              "DZ-29":"Mascara",
              "DZ-43":"Mila",
              "DZ-27":"Mostaganem",
              "DZ-28":"Msila",
              "DZ-26":"M�d�a",
              "DZ-45":"Naama",
              "DZ-31":"Oran",
              "DZ-30":"Ouargla",
              "DZ-04":"Oum el Bouaghi",
              "DZ-48":"Relizane",
              "DZ-20":"Sa�da",
              "DZ-22":"Sidi Bel Abb�s",
              "DZ-21":"Skikda",
              "DZ-41":"Souk Ahras",
              "DZ-19":"S�tif",
              "DZ-11":"Tamanghasset",
              "DZ-14":"Tiaret",
              "DZ-37":"Tindouf",
              "DZ-42":"Tipaza",
              "DZ-38":"Tissemsilt",
              "DZ-15":"Tizi Ouzou",
              "DZ-13":"Tlemcen",
              "DZ-12":"T�bessa"}}

It seems to me that jQuery is doing something weird with the data.

Comments (6)

屋檐 2024-09-15 17:19:34

I finally figured it out. It's pretty weird!

response.setCharacterEncoding(String) does not work on its own (I don't know whether that is related to my setup). It looks like it sets the character encoding, but for some reason jQuery still messes it all up. You have to explicitly set the header like so:

response.setHeader("Content-Type", "application/json; charset=ISO-8859-1");

Thanks for all the help, everyone!

EDIT

I did some research and checked out the JavaDocs and saw this:

Containers must communicate the character encoding used for the servlet response's writer to the client if the protocol provides a way for doing so. In the case of HTTP, the character encoding is communicated as part of the Content-Type header for text media types. Note that the character encoding cannot be communicated via HTTP headers if the servlet does not specify a content type; however, it is still used to encode text written via the servlet response's writer.

So the above still works, but you can also (and probably should) do this:

response.setContentType("application/json");
response.setCharacterEncoding("ISO-8859-1"); 
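
A likely explanation for why the explicit charset matters, sketched in JavaScript: when a response declares no charset, XMLHttpRequest falls back to decoding the text as UTF-8, and a lone ISO-8859-1 byte such as 0xEF (ï) is not valid UTF-8, so it turns into U+FFFD (�). The snippet below only simulates the two decodings with TextDecoder; the sample bytes and the helper name `decodeAs` are illustrative assumptions, not part of the servlet or of jQuery, and the `iso-8859-1` label assumes a browser or a Node.js build with full ICU.

```javascript
// Bytes of "Aïn" as an ISO-8859-1 writer would emit them: A=0x41, ï=0xEF, n=0x6E.
const latin1Bytes = new Uint8Array([0x41, 0xef, 0x6e]);

// Hypothetical helper: decode the same bytes under a given charset label.
function decodeAs(label, bytes) {
  return new TextDecoder(label).decode(bytes);
}

// What a client does when it assumes UTF-8: 0xEF starts a multi-byte
// sequence that never completes, so it is replaced with U+FFFD.
const asUtf8 = decodeAs("utf-8", latin1Bytes);

// What a client does when the header says charset=ISO-8859-1.
const asLatin1 = decodeAs("iso-8859-1", latin1Bytes);

console.log(JSON.stringify(asUtf8));
console.log(JSON.stringify(asLatin1));
```

If this is what happened here, it would also explain why every accented character was affected while plain ASCII names such as "Adrar" came through intact.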
失退 2024-09-15 17:19:34

Can you use UTF-8, instead?

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

In PHP, you can encode JSON data as UTF-8:

/**
 * Applies a UTF-8 encoding conversion for text.
 */
function utf8_enc( $rows ) {
  $encoded = array();

  foreach( $rows as $row ) {
    $temp = array();

    foreach( $row as $name => $value ) {
      // mb_convert_encoding( string, to_encoding, from_encoding ): convert *to* UTF-8
      $temp[ $name ] = mb_convert_encoding( $value, 'UTF-8', 'auto' );
    }

    array_push( $encoded, $temp );
  }

  return $encoded;
}

function db_json( $query ) {
  echo json_encode( utf8_enc( db_fetch_all( db_query( $query ) ) ) );
}

I was seeing some strange results using the ISO-8859-1 accented character set. I switched to UTF-8 and the encoding problems disappeared.

For what it's worth, I have coded getJSON as follows:

  $.getJSON( HOST + 'cat.dhtml', function( data ) {
    var h = '';
    var len = data.length;

    for( var i = 0; i < len; i++ ) {
      h += '<option value="' + data[i].id + '">' + data[i].name + '</option>';
      categories[ data[i].id ] = data[i];
    }

    $('#category').html(h);
  });
枉心 2024-09-15 17:19:34

It seems to me that you get a parsing error because the response data is decoded incorrectly and therefore contains some wrong characters.

You could try adding an extra parameter to jQuery.ajax:

dataFilter : function ( data, type ) {
    alert(data);
    return data;
}

If every non-ASCII character ('ï', 'é' and so on) comes through as a wrong but distinct character, you can replace the wrongly decoded characters with the correct ones and return the corrected data from dataFilter.

不乱于心 2024-09-15 17:19:34

RFC 4627 states that JSON text SHALL be encoded in Unicode, whatever that means, and json.org indicates that all characters are "unicode characters":

  • Encoding

    JSON text SHALL be encoded in Unicode. The default encoding is
    UTF-8.

    Since the first two characters of a JSON text will always be ASCII
    characters [RFC0020], it is possible to determine whether an octet
    stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
    at the pattern of nulls in the first four octets.

       00 00 00 xx  UTF-32BE
       00 xx 00 xx  UTF-16BE
       xx 00 00 00  UTF-32LE
       xx 00 xx 00  UTF-16LE
       xx xx xx xx  UTF-8
    

So if you're transferring JSON and declaring it as ISO-8859-1, different JSON libraries may interpret the SHALL clause of the RFC that defines JSON in various ways, e.g. by substituting the replacement character or by sniffing the encoding. The best approach is obviously to take this to whoever controls the encoding requirement and tell them to fix it :-)
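
The null-pattern table quoted from RFC 4627 can be sketched as a small function. This is only an illustration of the sniffing rule, not what jQuery or any particular JSON library does; the name `detectJsonEncoding` and the fallback to UTF-8 for inputs shorter than four octets are assumptions made for the example.

```javascript
// Guess the Unicode encoding of a JSON octet stream from the pattern of
// zero bytes in its first four octets (RFC 4627, section 3).
function detectJsonEncoding(bytes) {
  // Assumption: with fewer than four octets, fall back to the default.
  if (bytes.length < 4) return "UTF-8";

  const [a, b, c, d] = bytes;
  if (a === 0 && b === 0 && c === 0 && d !== 0) return "UTF-32BE"; // 00 00 00 xx
  if (a === 0 && b !== 0 && c === 0 && d !== 0) return "UTF-16BE"; // 00 xx 00 xx
  if (a !== 0 && b === 0 && c === 0 && d === 0) return "UTF-32LE"; // xx 00 00 00
  if (a !== 0 && b === 0 && c !== 0 && d === 0) return "UTF-16LE"; // xx 00 xx 00
  return "UTF-8";                                                  // xx xx xx xx
}

// '{"' encoded as UTF-16LE, then the ISO-8859-1 bytes of '{"a"'.
const utf16le = Uint8Array.of(0x7b, 0x00, 0x22, 0x00);
const latin1 = Uint8Array.of(0x7b, 0x22, 0x61, 0x22);

console.log(detectJsonEncoding(utf16le));
console.log(detectJsonEncoding(latin1));
```

Note that the table has no row for ISO-8859-1: a sniffer of this kind classifies such a stream as UTF-8, which is consistent with the garbling described in the question.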

Workarounds

One way to work around it is to create a servlet filter that finds every character that is not encoded identically in UTF-8 and ISO-8859-1 (that is, every non-ASCII character) and replaces it with a JSON escape.

In the following fragment, 'é' is replaced with '\u00E9', so that any offending ISO-8859-1 character is transported safely in the 7-bit range where the two encodings are identical:

Before: { "a" : "éte" }

After: { "a" : "\u00E9te" }

It's not as legible, but semantically speaking, it's the same, and any good JSON library should treat them identically.
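
The answer proposes doing this in a servlet filter; the same transformation is sketched here in JavaScript purely to show the idea. The function name `escapeNonAscii` is hypothetical, and the sketch assumes the input is already a valid JSON text, so non-ASCII characters can only occur inside string literals.

```javascript
// Replace every non-ASCII UTF-16 code unit with a \uXXXX JSON escape, so
// the resulting text is pure ASCII and reads the same under ISO-8859-1
// and UTF-8.
function escapeNonAscii(jsonText) {
  return jsonText.replace(/[\u0080-\uffff]/g, function (ch) {
    const hex = ch.charCodeAt(0).toString(16).toUpperCase();
    return "\\u" + hex.padStart(4, "0");
  });
}

const before = '{ "a" : "\u00e9te" }'; // the string value contains a literal é
const after = escapeNonAscii(before);

console.log(after);
console.log(JSON.parse(after).a === JSON.parse(before).a);
```

Because the escape is resolved by the JSON parser rather than by the charset decoder, the parsed value no longer depends on which encoding the client assumed.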

萌面超妹 2024-09-15 17:19:34

The PHP function json_encode does not support ISO-8859-1 encoded data; it expects UTF-8 strings.

This article might help you with your problem: http://www.pabloviquez.com/2009/07/json-iso-8859-1-and-utf-8-%E2%80%93-part2/

心舞飞扬 2024-09-15 17:19:34

If you retrieve the data from a database, put the statements below in the page that receives the AJAX request. For example, if the HTML and AJAX code is in page "A" and it sends a variable from its code to page "B", put these statements in page "B".
Don't forget that your database should use a Unicode collation such as "utf8_general_ci".

mysqli_query($conn, "set character_set_client='utf8'");
mysqli_query($conn, "set character_set_results='utf8'");
// The later collation statement wins; keep only the one you need.
mysqli_query($conn, "set collation_connection='utf8_general_ci'");
mysqli_query($conn, "set collation_connection='utf8_persian_ci'");
// mysqli_set_charset() takes a charset name, not a SET statement.
mysqli_set_charset($conn, "utf8");

I wrote these statements for Persian; you can modify them. $conn is the variable that holds the connection to the MySQL database.
