I have a simple Restlet service hosted on AppEngine. It performs basic CRUD operations on strings and works well with all sorts of UTF-8 characters when I test it with curl (for all the verbs).
It is consumed by a simple Restlet client hosted in a servlet on another AppEngine app:
// set response type
resp.setContentType("application/json");
// Create the client resource
ClientResource resource = new ClientResource(Messages.SERVICE_URL + "myentity/id");
// Customize the referrer property
resource.setReferrerRef("myapp");
// Write the response
resource.get().write(resp.getWriter());
The above is pretty much all I have in the servlet. Very plain.
The servlet is invoked via jQuery Ajax, and the JSON I get back is well formed, but UTF-8 encoded strings come back scrambled. For example:
Université de Montréal
becomes Universit?? de Montr??al.
I tried adding this line in the servlet (before everything else):
resp.setCharacterEncoding("UTF-8");
But the only difference is that instead of getting ?? I get Universitᅢᄅ de Montrᅢᄅal (I don't even know what kind of characters those are, Asian I suppose).
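For what it's worth, here is one plausible way a single accented letter can end up as two question marks. This is only a JDK-level illustration, not a diagnosis of what Restlet or AppEngine actually do. The class and method names are made up for the example. It decodes UTF-8 bytes with the wrong charset, then pushes the result through a charset that cannot represent it.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {

    // Decode UTF-8 bytes with the wrong charset (ISO-8859-1):
    // every byte becomes one char, so a two-byte letter becomes two chars.
    static String misdecodeAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Encode text with a charset that cannot represent it.
    // String.getBytes substitutes '?' for each unmappable character.
    static String squeezeThroughAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file itself encoding-independent.
        String original = "Universit\u00e9 de Montr\u00e9al";
        String misdecoded = misdecodeAsLatin1(original);

        // Two extra chars: one per accented letter.
        System.out.println(misdecoded.length() - original.length()); // 2
        System.out.println(squeezeThroughAscii(misdecoded)); // Universit?? de Montr??al
    }
}
```

If something like this is happening, the bytes coming from the service are fine and the damage occurs when they are turned into characters and written back out.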
I am 100% sure the Restlet service is OK: besides debugging it line by line, I can test it from the command line with curl, and it returns well-formed strings.
By looking at the HTTP headers of the response in Firefox (when calling the servlet via JavaScript), I can see the encoding is indeed UTF-8, as expected. After hours of struggling through every possibly related article, I came across this Restlet discussion and noticed that I do have Transfer-Encoding: chunked in the HTTP headers of the response. I tried the proposed solutions: overriding ClientResource.toRepresentation didn't do any good, so I tried Restlet 2.1 as suggested, with ClientResource.setRequestEntityBuffering(true), and had no luck there either. But I am not convinced my issue is related to Transfer-Encoding: chunked at all.
At this point I am out of ideas, and I would really appreciate any suggestions! O_o
UPDATE:
I tried doing a manual GET with a classic URLConnection, and the string comes back fine:
URL url = new URL(Messages.SERVICE_URL + "myentity/id");
URLConnection conn = url.openConnection();
InputStream is = conn.getInputStream();
StringWriter writer = new StringWriter();
IOUtils.copy(is, writer, "UTF-8");
resp.getWriter().print(writer.toString());
So much for being all RESTful and fancy... but I still have no clue why the original version doesn't work! :/
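The one visible difference in the URLConnection version is that it names UTF-8 explicitly when turning bytes into characters. Below is a JDK-only sketch of that step, without Commons IO, using an in-memory stream in place of conn.getInputStream(). The class and method names are hypothetical, and whether the Restlet client picked a different charset is an assumption, not something confirmed here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ExplicitCharsetRead {

    // Read the whole stream into a String, decoding with the given charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the bytes the service sends over the wire.
        byte[] body = "{\"name\":\"Universit\u00e9 de Montr\u00e9al\"}"
                .getBytes(StandardCharsets.UTF_8);

        String right = readAll(new ByteArrayInputStream(body), StandardCharsets.UTF_8);
        String wrong = readAll(new ByteArrayInputStream(body), StandardCharsets.ISO_8859_1);

        System.out.println(right.contains("Universit\u00e9")); // true
        System.out.println(wrong.contains("Universit\u00e9")); // false
    }
}
```

The same bytes give a correct or a scrambled string depending only on the charset handed to the reader, which would be consistent with the manual GET working while the original version does not.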
Comments (2)
I tried doing a manual GET with a classic URLConnection (see the UPDATE in the question), and the string comes back fine.
So much for being all RESTful and fancy... but I still have no clue why the original version doesn't work! :/
Does your response contain the appropriate "Content-Type" header? It should be something like this (note the charset):
Content-Type: application/json; charset=UTF-8
Try starting your development server, retrieving your resource from the command line using cURL, and inspecting the headers, e.g.:
curl -i http://localhost:8080/myentity/id
In theory browsers should assume UTF-8 for JSON, but I wouldn't rely on that.
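If you would rather check the header in code than by eye, a minimal sketch of pulling the charset parameter out of a Content-Type value could look like this. The class and method names are hypothetical, and the parsing is deliberately naive: it splits on semicolons and does not handle semicolons inside quoted values.

```java
import java.util.Locale;
import java.util.Optional;

final class ContentTypeCharset {

    // Return the charset parameter of a Content-Type value, if present.
    static Optional<String> charsetOf(String contentType) {
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return Optional.of(value.toUpperCase(Locale.ROOT));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("application/json; charset=UTF-8")); // Optional[UTF-8]
        System.out.println(charsetOf("application/json"));                // Optional.empty
    }
}
```

On the servlet side, calling resp.setContentType("application/json; charset=UTF-8") before getWriter() is one way to send the full header.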