Does a servlet know the encoding of a submitted form specified using http-equiv?

Posted 2024-09-01 21:53:10, 519 characters, 6 views, 0 comments

Does a servlet know the encoding of a submitted form when it is specified using http-equiv?

When I specify the encoding of a POSTed form using http-equiv like this:

<HTML>
<head>
<meta http-equiv='Content-Type' content='text/html; charset=gb2312'/>
</head>
<BODY >
<form name="form" method="post" >
    <input type="text" name="v_rcvname" value="相宜本草">
</form>
</BODY>
</HTML>

Then, in the servlet, request.getCharacterEncoding() returns null!
So, is there a way to tell the server which character encoding I am using to encode the data?

Comments (2)

白云不回头 2024-09-08 21:53:10

This will indeed return null for requests from most web browsers. But you can usually safely assume that the browser used the encoding specified in the original response's Content-Type, which in this case is gb2312. A common approach is to create a Filter which checks the request encoding and then uses ServletRequest#setCharacterEncoding() to force the desired value (which you should of course use consistently throughout your web application).

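public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws ServletException, IOException {
    // Browsers rarely send a charset with form posts, so this is usually null.
    if (request.getCharacterEncoding() == null) {
        // Must be set before any request parameter is read, otherwise it has no effect.
        request.setCharacterEncoding("gb2312");
    }
    chain.doFilter(request, response);
}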

Map this Filter to a url-pattern covering all servlet requests, e.g. /*.

If you don't do this, the servlet container will use its default encoding to parse the parameters, which is usually ISO-8859-1 and therefore wrong here. Your input of 相宜本草 would end up as ÏàÒ˱¾²Ý.
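
To make that garbling concrete, here is a minimal standalone sketch, not part of the original answer. It assumes the browser sent GB2312 bytes and the container decoded them as ISO-8859-1. MojibakeDemo, misdecode and repair are illustrative names, and the repair step is an extra shown only to make the point that the bytes themselves are not lost.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Assumption: the runtime ships the GB2312 charset (standard JDK builds do).
    static final Charset GB2312 = Charset.forName("GB2312");

    // Simulates a container decoding GB2312 bytes with an ISO-8859-1 default.
    static String misdecode(String text) {
        byte[] sent = text.getBytes(GB2312);
        return new String(sent, StandardCharsets.ISO_8859_1);
    }

    // ISO-8859-1 maps every byte to exactly one char, so the original bytes
    // can be recovered from the garbled string and decoded again correctly.
    static String repair(String garbled) {
        byte[] original = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, GB2312);
    }

    public static void main(String[] args) {
        String garbled = misdecode("相宜本草");
        System.out.println(garbled);
        System.out.println(repair(garbled));
    }
}
```

Setting the encoding up front in the Filter is still the cleaner option; re-decoding after the fact only works when the wrong decoder happened to be a lossless one such as ISO-8859-1.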

只是偏爱你 2024-09-08 21:53:10

It's impossible to send POST data back in GB2312. I think UTF-8 is the W3C recommendation, and all new browsers send data back only in either Latin-1 or UTF-8.

We were able to get GB2312-encoded data back from old IE on Win 95, but it's generally not possible with the newer Unicode-based browsers.

See this test with Firefox:

POST / HTTP/1.1
Host: localhost:1234
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 46

My page is in GB2312 and I specified GB2312 everywhere, but Firefox simply ignores it.
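
The trace omits the request body, but the Content-Length still hints at what was sent. As a rough check, and assuming the test submitted the question's single field v_rcvname=相宜本草 (the trace does not confirm this), the sketch below computes the URL-encoded body length under each charset. BodyLength and encodedLength are illustrative names.

```java
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BodyLength {
    // Length of "name=value" as an application/x-www-form-urlencoded body.
    // The encoded body is pure ASCII, so its char count equals its byte count.
    // Uses URLEncoder.encode(String, Charset), available since Java 10.
    static int encodedLength(String name, String value, Charset charset) {
        String body = URLEncoder.encode(name, charset) + "=" + URLEncoder.encode(value, charset);
        return body.length();
    }

    public static void main(String[] args) {
        // Assumption: the test form had the same single field as the question.
        String name = "v_rcvname";
        String value = "相宜本草";
        System.out.println("UTF-8:  " + encodedLength(name, value, StandardCharsets.UTF_8));
        System.out.println("GB2312: " + encodedLength(name, value, Charset.forName("GB2312")));
    }
}
```

Under that assumption the UTF-8 body comes to 46 bytes, matching the Content-Length in the trace, while a GB2312 body would be 34 bytes.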

Some broken browsers even encode Chinese in Latin-1. We recently added a hidden field with a known value. By checking how that value arrives, we can figure out the encoding.

request.getCharacterEncoding() returns the charset from the request's Content-Type header. As the trace shows, the browser sends no charset parameter there, so it is always null.
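
A minimal sketch of that hidden-field check, under stated assumptions: the container decoded the parameter as ISO-8859-1 (so the raw bytes can be recovered from the string), and the known value is a hypothetical constant PROBE. EncodingProbe and detect are illustrative names, not the code we actually run.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;

class EncodingProbe {
    // Hypothetical known value placed in the hidden field.
    static final String PROBE = "相宜";

    // Candidate charsets to try, in order.
    static final List<Charset> CANDIDATES =
            List.of(StandardCharsets.UTF_8, Charset.forName("GB2312"));

    // paramAsLatin1 is the hidden field's value as decoded by the container
    // with ISO-8859-1. Returns the charset that reproduces PROBE, or null.
    static Charset detect(String paramAsLatin1) {
        byte[] raw = paramAsLatin1.getBytes(StandardCharsets.ISO_8859_1);
        for (Charset candidate : CANDIDATES) {
            if (PROBE.equals(new String(raw, candidate))) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulate a browser that sent the probe as GB2312 bytes.
        String received = new String(PROBE.getBytes(Charset.forName("GB2312")),
                StandardCharsets.ISO_8859_1);
        System.out.println(detect(received));
    }
}
```

Once the probe identifies the charset, the other parameters can be re-decoded the same way. A probe containing only ASCII would not work, since ASCII bytes are identical in all of these encodings.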
