Does a servlet know the encoding of a submitted form that was specified using http-equiv?
When I specify the encoding of a POSTed form using http-equiv like this:
<HTML>
<head>
<meta http-equiv='Content-Type' content='text/html; charset=gb2312'/>
</head>
<BODY >
<form name="form" method="post" >
<input type="text" name="v_rcvname" value="相宜本草">
</form>
</BODY>
</HTML>
and then call request.getCharacterEncoding() in the servlet, I get null!
So, is there a way to tell the server which character encoding I am using to encode the data?

This will indeed return null for requests from most web browsers. But usually you can safely assume that the web browser has actually used the encoding specified in the original response header, which in this case is gb2312. A common approach is to create a Filter which checks the request encoding and then uses ServletRequest#setCharacterEncoding() to force the desired value (which you should of course use consistently throughout your web application). Map this Filter on a url-pattern covering all servlet requests, e.g. /*.

If you don't do this, the servlet container will use its default encoding to parse the parameters, which is usually ISO-8859-1 and therefore wrong here. Your input of 相宜本草 would end up as ÏàÒ˱¾²Ý.

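To see why an ISO-8859-1 default garbles GB2312 form data, here is a small stand-alone sketch that simulates the mis-decoding and then reverses it. It is only an illustration: the class name MojibakeDemo and its two helper methods are made up for this example, and it assumes the JDK in use ships the GB2312 charset (standard JDK builds normally do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Assumption: the GB2312 charset is available in this JDK.
    private static final Charset GB2312 = Charset.forName("GB2312");

    /** Simulates a container decoding GB2312 form bytes with an ISO-8859-1 default. */
    static String misdecode(String original) {
        return new String(original.getBytes(GB2312), StandardCharsets.ISO_8859_1);
    }

    /** ISO-8859-1 keeps every byte intact, so the bytes can be re-decoded as GB2312. */
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), GB2312);
    }

    public static void main(String[] args) {
        // Unicode escapes for the form value from the question (相宜本草),
        // so the source file's own encoding does not matter.
        String original = "\u76f8\u5b9c\u672c\u8349";
        String garbled = misdecode(original);
        System.out.println(garbled);
        System.out.println(repair(garbled).equals(original));
    }
}
```

In a real web application the cleaner fix is to avoid the wrong decode in the first place: when request.getCharacterEncoding() returns null, call request.setCharacterEncoding("GB2312") in a Filter before anything reads a request parameter.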
It's impossible to send POST data back in GB2312. I think UTF-8 is the W3C recommendation, and all new browsers send data back only in either Latin-1 or UTF-8.
We were able to get GB2312-encoded data back in old IE on Win 95, but it's generally not possible with the new Unicode-based browsers.
See this test on Firefox,
My page is in GB2312 and I specified GB2312 everywhere, but Firefox simply ignores it.
Some broken browsers even encode Chinese in Latin-1. We recently added a hidden field with a known value. By checking the value, we can figure out the encoding.
request.getCharacterEncoding() returns the encoding from Content-Type. As you can see from my trace, it's always null.
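
The hidden-field check mentioned above could be sketched roughly as follows. This is not the answerer's actual code: the class name EncodingProbe, the probe value, and the candidate list are all assumptions made for illustration. The sketch also assumes the container decoded the parameter as ISO-8859-1, which preserves the raw bytes so they can be re-decoded with each candidate charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;

public class EncodingProbe {
    // Hypothetical known value placed in a hidden form field (相宜).
    static final String KNOWN_VALUE = "\u76f8\u5b9c";

    // Hypothetical list of encodings the browser might have used, tried in order.
    static final List<Charset> CANDIDATES =
            List.of(StandardCharsets.UTF_8, Charset.forName("GB2312"));

    /**
     * Takes the hidden field as decoded by the container (assumed ISO-8859-1)
     * and returns the first candidate charset that reproduces KNOWN_VALUE.
     */
    static Optional<Charset> detect(String receivedAsLatin1) {
        byte[] raw = receivedAsLatin1.getBytes(StandardCharsets.ISO_8859_1);
        for (Charset candidate : CANDIDATES) {
            if (KNOWN_VALUE.equals(new String(raw, candidate))) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Simulate a browser that submitted the probe in GB2312.
        String simulated = new String(
                KNOWN_VALUE.getBytes(Charset.forName("GB2312")),
                StandardCharsets.ISO_8859_1);
        System.out.println(detect(simulated));
    }
}
```

Once a charset is detected, the remaining parameters can be re-decoded the same way: take their ISO-8859-1 bytes and decode them with the detected charset.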