How to make the simplest servlet filter respect the character encoding that was set
I feel stuck.
I'm trying to write the simplest possible servlet Filter and deploy it to Tomcat. The code is Groovy, but it leans so heavily on Java APIs that it is almost copy-paste Java, which is why I've added the java tag as well.
My question is: how can I write a UTF-8 string from the filter?
Here is the code:
    public class SimpleFilter implements javax.servlet.Filter
    {
        ...
        public void doFilter(ServletRequest request, ServletResponse response,
                             FilterChain chain)
            throws java.io.IOException, javax.servlet.ServletException
        {
            PrintWriter out = response.getWriter()
            chain.doFilter(request, wrapResponse((HttpServletResponse) response))
            response.setCharacterEncoding('UTF-8')
            response.setContentType('text/plain')
            def saw = 'АБВГДЕЙКА ЭТО НЕПРОСТАЯ ПЕРЕДАЧА ABCDEFGHIJKLMNOP!!!'
            def bytes = saw.getBytes('UTF-8')
            def content = new String(bytes, 'UTF-8')
            response.setContentLength(content.length())
            out.write(content);
            out.close();
        }
        private static HttpServletResponse wrapResponse(HttpServletResponse response) {
            return new HttpServletResponseWrapper(response) {
                @Override
                public PrintWriter getWriter() {
                    def writer = new OutputStreamWriter(new ByteArrayOutputStream(), 'UTF-8')
                    return new PrintWriter(writer)
                }
            }
        }
    }
The Content-Type of the filtered page is text/plain;charset=ISO-8859-1.
So the content type has changed, but the charset is ignored.
As you can see, I've taken some (probably naive) measures to make sure the content is UTF-8, but none of them helped.
I've also tried adding the URIEncoding="UTF-8" or useBodyEncodingForUri="true" attributes to the Connector in Tomcat's conf/server.xml.
It would be nice if somebody could explain what I'm doing wrong.
UPD: a bit of explanation: I'm writing an XSLT-applying filter, which is the real reason I'm trying to discard the whole request.
Comments (3)
Your code for setting the charset to UTF-8 looks okay.
I don't understand what you are doing with the HttpServletResponseWrapper.
Encoding saw to bytes and decoding it again does not change anything between saw and content. What you want to do is write through the output stream rather than the writer; using the writer is why the charset is reset to ISO-8859-1 (see the Tomcat documentation).
To make it clear, writing the UTF-8 bytes to the output stream will work.
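The byte-oriented approach described in this answer can be sketched with plain JDK classes. This is a minimal illustration, not the answerer's original code: the class and method names (`Utf8BodyDemo`, `encodeBody`, `writeBody`) are made up for the example, and a `ByteArrayOutputStream` stands in for `response.getOutputStream()`. It also shows that the body size is the byte count, not `String.length()`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class Utf8BodyDemo {

    // Encode the text once; the resulting byte count is the real body size.
    static byte[] encodeBody(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Write raw bytes, so no Writer-level charset can interfere.
    static void writeBody(OutputStream out, byte[] body) throws IOException {
        out.write(body);
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Three Cyrillic letters (written as Unicode escapes), a space, "ABC".
        String saw = "\u0410\u0411\u0412 ABC";
        byte[] body = encodeBody(saw);

        // Decoding the bytes again yields an equal string: the round trip is a no-op.
        String content = new String(body, StandardCharsets.UTF_8);
        System.out.println("round trip equal: " + content.equals(saw));

        // Stand-in for response.getOutputStream() (assumption for this demo).
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeBody(sink, body);

        // In a real filter, body.length is the value to pass to setContentLength.
        System.out.println("chars: " + saw.length() + ", bytes: " + body.length);
    }
}
```

Each Cyrillic letter takes two bytes in UTF-8, so a Content-Length computed from `content.length()` would be too small for this kind of text.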
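This might be the problem you're having, or at least one part of it. As the documentation of setCharacterEncoding() says, the method has no effect if it is called after getWriter() has been called or after the response has been committed.
You should set the encoding first, and only then get the writer.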
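One way to see why the order matters: a writer is bound to its charset when it is created, and text written through it is encoded with that charset from then on. The sketch below uses only JDK classes to illustrate that effect; `WriterCharsetDemo` and `writeWith` are hypothetical names, and the `ByteArrayOutputStream` merely stands in for the response body.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterCharsetDemo {

    // Writes text through a PrintWriter bound to the given charset
    // and returns the bytes that actually reached the stream.
    static byte[] writeWith(Charset charset, String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(sink, charset));
        out.write(text);
        out.close(); // flushes the encoder into the sink
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // Three Cyrillic letters, written as Unicode escapes.
        String text = "\u0410\u0411\u0412";

        byte[] latin1 = writeWith(StandardCharsets.ISO_8859_1, text);
        byte[] utf8 = writeWith(StandardCharsets.UTF_8, text);

        // ISO-8859-1 cannot represent Cyrillic, so the writer substitutes '?'.
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1));
        // The UTF-8 writer preserves the text.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(text));
    }
}
```

A writer obtained while the response is still on its default ISO-8859-1 encoding behaves like the first call, whatever encoding is requested afterwards.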
You are trying to set the content type and encoding after calling getWriter, at which point the character encoding used by the response is already fixed.
See the documentation on getWriter and setCharacterEncoding for details.
To fix your code, just move the lines that set the content type and encoding a few lines earlier, before the call to getWriter.
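The effect of moving those calls can be illustrated with a toy model. `FakeResponse` below is a hypothetical, heavily simplified stand-in for a servlet response, not the real API: it models only the documented rule that setCharacterEncoding has no effect once getWriter has been called, and the ISO-8859-1 default that applies when no encoding has been specified.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;

public class EncodingOrderDemo {

    // Hypothetical stand-in for a servlet response (assumption for this demo).
    static class FakeResponse {
        private String encoding = "ISO-8859-1"; // default when nothing is set
        private PrintWriter writer;
        private final ByteArrayOutputStream body = new ByteArrayOutputStream();

        // Ignored once the writer exists, mirroring the documented rule.
        void setCharacterEncoding(String name) {
            if (writer == null) {
                encoding = name;
            }
        }

        // The writer is created with whatever encoding is current at this moment.
        PrintWriter getWriter() {
            if (writer == null) {
                writer = new PrintWriter(
                        new OutputStreamWriter(body, Charset.forName(encoding)));
            }
            return writer;
        }

        String getCharacterEncoding() {
            return encoding;
        }
    }

    // Order used in the question: writer first, encoding afterwards.
    static String encodingWhenWriterFirst() {
        FakeResponse response = new FakeResponse();
        response.getWriter();
        response.setCharacterEncoding("UTF-8"); // too late, ignored
        return response.getCharacterEncoding();
    }

    // Suggested order: encoding first, writer afterwards.
    static String encodingWhenSetFirst() {
        FakeResponse response = new FakeResponse();
        response.setCharacterEncoding("UTF-8");
        response.getWriter();
        return response.getCharacterEncoding();
    }

    public static void main(String[] args) {
        System.out.println("writer first: " + encodingWhenWriterFirst());
        System.out.println("set first:    " + encodingWhenSetFirst());
    }
}
```

In the real filter the same reordering would mean calling setCharacterEncoding and setContentType on the response before the first getWriter call.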