Tomcat, UTF-8 and non-BMP characters
I am running a GWT-based web application on Tomcat 6.0.32.
I am having trouble getting URLs that contain non-BMP characters (the characters appear in file names) to work. Any URL whose characters encode to 3 bytes or fewer in UTF-8 works without a problem.
For example:
The file name is 𥧄.txt (U+259C4, a non-BMP character). URL-encoded as UTF-8 it is %F0%A5%A7%84.txt, and for
http://localhost:8080/foo/bar/%F0%A5%A7%84.txt?param1=x&param2=y
the view cannot be found.
However, if the file name is 犬.txt (URL-encoded as UTF-8: %E7%8A%AC.txt), then for
http://localhost:8080/foo/bar/%E7%8A%AC.txt?param1=x&param2=y
the view is correctly located.
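To make the difference between the two file names concrete, here is a minimal Java sketch that decodes both percent-encoded names with the JDK's own `URLDecoder` and lists the resulting code points. It only illustrates what a correct UTF-8 decode should produce; it is not how Tomcat decodes request URIs internally, and the class and method names (`NonBmpDecodeCheck`, `decodeUtf8`, `describe`) are made up for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class NonBmpDecodeCheck {

    // Decodes a percent-encoded string as UTF-8 using the JDK decoder.
    // (Note: URLDecoder also turns '+' into a space; irrelevant for these names.)
    public static String decodeUtf8(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Lists the Unicode code points of a string, e.g. "U+72AC U+002E ...".
    public static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // 2 for a surrogate pair, else 1
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String nonBmp = decodeUtf8("%F0%A5%A7%84.txt");
        String bmp = decodeUtf8("%E7%8A%AC.txt");

        // Non-BMP name: one code point, but two Java chars (a surrogate pair).
        System.out.println(describe(nonBmp) + " (" + nonBmp.length() + " chars)");
        // prints: U+259C4 U+002E U+0074 U+0078 U+0074 (6 chars)

        // BMP name: one code point, one Java char.
        System.out.println(describe(bmp) + " (" + bmp.length() + " chars)");
        // prints: U+72AC U+002E U+0074 U+0078 U+0074 (5 chars)
    }
}
```

The 4-byte sequence decodes to a single code point that Java stores as two `char` values, so any code on the lookup path that works char by char, or that passes through a charset without supplementary-character support, could treat the two names differently.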
I have server.xml in Tomcat configured with URIEncoding="UTF-8"; the server runs on a Windows XP machine.
Does anyone know of any current limitations in Tomcat 6 with respect to the decoding of non-BMP characters?
Comments (1)
There was some work quite a few years ago (back in the Tomcat 4 days) to address the remaining encoding issues, so all current Tomcat versions should correctly decode any UTF-8 character, provided that URIEncoding="UTF-8" is set on the connector.
If it doesn't, the possible causes (in order of likelihood) are:
- Tomcat configuration issue (it looks like you have this sorted)
- application issue
- OS / file system configuration / issue
- Tomcat bug
If you are sure it is a Tomcat bug, please report it and someone will take a look.
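One way to narrow down which of these causes applies is to inspect the decoded path string at the point where the application receives it, just before the file lookup. The sketch below is a hypothetical helper (the `PathDiagnostics` class, the `classify` method and its result labels are invented for illustration, not part of Tomcat or GWT) that reports whether a string still holds an intact non-BMP character or shows typical signs of a failed decode.

```java
public class PathDiagnostics {

    // Classifies a decoded path by what happened to any non-BMP characters.
    // The returned labels are hypothetical names used only in this sketch.
    public static String classify(String decodedPath) {
        boolean sawSupplementary = false;
        for (int i = 0; i < decodedPath.length(); ) {
            char c = decodedPath.charAt(i);
            if (Character.isHighSurrogate(c)) {
                boolean paired = i + 1 < decodedPath.length()
                        && Character.isLowSurrogate(decodedPath.charAt(i + 1));
                if (!paired) {
                    return "UNPAIRED_SURROGATE"; // half of a pair was lost
                }
                sawSupplementary = true;
                i += 2; // skip the whole surrogate pair
                continue;
            }
            if (Character.isLowSurrogate(c)) {
                return "UNPAIRED_SURROGATE"; // low half without a high half
            }
            if (c == '\uFFFD') {
                return "REPLACEMENT_CHAR"; // decoder rejected the byte sequence
            }
            i++;
        }
        return sawSupplementary ? "NON_BMP_INTACT" : "BMP_ONLY";
    }

    public static void main(String[] args) {
        // Build U+259C4 from its code point to keep this source file ASCII-only.
        String nonBmpName = new String(Character.toChars(0x259C4)) + ".txt";
        System.out.println(classify(nonBmpName));   // prints: NON_BMP_INTACT
        System.out.println(classify("\u72AC.txt")); // prints: BMP_ONLY
        System.out.println(classify("\uFFFD.txt")); // prints: REPLACEMENT_CHAR
    }
}
```

If the string arrives as `NON_BMP_INTACT`, the decode itself worked and the application or the file system is the more likely suspect. An `UNPAIRED_SURROGATE` or `REPLACEMENT_CHAR` result points earlier in the chain, at the connector configuration or the decoding step. In a servlet, comparing `HttpServletRequest.getRequestURI()` (not decoded by the container) with `getPathInfo()` (decoded) gives the two ends to check.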