VS2005 UTF-8 generic HTTP handler: problem with certain characters in the query string (e.g. þ æ)
I am developing a generic HTTP handler in VS2005 and testing it in Debug mode. It works well except when the query string contains non-ASCII characters, e.g. Latin Small Letter Thorn (U+00FE, þ) and Latin Small Letter Ae (U+00E6, æ).
IE8 on my machine is set to send UTF-8 URLs. I am typing the following into the IE8 address bar when debugging the code:
http://app/myHandler.ashx?term=foo     // everything works
http://app/myHandler.ashx?term=þorn    // does not work -- query from database fails
The database is SQLite, uses UTF-8 encoding, and works fine. Queries that use these special characters succeed when issued directly against SQLite, either from other GUI tools or from the System.Data.SQLite GUI add-ins for Visual Studio.
Am I decoding the values from the Query String correctly? Does GetString() not decode the bytes?
public StandardRequest(HttpContext context)
{
    UTF8Encoding utf8 = new UTF8Encoding();
    if (context.Request.QueryString["term"] != null)
    {
        byte[] w = utf8.GetBytes(context.Request.QueryString["term"]);
        word = utf8.GetString(w);
        ...
In the HTTP handler, ContentEncoding is set to UTF-8:
context.Response.ContentEncoding = System.Text.Encoding.UTF8;
and in the debugger's Locals window, Request.ContentEncoding is also UTF-8.
But when I examine the query string value in the Locals window, the term value from the query string 'þorn' is displayed as '[]orn', and that is how it appears in the SQL statement that I'm sending to the database. It's as if the character hasn't been recognized.
Am I doing something wrong in the way the value is being grabbed from the query string and converted to a string?
Comments (2)
What does context.Request.QueryString["term"] contain, as integer character codes, before decoding? Maybe it already has the value you want. If the current bytes aren't in UTF-8, utf8.GetBytes won't help.
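The check suggested in the first comment can be sketched roughly as follows. This is only an illustration: the class and method names (QueryStringInspector, DumpCodeUnits, RoundTrip) are hypothetical, and the "bad" string assumes that the thorn had already been replaced by U+FFFD before the handler saw it, which the thread does not confirm. It prints the character codes of a string and shows that encoding to UTF-8 and decoding again returns the same string, so the GetBytes/GetString pair in the question cannot repair a value that is already wrong.

```csharp
using System;
using System.Text;

public static class QueryStringInspector
{
    // Returns the UTF-16 code units of a string as space-separated hex.
    public static string DumpCodeUnits(string s)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    // The encode-then-decode round trip used in the question.
    public static string RoundTrip(string s)
    {
        UTF8Encoding utf8 = new UTF8Encoding();
        return utf8.GetString(utf8.GetBytes(s));
    }

    public static void Main()
    {
        string good = "\u00FEorn"; // a correctly decoded "þorn"
        string bad = "\uFFFDorn";  // assumption: thorn already lost to U+FFFD

        Console.WriteLine(DumpCodeUnits(good));     // 00FE 006F 0072 006E
        Console.WriteLine(DumpCodeUnits(bad));      // FFFD 006F 0072 006E
        Console.WriteLine(RoundTrip(good) == good); // True
        Console.WriteLine(RoundTrip(bad) == bad);   // True: nothing is repaired
    }
}
```

In the handler, calling something like DumpCodeUnits on context.Request.QueryString["term"] would show whether the first character arrived as 00FE or as something else.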
Thanks for the tip, eed3si9n. It led me to the solution.
I was under the (mistaken) impression that IE would convert characters typed by hand into the address bar into the encoding specified in Settings. It doesn't. The URL typed there must already be encoded.
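As a minimal sketch of what "already encoded" means here, the term can be percent-encoded as UTF-8 before it goes into the URL. The TermUrlBuilder class and BuildUrl method are hypothetical names for illustration; Uri.EscapeDataString is the standard .NET call, and it encodes non-ASCII characters as UTF-8 bytes.

```csharp
using System;

public static class TermUrlBuilder
{
    // Builds a handler URL with the term percent-encoded as UTF-8.
    public static string BuildUrl(string baseUrl, string term)
    {
        return baseUrl + "?term=" + Uri.EscapeDataString(term);
    }

    public static void Main()
    {
        string url = BuildUrl("http://app/myHandler.ashx", "\u00FEorn");
        Console.WriteLine(url); // http://app/myHandler.ashx?term=%C3%BEorn

        // Decoding the escaped value gives back the original term.
        Console.WriteLine(Uri.UnescapeDataString("%C3%BEorn") == "\u00FEorn"); // True
    }
}
```

Typing http://app/myHandler.ashx?term=%C3%BEorn into the address bar should then deliver the UTF-8 bytes for þ to the handler, regardless of how the browser treats characters typed by hand.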