VS2005 UTF-8 generic HTTP handler: problems with certain characters in the query string (e.g. þ, æ)

Posted 2024-08-18 21:33:30

I am developing a generic HTTP handler in VS2005 and testing it in Debug mode. It works well except when the query string contains non-ASCII characters, e.g. Latin Small Letter Thorn (U+00FE, þ) and Latin Small Letter Ae (U+00E6, æ).

IE8 on my machine is set to send UTF-8 URLs. I am typing the following into the IE8 address bar when debugging the code:

    http://app/myHandler.ashx?term=foo  // everything works
    http://app/myHandler.ashx?term=þorn  // does not work -- query from database fails
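
One way to take the browser's address-bar encoding out of the picture while debugging is to type the term already percent-encoded as UTF-8. The following is a minimal sketch of how that encoded URL could be produced; the class and method names are hypothetical, and it relies only on `Uri.EscapeDataString`, which percent-encodes the UTF-8 bytes of its input.

```csharp
using System;

// Hypothetical helper for building a test URL; not part of the handler itself.
public static class TermEncodingSketch
{
    // Percent-encodes a query value using its UTF-8 byte sequence.
    public static string EncodeTerm(string term)
    {
        return Uri.EscapeDataString(term);
    }

    public static void Main()
    {
        string term = "\u00FEorn"; // "þorn", written with an escape to avoid source-encoding issues

        // U+00FE is the two bytes C3 BE in UTF-8, so the URL carries %C3%BE.
        Console.WriteLine("http://app/myHandler.ashx?term=" + EncodeTerm(term));
    }
}
```

If the handler receives the term correctly from a URL typed this way but not from the raw `þorn` form, that would suggest the bytes arriving from the address bar are not UTF-8.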

The database is SQLite, it uses UTF-8 encoding, and it works fine. Queries containing these special characters succeed when issued directly against SQLite, either from other GUI tools or from the System.Data.SQLite GUI add-ins for Visual Studio.

Am I decoding the values from the Query String correctly? Does GetString() not decode the bytes?

    public StandardRequest(HttpContext context)
    {
        UTF8Encoding utf8 = new UTF8Encoding();

        if (context.Request.QueryString["term"] != null)
        {
            byte[] w = utf8.GetBytes(context.Request.QueryString["term"]);
            word = utf8.GetString(w);
            ...
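
For reference, here is a minimal sketch of what the `GetBytes`/`GetString` pair above does to a value that is already a .NET string. The class and method names are hypothetical. Since `QueryString["term"]` has been decoded by the framework before this code runs, encoding it to UTF-8 and decoding it straight back returns the same string, whether it was decoded correctly or already contains a replacement character.

```csharp
using System;
using System.Text;

// Hypothetical demonstration class; mirrors the two calls in the constructor above.
public static class RoundTripSketch
{
    // Encodes a string to UTF-8 bytes and immediately decodes those bytes again.
    public static string RoundTrip(string value)
    {
        UTF8Encoding utf8 = new UTF8Encoding();
        byte[] bytes = utf8.GetBytes(value);
        return utf8.GetString(bytes);
    }

    public static void Main()
    {
        string good = "\u00FEorn";    // "þorn", decoded correctly
        string damaged = "\uFFFDorn"; // U+FFFD replacement character in place of þ

        // Both comparisons are true: the round trip changes neither string.
        Console.WriteLine(RoundTrip(good) == good);
        Console.WriteLine(RoundTrip(damaged) == damaged);
    }
}
```

In other words, the round trip cannot repair a value that was damaged before it reached this code; any fix would have to happen where the raw request bytes are decoded.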

In the HTTP handler, ContentEncoding is set to UTF-8:

     context.Response.ContentEncoding = System.Text.Encoding.UTF8;

and in the debugger's Locals window, Request.ContentEncoding is also UTF-8.

But when I examine the query string value in the Locals window, the term value for 'þorn' is displayed as '[]orn', and that is also how it appears in the SQL statement I send to the database. It's as if the character hasn't been recognized.
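
A symptom like '[]orn' is what UTF-8 decoding produces when the incoming byte for þ is not valid UTF-8. The sketch below illustrates that mechanism only; whether the browser actually sent a single-byte (Latin-1 style) þ in this case is an assumption, not something the details above confirm. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical demonstration of decoding raw query bytes as UTF-8.
public static class DecodeSketch
{
    // Decodes raw bytes the way a UTF-8 request encoding would.
    public static string DecodeAsUtf8(byte[] raw)
    {
        return Encoding.UTF8.GetString(raw);
    }

    public static void Main()
    {
        string term = "\u00FEorn"; // "þorn"

        // The same text as bytes in two encodings.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(term);                        // C3 BE 6F 72 6E
        byte[] latin1Bytes = Encoding.GetEncoding("iso-8859-1").GetBytes(term); // FE 6F 72 6E

        // UTF-8 bytes decode back to the original text.
        Console.WriteLine(DecodeAsUtf8(utf8Bytes) == term);

        // The lone byte FE is invalid in UTF-8, so the decoder substitutes U+FFFD,
        // which many tools render as an empty box.
        Console.WriteLine(DecodeAsUtf8(latin1Bytes) == "\uFFFDorn");
    }
}
```

Inspecting the first character of the term in the debugger (for example, its numeric code) would show whether it is U+FFFD, which would be consistent with this explanation.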

Am I doing something wrong in the way the value is being grabbed from the query string and converted to a string?
