WinHTTP 是否正在下载空字节,或者我是否错误地复制了结果缓冲区?

发布于 2024-12-28 19:24:56 字数 3681 浏览 4 评论 0原文

我最近将一个完全工作的 WinInet 程序移植到 WinHTTP。下面是我编写的一个函数,用于将整个 GET 请求包装到一行代码中:

bool Get(Url url, std::vector<char>& data, ProgressCallbackFunction progressCallback = nullptr) throw()
{
    long cl = -1;
    DWORD clSize = sizeof(cl);
    DWORD readCount = 0;
    DWORD totalReadCount = 0;
    DWORD availableBytes = 0;
    std::vector<char> buf;

    if (_session != NULL)
        throw std::exception("Concurrent sessions are not supported");

    _session = ::WinHttpOpen(_userAgent.c_str(), WINHTTP_ACCESS_TYPE_NO_PROXY, NULL, NULL, NULL);
    auto connection = ::WinHttpConnect(_session, url.HostName.c_str(), url.Port, 0);
    auto request = ::WinHttpOpenRequest(connection, TEXT("GET"), url.GetPathAndQuery().c_str(), NULL, NULL, NULL, WINHTTP_FLAG_REFRESH);

    if (request == NULL)
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    auto sendRequest = ::WinHttpSendRequest(request, WINHTTP_NO_ADDITIONAL_HEADERS, NULL, WINHTTP_NO_REQUEST_DATA, NULL, NULL, NULL);
    if (sendRequest == FALSE)
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(request);
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    if (::WinHttpReceiveResponse(request, NULL))
    {
        if (progressCallback != nullptr && progressCallback != NULL)
        {
            if (!::WinHttpQueryHeaders(request, WINHTTP_QUERY_CONTENT_LENGTH | WINHTTP_QUERY_FLAG_NUMBER, WINHTTP_HEADER_NAME_BY_INDEX, reinterpret_cast<LPVOID>(&cl), &clSize, 0))
            {
                cl = -1;    
            }
        }

        while (::WinHttpQueryDataAvailable(request, &availableBytes))
        {
            if (availableBytes)
            {
                buf.resize(availableBytes + 1);
                auto hasRead = ::WinHttpReadData(request, &buf[0], availableBytes, &readCount);
                totalReadCount += readCount;
                data.insert(data.end(), buf.begin(), buf.begin() + readCount);
                buf.clear();

                if (progressCallback != nullptr && progressCallback != NULL)
                {
                    progressCallback(totalReadCount, cl, getProgress(totalReadCount, cl));
                }
            }
            else
                break;
        }
    }
    else
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(request);
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    ::WinHttpCloseHandle(request);
    ::WinHttpCloseHandle(_session);
    _session = NULL;
    return true;
}

该代码可以工作,因为它下载所请求的 URL。当服务器不返回 Content-Length 标头(大多数情况)时,就会出现问题。该代码仍然会下载所有数据,但转换为字符串时会嵌入空字节。

上面的代码是这样调用的:

Url url(TEXT("http://msdn.microsoft.com/en-us/site/aa384376"));
Client wc;
std::vector<char> results;
wc.Get(url, results);
StdString html(results.begin(), results.end());
StdOut << html << endl;

StdString is typedef std::basic_string; StdOut 是一个使用 cout 或 wcout 的宏,具体取决于是否定义了 UNICODE。

由于嵌入了空值,并非所有响应都显示在控制台上。当我在关闭调试的情况下运行代码时显示的输出可以在此处查看(请注意,换行符只是在文本包含在我的控制台中)。第一个空值出现在最后的“__in”之后,并且恰好出现在显示“按任意键继续...”输出的位置。这是输出的屏幕截图:

Console output

这是 html 变量值的文本可视化屏幕截图,准确显示了位置空值的出现与可查看的内容相关:

Text Visualizer for html

我是否在某处进行了一些错误的复制还是 WinHTTP 有一些我不知道的细微差别?

I recently ported a fully working WinInet program to WinHTTP. Here's a function I wrote to wrap an entire GET request in to a single line of code:

bool Get(Url url, std::vector<char>& data, ProgressCallbackFunction progressCallback = nullptr) throw()
{
    long cl = -1;
    DWORD clSize = sizeof(cl);
    DWORD readCount = 0;
    DWORD totalReadCount = 0;
    DWORD availableBytes = 0;
    std::vector<char> buf;

    if (_session != NULL)
        throw std::exception("Concurrent sessions are not supported");

    _session = ::WinHttpOpen(_userAgent.c_str(), WINHTTP_ACCESS_TYPE_NO_PROXY, NULL, NULL, NULL);
    auto connection = ::WinHttpConnect(_session, url.HostName.c_str(), url.Port, 0);
    auto request = ::WinHttpOpenRequest(connection, TEXT("GET"), url.GetPathAndQuery().c_str(), NULL, NULL, NULL, WINHTTP_FLAG_REFRESH);

    if (request == NULL)
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    auto sendRequest = ::WinHttpSendRequest(request, WINHTTP_NO_ADDITIONAL_HEADERS, NULL, WINHTTP_NO_REQUEST_DATA, NULL, NULL, NULL);
    if (sendRequest == FALSE)
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(request);
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    if (::WinHttpReceiveResponse(request, NULL))
    {
        if (progressCallback != nullptr && progressCallback != NULL)
        {
            if (!::WinHttpQueryHeaders(request, WINHTTP_QUERY_CONTENT_LENGTH | WINHTTP_QUERY_FLAG_NUMBER, WINHTTP_HEADER_NAME_BY_INDEX, reinterpret_cast<LPVOID>(&cl), &clSize, 0))
            {
                cl = -1;    
            }
        }

        while (::WinHttpQueryDataAvailable(request, &availableBytes))
        {
            if (availableBytes)
            {
                buf.resize(availableBytes + 1);
                auto hasRead = ::WinHttpReadData(request, &buf[0], availableBytes, &readCount);
                totalReadCount += readCount;
                data.insert(data.end(), buf.begin(), buf.begin() + readCount);
                buf.clear();

                if (progressCallback != nullptr && progressCallback != NULL)
                {
                    progressCallback(totalReadCount, cl, getProgress(totalReadCount, cl));
                }
            }
            else
                break;
        }
    }
    else
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(request);
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    ::WinHttpCloseHandle(request);
    ::WinHttpCloseHandle(_session);
    _session = NULL;
    return true;
}

The code works in that it downloads the requested URL. The problem arises when the server doesn't return the Content-Length header (which is most of the time). The code will still download all the data, but there will be embedded null bytes when converted to a string.

The code above is called like this:

Url url(TEXT("http://msdn.microsoft.com/en-us/site/aa384376"));
Client wc;
std::vector<char> results;
wc.Get(url, results);
StdString html(results.begin(), results.end());
StdOut << html << endl;

StdString is typedef std::basic_string<TCHAR> and StdOut is a macro that uses cout or wcout depending on if UNICODE is defined.

Because of the embedded nulls, not all of the response is displayed on the console. The output displayed when I run the code with debugging off can be viewed here (Note that the line breaks are simply where the text is wrapped in my console). The first null is seen just after "__in" at the very end and happens right where the "Press any key to continue. . . " output is displayed. Here's a screen cap of the output:

Console output

Here's a text visualizer screen cap of the value of the html variable showing exactly where the nulls appear in relation to what's viewable:

Text visualizer for html

Am I doing some bad copying somewhere or is there some nuance of WinHTTP of which I'm unaware?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

Hello爱情风 2025-01-04 19:24:56

进一步检查输出后,发现这些不是空值。它们是控制台无法显示的 unicode 字符,因为它们存储不正确(因此转换不正确)。我能够通过更改

std::vector<char>

std::vector<unsigned char>

解决 Get 方法(以及调用代码)中的问题,现在一切都很好。

Upon further review of the output, those are not nulls. They're unicode characters that the console can't display because they're being stored incorrectly (and thus being converted incorrectly). I was able to solve the problem in the Get method (and in the calling code) by changing

std::vector<char>

to

std::vector<unsigned char>

and now all is well.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文