Is WinHTTP downloading null bytes, or am I copying the result buffer incorrectly?
I recently ported a fully working WinInet program to WinHTTP. Here's a function I wrote to wrap an entire GET request into a single line of code:
bool Get(Url url, std::vector<char>& data, ProgressCallbackFunction progressCallback = nullptr) throw()
{
    long cl = -1;
    DWORD clSize = sizeof(cl);
    DWORD readCount = 0;
    DWORD totalReadCount = 0;
    DWORD availableBytes = 0;
    std::vector<char> buf;

    if (_session != NULL)
        throw std::exception("Concurrent sessions are not supported");

    _session = ::WinHttpOpen(_userAgent.c_str(), WINHTTP_ACCESS_TYPE_NO_PROXY, NULL, NULL, NULL);
    auto connection = ::WinHttpConnect(_session, url.HostName.c_str(), url.Port, 0);
    auto request = ::WinHttpOpenRequest(connection, TEXT("GET"), url.GetPathAndQuery().c_str(), NULL, NULL, NULL, WINHTTP_FLAG_REFRESH);
    if (request == NULL)
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    auto sendRequest = ::WinHttpSendRequest(request, WINHTTP_NO_ADDITIONAL_HEADERS, NULL, WINHTTP_NO_REQUEST_DATA, NULL, NULL, NULL);
    if (sendRequest == FALSE)
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(request);
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    if (::WinHttpReceiveResponse(request, NULL))
    {
        if (progressCallback != nullptr && progressCallback != NULL)
        {
            if (!::WinHttpQueryHeaders(request, WINHTTP_QUERY_CONTENT_LENGTH | WINHTTP_QUERY_FLAG_NUMBER, WINHTTP_HEADER_NAME_BY_INDEX, reinterpret_cast<LPVOID>(&cl), &clSize, 0))
            {
                cl = -1;
            }
        }

        while (::WinHttpQueryDataAvailable(request, &availableBytes))
        {
            if (availableBytes)
            {
                buf.resize(availableBytes + 1);
                auto hasRead = ::WinHttpReadData(request, &buf[0], availableBytes, &readCount);
                totalReadCount += readCount;
                data.insert(data.end(), buf.begin(), buf.begin() + readCount);
                buf.clear();
                if (progressCallback != nullptr && progressCallback != NULL)
                {
                    progressCallback(totalReadCount, cl, getProgress(totalReadCount, cl));
                }
            }
            else
                break;
        }
    }
    else
    {
        _lastError = ::GetLastError();
        ::WinHttpCloseHandle(request);
        ::WinHttpCloseHandle(_session);
        _session = NULL;
        return false;
    }

    ::WinHttpCloseHandle(request);
    ::WinHttpCloseHandle(_session);
    _session = NULL;
    return true;
}
The code works in that it downloads the requested URL. The problem arises when the server doesn't return the Content-Length header (which is most of the time). The code will still download all the data, but there will be embedded null bytes when converted to a string.
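The copy step of the read loop can be checked in isolation, without WinHTTP. The following is only a minimal portable sketch: `FakeSource`, `available`, `readChunk` and `readAll` are hypothetical stand-ins for the request handle, `WinHttpQueryDataAvailable`, `WinHttpReadData` and the loop in `Get`, and are not part of any real API. It suggests that inserting only `buf.begin() + readCount` bytes does not carry the extra `+ 1` padding byte into `data`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a response body that arrives in pieces.
struct FakeSource {
    std::string body;
    std::size_t pos = 0;

    // Bytes that could be read right now, capped to mimic partial reads.
    std::size_t available(std::size_t maxChunk) const {
        return std::min(maxChunk, body.size() - pos);
    }

    // Copies up to `want` bytes into `dst` and returns how many were copied.
    std::size_t readChunk(char* dst, std::size_t want) {
        std::size_t n = std::min(want, body.size() - pos);
        std::memcpy(dst, body.data() + pos, n);
        pos += n;
        return n;
    }
};

// Mirrors the loop in Get(): size the buffer, read, append only what was read.
std::vector<char> readAll(FakeSource& src, std::size_t maxChunk) {
    std::vector<char> data;
    std::vector<char> buf;
    for (;;) {
        std::size_t avail = src.available(maxChunk);
        if (avail == 0) break;
        buf.resize(avail + 1);  // same +1 padding as the original loop
        std::size_t readCount = src.readChunk(buf.data(), avail);
        // Only the bytes actually read are appended; the padding byte is not.
        data.insert(data.end(), buf.begin(), buf.begin() + readCount);
        buf.clear();
    }
    return data;
}
```

Feeding a known string through `readAll` with a small `maxChunk` and comparing the result byte for byte is one way to rule out the copy as the source of stray nulls.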
The code above is called like this:
Url url(TEXT("http://msdn.microsoft.com/en-us/site/aa384376"));
Client wc;
std::vector<char> results;
wc.Get(url, results);
StdString html(results.begin(), results.end());
StdOut << html << endl;
StdString is a typedef for std::basic_string<TCHAR>, and StdOut is a macro that uses cout or wcout depending on whether UNICODE is defined.
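In a UNICODE build that typedef is effectively `std::wstring`, and its iterator-range constructor converts each `char` to one `wchar_t` instead of decoding the bytes. Here is a minimal sketch of that behavior, assuming the response body is UTF-8 (the encoding is an assumption, it is not stated above); `widenBytewise` is a hypothetical name used only for illustration.

```cpp
#include <string>
#include <vector>

// Same construction as `StdString html(results.begin(), results.end())`
// in a UNICODE build: every byte becomes its own wchar_t, no decoding happens.
std::wstring widenBytewise(const std::vector<char>& bytes) {
    return std::wstring(bytes.begin(), bytes.end());
}
```

For example, the UTF-8 bytes 0xC3 0xA9 (the character U+00E9) come out as two separate wide characters, neither of which is U+00E9, so a console may show them as unprintable characters.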
Because of the embedded nulls, not all of the response is displayed on the console. The output displayed when I run the code with debugging off can be viewed here (Note that the line breaks are simply where the text is wrapped in my console). The first null is seen just after "__in" at the very end and happens right where the "Press any key to continue. . . " output is displayed. Here's a screen cap of the output:
Here's a text visualizer screen cap of the value of the html variable showing exactly where the nulls appear in relation to what's viewable:
Am I doing some bad copying somewhere or is there some nuance of WinHTTP of which I'm unaware?
Comments (1)
Upon further review of the output, those are not nulls. They're Unicode characters that the console can't display because they're being stored incorrectly (and thus converted incorrectly). I was able to solve the problem in the Get method (and in the calling code) by changing
to
and now all is well.
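A quick way to confirm whether a downloaded buffer really contains null bytes is to count them before converting it to a string. This is only a minimal sketch, and it is not the change described above; `countNulls` and `countNonAscii` are hypothetical helper names, not part of WinHTTP.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Number of literal '\0' bytes in the buffer.
std::size_t countNulls(const std::vector<char>& data) {
    return static_cast<std::size_t>(std::count(data.begin(), data.end(), '\0'));
}

// Number of bytes outside the 7-bit ASCII range. A non-zero result hints at
// multi-byte encoded text that needs decoding rather than bytewise widening.
std::size_t countNonAscii(const std::vector<char>& data) {
    return static_cast<std::size_t>(std::count_if(
        data.begin(), data.end(),
        [](char c) { return static_cast<unsigned char>(c) >= 0x80; }));
}
```

If `countNulls` returns zero while `countNonAscii` does not, the unreadable characters are more likely an encoding issue than embedded nulls.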