libcurl: handling chunked responses
Recently I have been building a program that fetches web pages with libcurl. I found that when a response uses chunked transfer encoding, libcurl does not give me the chunk headers. The libcurl online documentation says chunk headers are handled by the write function. I am using libcurl version 2.18 and have set callbacks for CURLOPT_WRITEFUNCTION and CURLOPT_HEADERFUNCTION, but they receive everything except the chunk headers, not even a single character of them. Is there a problem with libcurl's chunked encoding support? How can I make it work properly? Thanks. P.S. The page I am trying to fetch is http://list.taobao.com/browse/cat-0.htm, a Chinese site that uses GBK encoding.
Here are the options I set on the libcurl handle:
// curl_easy_setopt is variadic and expects a long for numeric options
long progress = 0L;
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_NOPROGRESS, progress) == CURLE_OK);
// CURLOPT_HEADER = 1 also passes the response headers to the write callback
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_HEADER, 1L) == CURLE_OK);
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_DEBUGFUNCTION, &HttpSpider::curl_debug_callback) == CURLE_OK);
// 1 (the default) tells libcurl to decode chunked transfer encoding itself
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_HTTP_TRANSFER_DECODING, 1L) == CURLE_OK);
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_WRITEFUNCTION, &HttpSpider::_ProcessRecvString) == CURLE_OK);
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_HEADERFUNCTION, &HttpSpider::_ProcessRecvHeader) == CURLE_OK);
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_PROGRESSFUNCTION, &HttpSpider::_ProcessRecvProgress) == CURLE_OK);
// here is the rest of the setup
result = curl_easy_setopt(inst->handle_, CURLOPT_HTTPGET, 1L);
result = curl_easy_setopt(inst->handle_, CURLOPT_PROGRESSDATA, param);
result = curl_easy_setopt(inst->handle_, CURLOPT_WRITEDATA, param);
// CURLOPT_WRITEHEADER is the older name for CURLOPT_HEADERDATA
result = curl_easy_setopt(inst->handle_, CURLOPT_WRITEHEADER, param);
result = curl_easy_setopt(inst->handle_, CURLOPT_URL, *url);
printf("/********** HTTP GET **********/\n");
// perform the request (a GET, per CURLOPT_HTTPGET above)
result = curl_easy_perform(inst->handle_);
The callbacks are declared as required. The buffer passed to the debug function contains the chunk length, but the buffer passed to the write function does not. How can I get the chunk length in the write function?
Comments (1)
libcurl supports chunked encoding automatically and unconditionally; the application does not have to do anything.
If you still don't get any data, there is some other kind of problem or bug...
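That automatic decoding is also why the write callback never sees chunk sizes: by the time the data reaches it, the chunk framing has already been stripped. If the raw framing is really wanted, one possible approach is to set CURLOPT_HTTP_TRANSFER_DECODING to 0L, so that the write callback receives the body undecoded and the application parses the chunk-size lines itself. Below is a minimal sketch of such a parser. It assumes CURLOPT_HEADER is 0 (so the callback sees only the body) and ignores chunk extensions and malformed input; `ChunkedParser` and `on_write` are hypothetical names, not part of libcurl.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not a libcurl API): incrementally parses a raw
// chunked body, which may arrive split at arbitrary byte positions.
class ChunkedParser {
public:
    // Feed the next piece of raw, undecoded body data.
    void feed(const char* data, std::size_t len) {
        for (std::size_t i = 0; i < len; ++i) step(data[i]);
    }
    const std::vector<std::size_t>& chunk_sizes() const { return sizes_; }
    const std::string& body() const { return body_; }
    bool done() const { return state_ == State::Done; }

private:
    enum class State { SizeLine, Data, DataEnd, Trailer, Done };

    void step(char c) {
        switch (state_) {
        case State::SizeLine:
            if (c != '\n') { line_.push_back(c); break; }
            // strtoul stops at the first non-hex character ('\r' or ';').
            remaining_ = std::strtoul(line_.c_str(), nullptr, 16);
            line_.clear();
            if (remaining_ == 0) { state_ = State::Trailer; break; }
            sizes_.push_back(remaining_);
            state_ = State::Data;
            break;
        case State::Data:
            body_.push_back(c);
            if (--remaining_ == 0) state_ = State::DataEnd;
            break;
        case State::DataEnd:
            // Skip the CRLF that follows each chunk's data.
            if (c == '\n') state_ = State::SizeLine;
            break;
        case State::Trailer:
            // Optional trailer lines end with an empty line.
            if (c != '\n') { line_.push_back(c); break; }
            if (line_.empty() || line_ == "\r") state_ = State::Done;
            line_.clear();
            break;
        case State::Done:
            break;
        }
    }

    State state_ = State::SizeLine;
    std::size_t remaining_ = 0;
    std::string line_;
    std::string body_;
    std::vector<std::size_t> sizes_;
};

// Free function with the signature libcurl expects for a write callback.
// userdata is assumed to be the ChunkedParser* passed via CURLOPT_WRITEDATA.
std::size_t on_write(char* ptr, std::size_t size, std::size_t nmemb,
                     void* userdata) {
    auto* parser = static_cast<ChunkedParser*>(userdata);
    parser->feed(ptr, size * nmemb);
    return size * nmemb;  // report every byte as consumed
}
```

With this sketch, `on_write` would be registered through CURLOPT_WRITEFUNCTION and a `ChunkedParser` instance through CURLOPT_WRITEDATA. Note that the callback is a free function; if a class member is used instead, as in the question's code, it has to be a static member function.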