libcurl: handling chunked responses

Posted 2024-12-15 00:48:07

Recently I have been building a program that grabs web pages from the internet with libcurl. I found that when a response uses chunked transfer encoding, I can't get the chunk headers from libcurl. I then looked at the libcurl online documentation, which says the chunk header is handled through the write function. I am using libcurl version 2.18, and I've set callbacks for CURLOPT_WRITEFUNCTION and CURLOPT_HEADERFUNCTION, but neither of them receives a single character of the chunk header. Is there a problem with how libcurl handles chunked encoding? How can I make it work properly? Thanks.

P.S. The page I am trying to grab is http://list.taobao.com/browse/cat-0.htm, a Chinese web site that uses GBK encoding.

Here are the settings I made to libcurl:

long progress = 0;  // curl_easy_setopt() expects a long for numeric options
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_NOPROGRESS, progress) == CURLE_OK);
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_HEADER, 1L) == CURLE_OK);
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_DEBUGFUNCTION, &HttpSpider::curl_debug_callback) == CURLE_OK);
// 1 is the default: libcurl decodes the chunked body before the write callback sees it
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_HTTP_TRANSFER_DECODING, 1L) == CURLE_OK);
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_WRITEFUNCTION, &HttpSpider::_ProcessRecvString) == CURLE_OK);
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_HEADERFUNCTION, &HttpSpider::_ProcessRecvHeader) == CURLE_OK);
PROCESS_ERROR(curl_easy_setopt(handle_, CURLOPT_PROGRESSFUNCTION, &HttpSpider::_ProcessRecvProgress) == CURLE_OK);

// here's something else

result = curl_easy_setopt(inst->handle_, CURLOPT_HTTPGET, 1L);
result = curl_easy_setopt(inst->handle_, CURLOPT_PROGRESSDATA, param);
result = curl_easy_setopt(inst->handle_, CURLOPT_WRITEDATA, param);
result = curl_easy_setopt(inst->handle_, CURLOPT_WRITEHEADER, param);
result = curl_easy_setopt(inst->handle_, CURLOPT_URL, *url);

printf("/**********     HTTP GET     **********/\n");
// perform the GET request
result = curl_easy_perform(inst->handle_);

The callbacks are declared as required. The chunk length shows up in the buffer passed to the debug function, but not in the one passed to the write function. How can I get it in the write function?

Comments (1)

简单气质女生网名 2024-12-22 00:48:07

libcurl supports chunked encoding automatically and unconditionally without the application having to do anything.

If you still don't get any data, then something else is going wrong: some kind of problem or bug...
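To connect this to the question: because libcurl removes the chunk framing before it calls the write callback, the chunk lengths never reach that callback by default. As far as I know, the documented way to see them is to set CURLOPT_HTTP_TRANSFER_DECODING to 0L (the question sets it to 1, which is the default), after which the application has to parse the framing itself. The following is only a sketch of that parsing, under my own assumptions. It does not call libcurl: `collect_raw` merely mirrors the write-callback signature, and `parse_chunked` is a hypothetical helper, not a libcurl function. It ignores trailers and does not tolerate whitespace around the size.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Same shape as a CURLOPT_WRITEFUNCTION callback. It appends the raw bytes to
// the std::string that userdata points at (what CURLOPT_WRITEDATA would carry).
size_t collect_raw(char* data, size_t size, size_t nmemb, void* userdata) {
    std::string* out = static_cast<std::string*>(userdata);
    out->append(data, size * nmemb);
    return size * nmemb;  // returning fewer bytes would signal an error to libcurl
}

// Hypothetical helper: walks a complete, still-encoded chunked body.
// Stores every non-zero chunk size in *sizes and, if decoded is non-null,
// appends the chunk payloads to *decoded.
// Returns false when the input is malformed or incomplete.
bool parse_chunked(const std::string& raw,
                   std::vector<size_t>* sizes,
                   std::string* decoded) {
    size_t pos = 0;
    while (true) {
        // The chunk header is one line: hex size, optional ";extension", CRLF.
        size_t eol = raw.find("\r\n", pos);
        if (eol == std::string::npos) return false;
        std::string line = raw.substr(pos, eol - pos);
        size_t semi = line.find(';');
        if (semi != std::string::npos) line.erase(semi);
        if (line.empty()) return false;

        // Parse the hexadecimal chunk size by hand (overflow is not checked).
        size_t n = 0;
        for (char c : line) {
            int digit;
            if (c >= '0' && c <= '9') digit = c - '0';
            else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
            else return false;
            n = n * 16 + static_cast<size_t>(digit);
        }

        pos = eol + 2;
        if (n == 0) return true;  // last chunk; any trailers are ignored here

        // The payload must be followed by its own CRLF.
        size_t remaining = raw.size() - pos;
        if (n > remaining || remaining - n < 2) return false;
        if (raw.compare(pos + n, 2, "\r\n") != 0) return false;

        sizes->push_back(n);
        if (decoded) decoded->append(raw, pos, n);
        pos += n + 2;
    }
}
```

In a real program `collect_raw` would be registered with CURLOPT_WRITEFUNCTION and a `std::string` passed through CURLOPT_WRITEDATA; once `curl_easy_perform` returns, `parse_chunked` can be run over the collected bytes. Note that CURLOPT_HEADER is also set to 1 in the question, so the response headers would arrive in the same buffer and would have to be skipped before parsing.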
