PHP - Detecting a gzip server response
I'm using curl to fetch a webpage, and I need to detect whether the response is gzip-compressed.
This works perfectly fine if Content-Encoding is specified in the response headers, but some servers instead return Transfer-Encoding: chunked and no Content-Encoding header.
Is there any way to detect gzip or get the raw (encoded) server response?
I tried looking at curl_getinfo(), but the content encoding isn't reported there either.
Thanks.
Comments (3)
You can check whether the response starts with the gzip magic number, specifically the bytes 1f 8b.
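A minimal sketch of that check, assuming the raw (still encoded) body is already in a PHP string. The function name isGzip is hypothetical and not part of cURL or PHP.

```php
<?php
/**
 * Return true when $body begins with the two gzip magic bytes 0x1f 0x8b.
 * Hypothetical helper, for illustration only.
 */
function isGzip(string $body): bool
{
    // strncmp is binary-safe, so it can compare raw bytes directly.
    return strncmp($body, "\x1f\x8b", 2) === 0;
}
```

This only makes sense on the bytes as the server sent them. If cURL has been asked to decode the response (for example via CURLOPT_ENCODING), the body will already be decompressed and the magic bytes will be gone.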
Yes. You can use cURL's header functions. For example, you can define a function that handles the response headers and register it using curl_setopt() with the CURLOPT_HEADERFUNCTION option. Alternatively, write the headers to a file (one you have opened with fopen()) using the CURLOPT_WRITEHEADER option. There may be more options you could use; see the curl_setopt() manual for the possibilities. The header you are looking for is named Content-Encoding.
If you have the output in a file, you could also use PHP's finfo with some of its predefined constants, or mime_content_type() (marked as deprecated in older PHP documentation) if finfo is not available to you.
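The header-callback approach could look roughly like the sketch below. The helper names parseHeaderLine and fetchWithHeaders are hypothetical; only curl_init(), curl_setopt(), curl_exec() and the CURLOPT_* constants come from PHP's cURL extension.

```php
<?php
/**
 * Split one raw header line into [lowercased name, trimmed value].
 * Returns null for lines without a colon, such as the status line
 * and the blank line that terminates the header block.
 */
function parseHeaderLine(string $line): ?array
{
    $parts = explode(':', $line, 2);
    if (count($parts) !== 2) {
        return null;
    }
    return [strtolower(trim($parts[0])), trim($parts[1])];
}

/**
 * Fetch $url and collect the response headers through CURLOPT_HEADERFUNCTION.
 * Hypothetical helper; error handling is kept to a minimum.
 */
function fetchWithHeaders(string $url): array
{
    $headers = [];
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, string $line) use (&$headers): int {
        $parsed = parseHeaderLine($line);
        if ($parsed !== null) {
            $headers[$parsed[0]] = $parsed[1];
        }
        // cURL expects the callback to return the number of bytes it handled.
        return strlen($line);
    });
    $body = curl_exec($ch);
    return ['headers' => $headers, 'body' => $body];
}
```

With this in place, ($result['headers']['content-encoding'] ?? '') === 'gzip' would tell you whether the server declared gzip. If redirects are followed, headers from earlier responses may also end up in the array.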
Yes. You can specify the Accept-Encoding request header. The value you are looking for is identity.
So you can send:
Accept-Encoding: identity
See the HTTP/1.1 RFC for details.
This gets you unencoded/uncompressed output (for example, to write it directly into a file).
Use CURLOPT_ENCODING for this purpose. You can also set it with curl_setopt().
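As an illustration of the Accept-Encoding: identity suggestion, here is a minimal sketch that passes the value through CURLOPT_ENCODING. The function name fetchIdentity is hypothetical, and a server is free to ignore the request header.

```php
<?php
/**
 * Ask the server for an unencoded response by sending
 * "Accept-Encoding: identity". Hypothetical helper.
 */
function fetchIdentity(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_ENCODING       => 'identity', // value of the Accept-Encoding header
    ]);
    $body = curl_exec($ch);
    // curl_exec() returns false on failure; map that to an empty string here.
    return $body === false ? '' : $body;
}
```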
You can either issue a separate HEAD request:
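A minimal sketch of the HEAD variant, assuming the server answers HEAD with the same headers it would use for GET (not all servers do). The names fetchHeadersOnly and hasGzipContentEncoding are hypothetical, and sending Accept-Encoding: gzip is an added assumption so that the server has a reason to report the encoding.

```php
<?php
/**
 * Issue a HEAD request and return the raw header block as one string.
 * Hypothetical helper, for illustration only.
 */
function fetchHeadersOnly(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,  // HEAD instead of GET
        CURLOPT_HEADER         => true,  // include headers in the returned string
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER     => ['Accept-Encoding: gzip'], // assumption: advertise gzip
    ]);
    $raw = curl_exec($ch);
    return $raw === false ? '' : $raw;
}

/**
 * Check a raw header block for a Content-Encoding of gzip (or x-gzip).
 */
function hasGzipContentEncoding(string $rawHeaders): bool
{
    return preg_match('/^Content-Encoding:\s*(x-)?gzip\s*$/mi', $rawHeaders) === 1;
}
```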
Or request that the headers be prefixed to the response of your original request:
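One way this could look, as a sketch: CURLOPT_HEADER puts the header block in front of the body, and CURLINFO_HEADER_SIZE tells you where to cut. The names splitResponse and fetchWithPrefixedHeaders are hypothetical.

```php
<?php
/**
 * Split a combined response into [headers, body] at the given byte offset.
 */
function splitResponse(string $raw, int $headerSize): array
{
    return [substr($raw, 0, $headerSize), substr($raw, $headerSize)];
}

/**
 * Fetch $url with the headers prefixed to the body, then separate the two.
 * Hypothetical helper; returns two empty strings on failure.
 */
function fetchWithPrefixedHeaders(string $url): array
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_HEADER         => true, // prefix the headers to the output
        CURLOPT_RETURNTRANSFER => true,
    ]);
    $raw = curl_exec($ch);
    if ($raw === false) {
        return ['', ''];
    }
    // CURLINFO_HEADER_SIZE is the total size of all received headers in bytes.
    return splitResponse($raw, curl_getinfo($ch, CURLINFO_HEADER_SIZE));
}
```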
But, if you just want to get the (decoded) HTML, you can use:
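A minimal sketch of this option, assuming an empty string for CURLOPT_ENCODING, which makes cURL advertise every encoding it supports and decode the response transparently. The function name fetchDecoded is hypothetical.

```php
<?php
/**
 * Fetch $url and let cURL handle content decoding.
 * Hypothetical helper; returns an empty string on failure.
 */
function fetchDecoded(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_ENCODING       => '', // empty string: accept all supported encodings
    ]);
    $body = curl_exec($ch);
    return $body === false ? '' : $body;
}
```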
cURL will then automatically negotiate with the server and decode the response for you.