Ignoring the Content-Length header when using file_get_contents
I need to get the contents of a page which always sends a Content-Length: 0 header, even though the page is never empty.
file_get_contents(url) just returns an empty string.
The whole header returned by the page is:
HTTP/1.1 200 OK
X-Powered-By: PHP/5.3.10
Expires: Mon, 26 Jul 1997 05:00:00 GMT
Last-Modified: Sat, 18 Feb 2012 18:14:59 GMT
Cache-Control: no-store, no-cache, must-revalidate
Cache-Control: post-check=0, pre-check=0
Pragma: no-cache
Content-Type: text/html; charset=UTF-8
Content-Length: 0
Date: Sat, 18 Feb 2012 18:14:59 GMT
Server: lighttpd
Would it be possible to use file_get_contents and ignore the header or do I need to use curl?
Edit
get_headers(url) output (using print_r):
Array
(
[0] => HTTP/1.0 200 OK
[1] => X-Powered-By: PHP/5.3.10
[2] => Content-type: text/html
[3] => Content-Length: 0
[4] => Connection: close
[5] => Date: Sat, 18 Feb 2012 22:39:52 GMT
[6] => Server: lighttpd
)
Comments (2)
I believe that none of the HTTP-level functions can read such a response, because it is an incorrect HTTP response: it says "my body is empty, don't read it".
You would need your own function based on fread, which physically reads the socket. Something like this:
Then just cut the headers.
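A minimal sketch of that approach, assuming plain HTTP on port 80 and a server that closes the connection when it has finished writing. The function names fetch_raw and split_http_response are hypothetical and chosen only for illustration:

```php
<?php
// Hypothetical helper: send a bare GET request and return everything the
// server writes to the socket, regardless of what Content-Length claims.
function fetch_raw(string $host, string $path = '/', int $port = 80, int $timeout = 10): string
{
    $fp = fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        throw new RuntimeException("Connection failed: $errstr ($errno)");
    }
    stream_set_timeout($fp, $timeout);

    // HTTP/1.0 with "Connection: close" asks the server to close the socket
    // when it is done, so reading until EOF yields the whole response.
    $request = "GET $path HTTP/1.0\r\n"
             . "Host: $host\r\n"
             . "Connection: close\r\n\r\n";
    fwrite($fp, $request);

    $raw = '';
    while (!feof($fp)) {
        $chunk = fread($fp, 8192);
        // An empty read means EOF or a timeout; stop in both cases.
        if ($chunk === false || $chunk === '') {
            break;
        }
        $raw .= $chunk;
    }
    fclose($fp);

    return $raw;
}

// "Cut the headers": everything after the first blank line is the body.
function split_http_response(string $raw): array
{
    $parts = explode("\r\n\r\n", $raw, 2);

    return [
        'headers' => $parts[0],
        'body'    => $parts[1] ?? '',
    ];
}

// Usage (performs a network request, so it is left commented out):
// $response = split_http_response(fetch_raw('example.com', '/'));
// echo $response['body'];
```

Reading until EOF sidesteps the Content-Length value entirely, at the cost of handling redirects, chunked encoding, and HTTPS yourself.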
As noted by Optimist, the problem had nothing to do with the response headers; rather, I wasn't sending any User-Agent header to the server.
file_get_contents worked perfectly after sending a User-Agent header, even though the server always returns Content-Length: 0. Weird.
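For reference, a minimal sketch of sending a User-Agent with file_get_contents through a stream context. The helper name make_user_agent_context, the agent string, and the example URL are assumptions for illustration, not taken from the original post:

```php
<?php
// Hypothetical helper: build a stream context whose HTTP requests carry
// the given User-Agent header.
function make_user_agent_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
        ],
    ]);
}

// Usage (performs a network request, so it is left commented out):
// $context = make_user_agent_context('Mozilla/5.0 (compatible; ExampleFetcher/1.0)');
// $html = file_get_contents('http://example.com/', false, $context);
```

Setting the user_agent option in php.ini, or calling ini_set('user_agent', ...) before the request, should have the same effect for requests that do not pass an explicit context.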