Using PHP to get HTTP response headers / redirect status

Posted on 2024-08-03 05:27:04

Hej there,

I am currently working on a PHP-based tool to monitor a rather large number of URLs and their redirect status. I have spent quite some time looking for the best way to fetch the HTTP response headers and extract the current redirect code and location. This is how it's done at the moment:

$resource = fopen( $url, 'r' );
$metadata = stream_get_meta_data( $resource );
$metadata = $metadata['wrapper_data'];

// Looping through the array to find the necessary fields

This works for 95% of the URLs I'm monitoring. For a few more, I solved it by parsing the actual HTML the website returns before the redirect is executed, since it contained something like "This website has been moved here". This does not seem to be a very robust solution, but it helped in a few cases.

This still leaves me with a number of URLs I cannot check automatically.

Tools like the Ask Apache HTTP Headers Tool seem to be more reliable, so I was wondering: what would be a better way to obtain the redirect information?

Comments (1)

贪恋 2024-08-10 05:27:04

You could also try curl. The shortest possible example that retrieves all the headers looks like this:

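<?php
// Callback invoked by curl once for every header line it receives.
// It must return the number of bytes handled, otherwise curl aborts the transfer.
function read_header($ch, $string) {
    print "Received header: $string";
    return strlen($string);
}

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://stackoverflow.com');
curl_setopt($ch, CURLOPT_HEADERFUNCTION, 'read_header');
curl_setopt($ch, CURLOPT_NOBODY, true); // issue a HEAD request: headers only, no body
curl_exec($ch);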

Output:

[~]> php headers.php 
Received header: HTTP/1.1 200 OK
Received header: Cache-Control: private
Received header: Content-Type: text/html; charset=utf-8
Received header: Expires: Mon, 31 Aug 2009 09:38:45 GMT
Received header: Server: Microsoft-IIS/7.0
Received header: Date: Mon, 31 Aug 2009 09:38:45 GMT
Received header: Content-Length: 118666
Received header: 
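
For redirect monitoring, the header lines handed to a callback like read_header can be reduced to just the status code and the Location value. The following is only a sketch: the helper name parse_redirect_headers and its return shape are made up for illustration, and it assumes the lines arrive in the raw form shown in the output above.

```php
<?php
// Hypothetical helper (not part of the original answer): reduces raw header
// lines to the status code and Location header of the last response seen.
function parse_redirect_headers(array $lines): array
{
    $result = ['status' => null, 'location' => null];
    foreach ($lines as $line) {
        $line = trim($line);
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#i', $line, $m)) {
            // A status line starts a new response, so forget any earlier Location.
            $result = ['status' => (int) $m[1], 'location' => null];
        } elseif (stripos($line, 'Location:') === 0) {
            // Header names are case-insensitive, hence stripos().
            $result['location'] = trim(substr($line, strlen('Location:')));
        }
    }
    return $result;
}

// Sample input: an assumed 301 response, not real captured data.
$info = parse_redirect_headers([
    "HTTP/1.1 301 Moved Permanently\r\n",
    "Location: http://example.com/new\r\n",
    "\r\n",
]);
printf("%d -> %s\n", $info['status'], $info['location']);
```

In a real run, the callback would append each $string to an array instead of printing it, and that array would be passed to the helper after curl_exec() returns.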

Of course, if it is just the headers you want, fsockopen works just as well, except that you should send HEAD instead of GET, because you only need the headers, not the content.
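
A rough sketch of the fsockopen route could look like the following. The function names build_head_request and fetch_head_lines are hypothetical, and the code assumes plain HTTP on port 80 with a five-second timeout and no redirect handling.

```php
<?php
// Hypothetical helper: builds a minimal HTTP/1.1 HEAD request.
function build_head_request(string $host, string $path = '/'): string
{
    return "HEAD {$path} HTTP/1.1\r\n"
         . "Host: {$host}\r\n"
         . "Connection: close\r\n\r\n";
}

// Hypothetical helper: sends the HEAD request over a plain socket and
// returns the response header lines (empty array if the connection fails).
function fetch_head_lines(string $host, int $port = 80, string $path = '/', float $timeout = 5.0): array
{
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return [];
    }
    stream_set_timeout($socket, (int) $timeout);
    fwrite($socket, build_head_request($host, $path));

    $lines = [];
    while (($line = fgets($socket)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            break; // a blank line ends the header section
        }
        $lines[] = $line;
    }
    fclose($socket);
    return $lines;
}
```

Calling fetch_head_lines('example.com') would then return the status line followed by the individual headers, which can be searched for a 3xx code and a Location header.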

Also, curl works with https URLs as well, provided it was compiled with SSL support.
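
If only the redirect code and target are needed, curl can also report them directly through curl_getinfo() once automatic redirect following is switched off. This is a sketch under assumptions: the helper name check_redirect, its return shape, and the ten-second timeout are illustrative choices, not part of the original answer.

```php
<?php
// Hypothetical helper: performs a HEAD request without following redirects
// and reports the status code plus the redirect target, if any.
function check_redirect(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // HEAD request, no body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // stop at the first response
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // do not echo anything
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);           // assumed timeout in seconds

    $ok = curl_exec($ch) !== false;

    return [
        'ok'       => $ok,
        'status'   => (int) curl_getinfo($ch, CURLINFO_HTTP_CODE),
        // With redirect following disabled, this is the URL the server points to.
        'location' => curl_getinfo($ch, CURLINFO_REDIRECT_URL) ?: null,
        'error'    => $ok ? null : curl_error($ch),
    ];
}
```

Some servers answer HEAD differently from GET, so for URLs that still cannot be checked this way it may be worth retrying without CURLOPT_NOBODY.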
