
Is it possible to work out the file name of a website's index using PHP?

Posted 2024-12-06 05:25:27

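Take this scenario as an example:

A user enters "http://example.com/index.html" into my form
The form is sent to a backend script, which calls file_get_contents("http://example.com/index.html")
The PHP script saves the returned HTML to a file named "site.html" (the file extension is based on the extension in the given address)

Now consider a second example:

A user types "http://example.com" into my form
The form is sent to a backend script, which calls file_get_contents("http://example.com")
The PHP script saves the returned HTML to a file named "site.com" (the file extension is based on the extension in the given address)

Obviously this method is not ideal, because the file "site.com" is now pretty much useless.

My question is: is there any way for PHP to determine what type of file it is fetching? In the second example it could be anything from "index.html" to "default.asp", depending on the server setup.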


深海蓝天 2024-12-13 05:25:27

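You can look at the Content-Type HTTP header to determine the type of the file you fetched, but you cannot find out what file name is used on the server (or even whether there is a file name at all), and in most cases index.html and default.asp will both return an HTML document.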


猫九 2024-12-13 05:25:27

Not if example.com is a different server from the one running PHP.
One option: you could brute-force it, i.e. try different possible file names (index.htm, index.html, index.php, index.asp, default.html, etc.)
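The brute-force idea could be sketched roughly as follows. The helper names `candidateIndexUrls` and `firstExistingIndex`, the list of file names, and the `$exists` callback are all assumptions made for illustration; the callback stands in for whatever HTTP check you prefer.

```php
<?php
// Hypothetical helper: build candidate URLs for a directory-style address.
// The default name list is an assumption, not an exhaustive set of server defaults.
function candidateIndexUrls(string $baseUrl, array $names = ['index.html', 'index.htm', 'index.php', 'index.asp', 'default.html', 'default.asp']): array
{
    $base = rtrim($baseUrl, '/');
    $urls = [];
    foreach ($names as $name) {
        $urls[] = $base . '/' . $name;
    }
    return $urls;
}

// Return the first candidate the supplied checker accepts, or null if none match.
function firstExistingIndex(string $baseUrl, callable $exists): ?string
{
    foreach (candidateIndexUrls($baseUrl) as $url) {
        if ($exists($url)) {
            return $url;
        }
    }
    return null;
}
```

In practice `$exists` might wrap `get_headers($url)` and inspect the status line in element 0. Note that a hit only shows that the candidate URL is reachable; it does not prove that this is the file the server serves for the bare directory.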


月下凄凉 2024-12-13 05:25:27

Well, it is going to be an HTML file anyway, so just always use the .html extension.


嘿嘿嘿 2024-12-13 05:25:27

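Two points here:

Firstly, no, it is not possible to work out the name of the file that was served if you just requested the root of a directory. This is handled internally by the web server, and it doesn't tell the client how it was handled. Sorry.
Secondly, surely you can just give all the files a .html extension if no file name was specified? In 99% of cases the default file that is served is HTML. Even if it has a .asp or .php extension, all it spits out is dynamically generated HTML. You don't get the source code, only the result.

EDIT

This is the best solution I can come up with for determining a sensible file extension based purely on the URL: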

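$urlParts = parse_url($url);
if (!isset($urlParts['path'])) {
  $ext = 'html'; // no path at all (e.g. "http://example.com"): assume HTML
} else {
  // Split the last path segment on dots; no dot means no extension
  $pathParts = explode('/', $urlParts['path']);
  $fileParts = explode('.', array_pop($pathParts));
  $ext = (count($fileParts) > 1) ? array_pop($fileParts) : 'html';
}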
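As a rough sketch of how the same idea could produce the "site.html" style file name from the question, here is a function version. The name `saveNameForUrl` and the `$base` and `$default` parameters are hypothetical, and the check that only accepts alphanumeric extensions is an added assumption, not part of the snippet above.

```php
<?php
// Hypothetical helper: choose a local file name such as "site.html" for a URL.
// $base and $default are illustrative parameters, not part of the original snippet.
function saveNameForUrl(string $url, string $base = 'site', string $default = 'html'): string
{
    // parse_url() returns null for the path of "http://example.com"
    $path = parse_url($url, PHP_URL_PATH);
    $ext = '';
    if (is_string($path)) {
        // Only the last path segment counts, so dots in directory names are ignored
        $segments = explode('/', $path);
        $segment = end($segments);
        $dot = strrpos($segment, '.');
        if ($dot !== false) {
            $ext = (string) substr($segment, $dot + 1);
        }
    }
    // Accept only simple alphanumeric extensions; anything else falls back to the default
    if (!preg_match('/^[A-Za-z0-9]+$/', $ext)) {
        $ext = $default;
    }
    return $base . '.' . strtolower($ext);
}

echo saveNameForUrl('http://example.com/index.html'), "\n"; // site.html
echo saveNameForUrl('http://example.com'), "\n";            // site.html
```

The query string is not part of the path returned by `parse_url()`, so an address like "http://example.com/logo.PNG?x=1" would yield "site.png" here.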


爱的故事 2024-12-13 05:25:27

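You can't really use the URL to determine the type of response you get. What you need is the MIME type from the Content-Type response header (w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.17).

You can extract this header from the automatically populated $http_response_header variable. Here's an example which gets the contents of a URL and maps the Content-Type of the response to a file extension: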

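$typeMap = array(
        'text/html'  => '.html',
        'text/plain' => '.txt',
        'image/jpeg' => '.jpeg',
        // you get the idea...
);

$html = file_get_contents("http://www.google.com");

$ext = '.html'; // assume html, and prove otherwise....

// examine the headers
foreach ($http_response_header as $hdr)
{
        // the status line ("HTTP/1.1 200 OK") has no colon, so skip it
        if (strpos($hdr, ':') === false)
                continue;

        list($name, $value) = explode(':', $hdr, 2);
        // header names are case-insensitive
        if (strcasecmp(trim($name), 'Content-Type') == 0)
        {
                // naive parse of content type: drop "; charset=..." and surrounding spaces
                $parts = explode(';', $value, 2);
                $type = strtolower(trim($parts[0]));
                if (isset($typeMap[$type]))
                        $ext = $typeMap[$type];

                // no need to look at more headers
                break;
        }
}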
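One caveat worth sketching: when file_get_contents() follows a redirect, $http_response_header should contain the header lines of every response in the chain, so the first Content-Type found may belong to the redirect rather than the final document. The variant below keeps the last Content-Type instead. The function name `extensionFromHeaders` and its parameters are hypothetical; it takes the header lines as an argument so it can be tried without a network request.

```php
<?php
// Hypothetical helper: pick an extension from raw response header lines,
// in the format found in $http_response_header.
function extensionFromHeaders(array $headers, array $typeMap, string $default = '.html'): string
{
    $type = null;
    foreach ($headers as $line) {
        // Status lines such as "HTTP/1.1 200 OK" contain no colon; skip them
        if (strpos($line, ':') === false) {
            continue;
        }
        list($name, $value) = explode(':', $line, 2);
        if (strcasecmp(trim($name), 'Content-Type') === 0) {
            // Drop parameters such as "; charset=UTF-8"; a later header overwrites an earlier one
            $type = strtolower(trim(explode(';', $value)[0]));
        }
    }
    return ($type !== null && isset($typeMap[$type])) ? $typeMap[$type] : $default;
}
```

After a fetch it could be called as `extensionFromHeaders($http_response_header, $typeMap)`, falling back to '.html' when the type is missing or not in the map.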
