Is it possible to work out the filename of a website's index using PHP?

Posted on 2024-12-06 05:25:27

Take this scenario for an example:

  1. User types "http://example.com/index.html" into my form
  2. Form is sent to backend script which does file_get_contents("http://example.com/index.html")
  3. PHP script saves the returned html to a file with the name "site.html" (file extension based on the extension of the given address)

Now consider this second example:

  1. User types "http://example.com" into my form
  2. Form is sent to backend script which does file_get_contents("http://example.com")
  3. PHP script saves the returned html to a file with the name "site.com" (file extension based on the extension of the given address)

Clearly this method is not ideal because the file "site.com" is now pretty useless.

My question is, is there a way for PHP to work out what type of file it is getting? In the second example, it could be anything from "index.html" to "default.asp" depending on the server settings.

Comments (5)

深海蓝天 2024-12-13 05:25:27

You can look at the Content-Type HTTP header to figure out what type of file you are getting — but you can't find out what the filename used on the server is (or even if there is a filename), and (in most cases) both index.html and default.asp will return an HTML document.

猫九 2024-12-13 05:25:27

If example.com is a different server from the one PHP is running on, you can't.
Option: you can brute-force it, that is, try different possible filenames (index.htm, index.html, index.php, index.asp, default.html, etc.).
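As a rough sketch of the brute-force idea, the helper below only builds the list of candidate URLs; `candidateIndexUrls` and its default name list are hypothetical, made up for illustration. Each candidate could then be probed, for instance with `get_headers()`, but note that a successful response for one of them does not prove it is the file the server actually serves for the bare root.

```php
<?php
// Hypothetical helper: build candidate index URLs for a site root.
// The default list of names is an assumption, not an exhaustive set.
function candidateIndexUrls(
    string $baseUrl,
    array $names = ['index.html', 'index.htm', 'index.php', 'index.asp', 'default.html', 'default.asp']
): array {
    // Drop any trailing slash so we never produce "//index.html"
    $base = rtrim($baseUrl, '/');
    $urls = [];
    foreach ($names as $name) {
        $urls[] = $base . '/' . $name;
    }
    return $urls;
}
```

Calling `candidateIndexUrls('http://example.com')` gives one URL per name in the list, in the same order.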

月下凄凉 2024-12-13 05:25:27

Well, it will be an HTML file anyway, so always use the .html extension.

嘿嘿嘿 2024-12-13 05:25:27

Two points here:

  • Firstly, no, it is not possible to work out the name of the file that was served if you just requested the root of a directory. This is handled internally by the web server, and it doesn't tell the client how it was handled. Sorry.
  • Secondly, surely you can just give all the files a .html extension if no file name was specified? In 99% of cases the default file that is served is HTML. Even if it has a .asp or .php extension, all it spits out is dynamically generated HTML. You don't get the source code, only the result.

EDIT

This is the best solution I can come up with for determining a sensible file extension based purely on the URL:

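// Split the URL and look only at its path component
$urlParts = parse_url($url);
if (!is_array($urlParts) || !isset($urlParts['path'])) $ext = 'html'; // no path (or unparsable URL): assume HTML
else {
  // The last path segment is the filename, if there is one
  $pathParts = explode('/',$urlParts['path']);
  // Text after the last dot is the extension; with no dot, fall back to 'html'
  $ext = (count($fileParts = explode('.',array_pop($pathParts))) > 1) ? array_pop($fileParts) : 'html';
}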
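
To make the behaviour of the snippet above easier to check, here is the same logic wrapped in a function. The name `extensionFromUrl` and the `$default` parameter are assumptions added for illustration; the guard against `parse_url()` returning `false` is also an addition.

```php
<?php
// Hypothetical wrapper around the URL-based approach.
function extensionFromUrl(string $url, string $default = 'html'): string
{
    $urlParts = parse_url($url);
    // parse_url() returns false on seriously malformed URLs
    if (!is_array($urlParts) || !isset($urlParts['path'])) {
        return $default;
    }
    // Take the last path segment, then whatever follows its last dot
    $pathParts = explode('/', $urlParts['path']);
    $fileParts = explode('.', array_pop($pathParts));
    return count($fileParts) > 1 ? array_pop($fileParts) : $default;
}
```

With this, "http://example.com/index.html" yields "html", "http://example.com/default.asp?x=1" yields "asp", and both "http://example.com" and "http://example.com/images/" fall back to "html". The query string is ignored because only the path component is examined.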

爱的故事 2024-12-13 05:25:27

You can't really use the URL to determine the type of response you get. What you need is the MIME type from the Content-Type response header.

You can extract this header from the automatically populated $http_response_header variable. Here's an example which gets the contents of a URL and maps the Content-Type of the response to a file extension:

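$typeMap = array(
        'text/html'  => '.html',
        'text/plain' => '.txt',
        'image/jpeg' => '.jpeg',
        // you get the idea...
);

$html = file_get_contents("http://www.google.com");

$ext = '.html'; // assume HTML, and prove otherwise

// Examine the headers. After redirects the array holds the headers of every
// response in order, so keep going and let the last Content-Type win.
foreach ($http_response_header as $hdr)
{
        // The status line ("HTTP/1.1 200 OK") has no colon; skip it
        if (strpos($hdr, ':') === false)
                continue;

        list($name, $value) = explode(':', $hdr, 2);
        // Header names are case-insensitive
        if (strcasecmp(trim($name), 'Content-Type') == 0)
        {
                // Naive parse: drop parameters such as "; charset=UTF-8"
                $parts = explode(';', $value, 2);
                $type = strtolower(trim($parts[0]));
                if (isset($typeMap[$type]))
                        $ext = $typeMap[$type];
        }
}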
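
The same idea can be packaged as a function so it can be tried on sample header lines without making a network request. This is only a sketch: `extensionFromHeaders` and its parameters are hypothetical names, and it assumes the input has the shape PHP puts into $http_response_header (one raw header line per array element, status lines included).

```php
<?php
// Hypothetical helper: pick a file extension from raw response header lines.
function extensionFromHeaders(array $headers, array $typeMap, string $default = '.html'): string
{
    $ext = $default;
    foreach ($headers as $hdr) {
        // Status lines such as "HTTP/1.1 200 OK" contain no colon
        if (strpos($hdr, ':') === false) {
            continue;
        }
        list($name, $value) = explode(':', $hdr, 2);
        // Header names are case-insensitive
        if (strcasecmp(trim($name), 'Content-Type') !== 0) {
            continue;
        }
        // Strip parameters ("; charset=UTF-8") and surrounding whitespace
        $parts = explode(';', $value, 2);
        $type = strtolower(trim($parts[0]));
        // Later responses override earlier ones, so the final response
        // after any redirects decides; unknown types use the default
        $ext = $typeMap[$type] ?? $default;
    }
    return $ext;
}
```

Letting the last Content-Type win matters when the request was redirected, since the header list then contains the redirect response's headers followed by those of the final response.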