Stopping people from uploading malicious PHP files via forms

Posted 2024-07-14 13:19:46

I have an upload form written in PHP on my website where people can upload a zip file. The zip file is then extracted and all file locations are added to a database. The form is meant for pictures only, but because the files are inside the zip archive I can't check what is being uploaded until the archive has been extracted. I need a piece of code that will delete all files that aren't image formats (.png, .jpeg, etc.). I'm really worried about people being able to upload malicious PHP files, which is a big security risk! I also need to watch for people changing the extensions of PHP files to get around this check.

This is the original script I used http://net.tutsplus.com/videos/screencasts/how-to-open-zip-files-with-php/

This is the code which actually extracts the .zip file:

function openZip($file_to_open) {
    global $target;

    $zip = new ZipArchive();
    $x = $zip->open($file_to_open);
    if($x === true) {
        $zip->extractTo($target);
        $zip->close();

        unlink($file_to_open);
    } else {
        die("There was a problem. Please try again!");
    }
}

Thanks, Ben.

爱给你人给你 2024-07-21 13:19:47

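Well, you probably shouldn't rely on file extensions alone. Try passing each file through an image library to verify that it really is an image.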
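
A minimal sketch of that check, assuming PHP's built-in getimagesize() is good enough as the "image library" for a first pass. The function name isAllowedImage() and the allowlist are illustrative and not from the answer. A file can pass this check and still carry embedded PHP or HTML, so treat it as one layer rather than a complete defence.

```php
<?php
// Hypothetical helper: true only when getimagesize() recognises the file
// as one of the image types we explicitly allow.
function isAllowedImage(string $path): bool
{
    // Assumed allowlist; adjust to the formats the site accepts.
    $allowed = [IMAGETYPE_GIF, IMAGETYPE_JPEG, IMAGETYPE_PNG];

    // getimagesize() returns false when it cannot parse an image header.
    $info = @getimagesize($path);

    // Index 2 of the result holds one of the IMAGETYPE_* constants.
    return $info !== false && in_array($info[2], $allowed, true);
}
```

Anything that fails the check would simply be deleted with unlink() after extraction.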

空名 2024-07-21 13:19:47

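I don't see the risk in having renamed PHP files in your DB...
As long as you don't evaluate them as PHP files (or evaluate them at all, for that matter), they can't do much harm, and since they have no .php extension the PHP engine won't touch them.

I suppose you could also search the files for ...

Also: assume the worst about the files uploaded to your machine. Rename the folder you save them in to "viruses" and treat it accordingly. Don't make it public, and don't give any file execute permission (especially for the PHP user), etc.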

愁杀 2024-07-21 13:19:47

You might also want to consider doing MIME type detection with the following library:

http://ca.php.net/manual/en/ref.fileinfo.php
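
A short sketch of content-based MIME detection with the fileinfo extension linked above. It assumes the extension is enabled; isImageByContent() and its allowlist are made-up names for illustration.

```php
<?php
// Hypothetical helper: detect the MIME type from the content (not the
// file name) and compare it against an assumed allowlist of image types.
function isImageByContent(string $bytes): bool
{
    $allowed = ['image/gif', 'image/jpeg', 'image/png'];

    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $mime = $finfo->buffer($bytes); // a type such as "image/gif", or false on failure

    return $mime !== false && in_array($mime, $allowed, true);
}
```

For a file that is already on disk, $finfo->file($path) returns the same kind of result without loading the bytes yourself.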

古镇旧梦 2024-07-21 13:19:47

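As it stands, you have to rely on your hard-disk space for the extraction. You can check the file headers to determine what type the files are; there are probably libraries for that.

Off-topic: wouldn't it be better to let users select a few images instead of uploading a zip file? That's better for people who don't know what a zip is (yes, they exist).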

寄居人 2024-07-21 13:19:47

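If you set PHP up to parse only files ending in .php, you can simply rename a file from somename.php to somename.php.jpeg and you are safe.

If you really do want to delete the files, PHP has a zip library. You can use it to check the names and extensions of all the files inside the uploaded zip archive and, if it contains PHP files, give the user an error message.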
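
A rough sketch of inspecting the archive before anything is extracted, assuming the zip extension is available. Note one deliberate swap: instead of looking only for .php, it reports every entry whose extension is not on an allowlist. findDisallowedEntries() and the default allowlist are hypothetical names.

```php
<?php
// Hypothetical helper: list the entries whose extension is not on the
// allowlist, without extracting anything.
function findDisallowedEntries(
    string $zipPath,
    array $allowedExt = ['png', 'jpg', 'jpeg', 'gif']
): array {
    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        throw new RuntimeException('Could not open the archive.');
    }

    $rejected = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        if ($name === false) {
            $rejected[] = '#' . $i; // unreadable entry: reject by index
            continue;
        }
        if (substr($name, -1) === '/') {
            continue; // directory entry, nothing to check
        }
        $ext = strtolower(pathinfo($name, PATHINFO_EXTENSION));
        if (!in_array($ext, $allowedExt, true)) {
            $rejected[] = $name;
        }
    }
    $zip->close();

    return $rejected;
}
```

If the returned array is not empty, the upload can be refused with an error message. A name check like this says nothing about the content of a file called trick.php.jpeg, so it is best combined with a content check.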

清醇 2024-07-21 13:19:47

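Personally, I'd add something to the Apache config to make sure it serves PHP files as text from the location the files are uploaded to. That way you're safe, and you can allow other file types to be uploaded in the future.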

大姐，你呐 2024-07-21 13:19:47

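Beware of passing malicious PHP through getimagesize().

PHP can be injected through image functions that try to make sure an image is safe by using the getimagesize() function.

Better to use Gravatar for your user avatars, as Stack Overflow does ;)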

烟─花易冷 2024-07-21 13:19:47

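Use the getimagesize function.
Full procedure:
1.) Extract the extension of the uploaded image/file, then compare it against the allowed extensions.
2.) Now create a random string to rename the uploaded file. The best idea is md5(session_id().microtime()). It cannot be duplicated; if your server is very fast and can handle a request in less than a microsecond, use an incremented variable and append it to the string.
Now move the file.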
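
A small sketch of step 2. It swaps in random_bytes() for the md5(session_id().microtime()) suggestion above, on the assumption that PHP 7 or later is available; randomStorageName() is a hypothetical name.

```php
<?php
// Hypothetical helper: build an unguessable storage name that contains
// nothing from the uploaded filename except an already-validated extension.
function randomStorageName(string $validatedExt): string
{
    // 16 random bytes become 32 hexadecimal characters.
    return bin2hex(random_bytes(16)) . '.' . $validatedExt;
}
```

The result would then be used as the target of move_uploaded_file(), with the extension taken from the allowlist check in step 1 rather than from the user's filename.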

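Tip
Disable PHP file handling in the upload directory; that will always protect you from any server-side attack. If possible, put your htaccess rules in the root directory or in the httpd config file and disable htaccess files from there. That solves your biggest problems.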
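
A sketch of what that tip could look like in the httpd config, assuming Apache with PHP loaded as mod_php. The directory path is a placeholder, and the php_admin_flag line should be dropped when PHP runs through FPM or CGI, because the directive only exists when mod_php is loaded.

```apache
# Assumed upload path; replace with the real directory.
<Directory "/var/www/example/uploads">
    # Ignore any .htaccess file an uploader manages to place here.
    AllowOverride None
    # Never run uploaded files as CGI scripts.
    Options -ExecCGI
    # mod_php only: switch the PHP engine off for this directory.
    php_admin_flag engine off
</Directory>
```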

国粹 2024-07-21 13:19:46

Im really worried about people being able to upload malicious php files, big security risk!

Tip of the iceberg!

i also need to be aware of people changing the extensions of php files trying to get around this security feature.

Generally changing the extensions will stop PHP from interpreting those files as scripts. But that's not the only problem. There are more things than ‘...php’ that can damage the server-side; ‘.htaccess’ and files with the X bit set are the obvious ones, but by no means all you have to worry about. Even ignoring the server-side stuff, there's a huge client-side problem.

For example if someone can upload an ‘.html’ file, they can include a <script> tag in it that hijacks a third-party user's session, and deletes all their uploaded files or changes their password or something. This is a classic cross-site-scripting (XSS) attack.

Plus, thanks to the ‘content-sniffing’ behaviours of some browsers (primarily IE), a file that is uploaded as ‘.gif’ can actually contain malicious HTML such as this. If IE sees telltales like (but not limited to) ‘<html>’ near the start of the file it can ignore the served ‘Content-Type’ and display as HTML, resulting in XSS.

Plus, it's possible to craft a file that is both a valid image your image parser will accept, and contains embedded HTML. There are various possible outcomes depending on the exact version of the user's browser and the exact format of the image file (JPEGs in particular have a very variable set of possible header formats). There are mitigations coming in IE8, but that's no use for now, and you have to wonder why they can't simply stop doing content-sniffing (you idiots, MS) instead of burdening us with shonky non-standard extensions to HTTP headers that should have Just Worked in the first place.

I'm falling into a rant again. I'll stop. Tactics for serving user-supplied images securely:

1: Never store a file on your server's filesystem using a filename taken from user input. This prevents bugs as well as attacks: different filesystems have different rules about what characters are allowable where in a filename, and it's much more difficult than you might think to ‘sanitise’ filenames.

Even if you took something very restrictive like “only ASCII letters and digits”, you still have to worry about too-long, too-short, and reserved names: try to save a file with as innocuous a name as “com1.txt” on a Windows server and watch your app go down. Think you know all the weird foibles of path names of every filesystem on which your app might run? Confident?

Instead, store file details (such as name and media-type) in the database, and use the primary key as a name in your filestore (eg. “74293.dat”). You then need a way to serve them with different apparent filenames, such as a downloader script spitting the file out, a downloader script doing a web server internal redirect, or URL rewriting.
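
A tiny sketch of the naming side of this, assuming an integer primary key and a ‘.dat’ suffix as in the example above. storagePathForId() and the store directory are hypothetical.

```php
<?php
// Hypothetical helper: map a database primary key to a path in the file
// store. Nothing from the user-supplied filename reaches the filesystem.
function storagePathForId(string $storeDir, int $id): string
{
    if ($id <= 0) {
        throw new InvalidArgumentException('Expected a positive primary key.');
    }

    return rtrim($storeDir, '/') . '/' . $id . '.dat';
}
```

The original name and media-type stay in the database row and are only used when the file is served back.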

2: Be very, very careful using ZipArchive. There have been traversal vulnerabilities in extractTo of the same sort that have affected most naive path-based ZIP extractors. In addition, you lay yourself open to attack from ZIP bombs. Best to avoid any danger of bad filenames, by stepping through each file entry in the archive (eg. using zip_read/zip_entry_*) and checking its details before manually unpacking its stream to a file with known-good name and mode flags, that you generated without the archive's help. Ignore the folder paths inside the ZIP.
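
A hedged sketch of that entry-by-entry approach. The zip_read/zip_entry_* functions are deprecated as of PHP 8.0, so this uses ZipArchive::getNameIndex() and ZipArchive::getStream() instead. extractImages(), the limits, and the allowlist are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: copy image entries out of a ZIP one at a time,
// under names we generate, instead of calling extractTo().
// Returns [original entry name => generated file name].
function extractImages(
    string $zipPath,
    string $destDir,
    int $maxFiles = 100,
    int $maxBytes = 5242880
): array {
    // Assumed allowlist: image type => extension we assign ourselves.
    $extFor = [IMAGETYPE_GIF => 'gif', IMAGETYPE_JPEG => 'jpg', IMAGETYPE_PNG => 'png'];

    $zip = new ZipArchive();
    if ($zip->open($zipPath) !== true) {
        throw new RuntimeException('Could not open the archive.');
    }
    if ($zip->numFiles > $maxFiles) {
        $zip->close();
        throw new RuntimeException('Too many entries.');
    }

    $stored = [];
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $entry = $zip->getNameIndex($i);
        if ($entry === false || substr($entry, -1) === '/') {
            continue; // unreadable or directory entry
        }
        $in = $zip->getStream($entry);
        if ($in === false) {
            continue;
        }
        // Read at most one byte past the cap, so an entry that inflates
        // beyond the cap is caught whatever size its header declares.
        $bytes = stream_get_contents($in, $maxBytes + 1);
        fclose($in);
        if ($bytes === false || strlen($bytes) > $maxBytes) {
            continue;
        }
        $info = @getimagesizefromstring($bytes);
        if ($info === false || !isset($extFor[$info[2]])) {
            continue; // not an image type we accept
        }
        // The entry's own path and name are never used on disk.
        $name = bin2hex(random_bytes(16)) . '.' . $extFor[$info[2]];
        $target = rtrim($destDir, '/') . '/' . $name;
        file_put_contents($target, $bytes);
        chmod($target, 0644); // explicit mode flags, no X bit
        $stored[$entry] = $name;
    }
    $zip->close();

    return $stored;
}
```

The per-entry byte cap and the entry-count cap are a crude guard against ZIP bombs, and the returned map gives you what you need to record the original names in the database.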

3: If you can load an image file and save it back out again, especially if you process it in some way in between (such as to resize/thumbnail it, or add a watermark) you can be reasonably certain that the results will be clean. Theoretically it might be possible to make an image that targeted a particular image compressor, so that when it was compressed the results would also look like HTML, but that seems like a very difficult attack to me.
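
A minimal sketch of the load-and-save-again idea, assuming the GD extension is installed. reencodeAsPng() is a hypothetical name, and PNG is just one possible output format.

```php
<?php
// Hypothetical helper: decode the upload with GD and write a brand-new PNG.
// Returns null when GD cannot decode the bytes as an image.
function reencodeAsPng(string $bytes): ?string
{
    $img = @imagecreatefromstring($bytes);
    if ($img === false) {
        return null;
    }

    // imagepng() writes to the output buffer when no file is given.
    ob_start();
    imagepng($img);
    $png = ob_get_clean();

    return $png === false ? null : $png;
}
```

Only the re-encoded bytes would be stored or served inline; a resize or watermark step would slot in between decoding and imagepng().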

4: If you can get away with serving all your images as downloads (ie. using ‘Content-Disposition: attachment’ in a downloader script), you're probably safe. But that might be too much of an inconvenience for users. This can work in tandem with (3), though, serving smaller, processed images inline and having the original higher-quality images available as a download only.
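
A sketch of the headers such a downloader script might send. downloadHeaders() is hypothetical, the media type is assumed to come from your own database rather than from the request, and X-Content-Type-Options: nosniff is an addition of mine that asks supporting browsers not to content-sniff.

```php
<?php
// Hypothetical helper: build the response headers for a downloader script
// that serves a stored file as an attachment rather than inline.
function downloadHeaders(string $mediaType, string $displayName): array
{
    // Keep only a conservative character set in the apparent filename.
    $safeName = preg_replace('/[^A-Za-z0-9._-]/', '_', $displayName);

    return [
        'Content-Type: ' . $mediaType,
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        // Asks browsers that support it not to content-sniff the body.
        'X-Content-Type-Options: nosniff',
    ];
}
```

The script would pass each string to header() and then stream the stored file with readfile().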

5: If you must serve unaltered images inline, you can remove the cross-site-scripting risk by serving them from a different domain. For example use ‘images.example.com’ for untrusted images and ‘www.example.com’ for the main site that holds all the logic. Make sure that cookies are limited to only the correct virtual host, and that the virtual hosts are set up so they cannot respond on anything but their proper names (see also: DNS rebinding attacks). This is what many webmail services do.

In summary, user-submitted media content is a problem.

In summary of the summary, AAAARRRRRRRGGGGHHH.

ETA re comment:

at the top you mentioned about 'files with the X bit set', what do you mean by that?

I can't speak for ZipArchive.extractTo() as I haven't tested it, but many extractors, when asked to dump files out of an archive, will recreate [some of] the Unix file mode flags associated with each file (if the archive was created on a Unix and so actually has mode flags). This can cause you permissions problems if, say, owner read permission is missing. But it can also be a security problem if your server is CGI-enabled: an X bit can allow the file to be interpreted as a script and passed to any script interpreter listed in the hashbang on the first line.

i thought .htaccess had to be in the main root directory, is this not the case?

Depends how Apache is set up, in particular the AllowOverride directive. It is common for general-purpose hosts to AllowOverride on any directory.

what would happen if someone still uploaded a file like ../var/www/wr_dir/evil.php?

I would expect the leading ‘..’ to be discarded; that's what other tools that have suffered the same vulnerability have done.

But I still wouldn't trust extractTo() against hostile input: there are too many weird little filename/directory-tree things that can go wrong, especially if you ever expect to run on Windows servers. zip_read() gives you much greater control over the dearchiving process, and hence the attacker much less.
