Fetching a file on the server and resizing it with PHP GD2: security considerations

Posted 2024-12-22 12:55:12

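What are the security considerations when a server fetches a file from an untrusted domain?

What are the security considerations when resizing an image that you don't trust with PHP's GD2 library?

The file will be stored on the server machine and will be offered for download. I know I can't trust the MIME-Type header. Is there anything else I should be aware of?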

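I have a web service that looks like this:

Input

An HTTP URL (or a string that is expected to be a URL)

Output

A meta-description of the file, or an error if there is one.

The meta-description takes one of two forms:

It is an image + a URL to the image on my domain + a thumbnail of the image (generated and hosted on my server)
It is not an image + a URL to the file on my domain

Update

Problems I can think of:

The remote server is malicious and sends just enough data to keep the socket open without doing anything useful, like slowloris. I don't know how real this threat is. I guess it can easily be avoided with a timeout plus a progress check.
The remote server serves something that looks like an image (header, MIME type) but crashes PHP when I load it with GD2.
The remote server sends a useless or wrong MIME type header, such as text/plain for a binary file.
The remote server serves an image with a virus in it. I assume resizing the image would remove the virus, but if there is no reason to scale, I will serve the original image.
The remote server serves a file with a virus in it. The file will not be treated as an image, so my server does nothing with it. Nothing happens until a user downloads and runs it.

Also, I think I can trust the users of my service. This is a private application, and users can be held accountable for bad behavior. I don't think they will deliberately try to break it.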

烟柳画桥 2024-12-29 12:55:12

What are the security considerations when a server fetches a file from an untrusted domain?

Neither the domain (host) nor the file can be trusted. This breaks down into two concerns:

Transport
Data

To transport the data safely, use a timeout and a size limit. Modern HTTP client libraries offer both. If the file cannot be retrieved in time, drop the connection. If the file is too large, drop the data. Tell the user that there was a problem getting the file. Alternatively, let the user handle the transport to that server: use the user's browser and JavaScript to obtain the file, then POST it to your service, and enforce the POST size limit in your script.
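The timeout and size limit described above could be sketched with PHP's cURL extension as follows. This is only an illustration under assumptions: the function name fetch_limited(), the 5 MiB cap, and the concrete timeout and minimum-speed values are made up for the example and are not taken from the question.

```php
<?php
// Sketch: fetch a remote file with hard time and size limits.
// fetch_limited() and all default values are hypothetical.
function fetch_limited(string $url, string $destPath, int $maxBytes = 5242880, int $timeoutSec = 20): bool
{
    // Accept only http/https URLs; everything else is rejected up front.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if (!in_array($scheme, ['http', 'https'], true)) {
        return false;
    }

    $out = fopen($destPath, 'wb');
    if ($out === false) {
        return false;
    }

    $received = 0;
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_PROTOCOLS       => CURLPROTO_HTTP | CURLPROTO_HTTPS,
        CURLOPT_FOLLOWLOCATION  => false,       // do not follow redirects blindly
        CURLOPT_CONNECTTIMEOUT  => 5,           // seconds allowed for connecting
        CURLOPT_TIMEOUT         => $timeoutSec, // hard cap for the whole transfer
        CURLOPT_LOW_SPEED_LIMIT => 1024,        // abort if slower than 1 KiB/s ...
        CURLOPT_LOW_SPEED_TIME  => 10,          // ... for 10 seconds in a row
        CURLOPT_WRITEFUNCTION   => function ($ch, string $chunk) use ($out, &$received, $maxBytes) {
            $received += strlen($chunk);
            if ($received > $maxBytes) {
                return 0; // returning fewer bytes than received aborts the transfer
            }
            return fwrite($out, $chunk);
        },
    ]);

    $ok = curl_exec($ch) !== false
        && curl_getinfo($ch, CURLINFO_RESPONSE_CODE) === 200;
    fclose($out);

    if (!$ok) {
        @unlink($destPath); // drop partial or oversized data
    }
    return $ok;
}
```

The low-speed options are one way to cover the "sends just enough to keep the socket open" case, since a plain overall timeout alone would still hold the connection until it expires.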

As long as the data is untrusted, you need to handle it with caution. That means implementing a process of your own that runs the necessary security checks on the file before you mark it as "safe".

What are the security considerations when resizing an image that you don't trust with PHP's GD2 library?

Then do not pass untrusted data to the image library. See the step above: bring it into a safe state first.

The file will be stored on the server machine, and will be offered for download. I know I can't trust the MIME-Type header. Is there anything else I should be aware of?

I think you're still at the point above: how to get from untrusted to safe. Sure, you can't trust the Content-Type header, but it is still worth understanding what it tells you.

You want to protect against the Unrestricted File Upload vulnerability (see OWASP).

Check the filename. If you store the data on your server, give it a safe temporary name that cannot be guessed in advance and that is not accessible via the web.
Check the data associated with the filename, e.g. the URL information of the source of that file. Handle encoding properly.
Drop anything that does not meet your expectations; formulate your pre-conditions and check them strictly.
Validate the file data before you continue, for example with a virus scanner.
Validate the image data before you continue. This includes the file headers (magic numbers) as well as checking that the file size and file content are valid. You should use a library that specializes in the job, e.g. a checker for malformed image files. This is specialized software, so if this is a real part of your business, invest in it. A lot of free image-file code exists; I mention this only for information, since you can't trust any recommendation anyway and need to get into the topic yourself.
If you plan to resize the image yourself, you need to make everything doubly safe, because in addition to hosting the data you plan to process it. So first know exactly what you do with the data, so that you can locate potential problem areas.
Do logging and monitoring.
Have a plan for the case where everything goes wrong.
Consider repeating the process for already existing files, so that if you change your procedure you can automatically apply the new rules to uploads made in the past as well.
Create a separate system for each type of work, one that can be wiped clean after the work has been done: one system to do the download, one system to obtain the metadata, and so on. After each action, restore the system from an image. If a single component fails, it won't be left behind in an exploited state. Additionally, if you detect a failure, you can take your whole system out of service until you have found the flaw.
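Two of the checks in this list, the unguessable storage name and the strict pre-conditions on image data, might look roughly like the sketch below. The helper names quarantine_path() and passes_image_preconditions(), the ".upload" suffix, the accepted formats, and the size and dimension limits are all assumptions chosen for illustration; the quarantine directory is assumed to live outside the web root.

```php
<?php
// Hypothetical helper: build an unguessable local path. Nothing from the
// remote URL or remote file name is reused in the local name.
function quarantine_path(string $quarantineDir): string
{
    // 16 random bytes become 32 hex characters.
    return rtrim($quarantineDir, '/') . '/' . bin2hex(random_bytes(16)) . '.upload';
}

// Hypothetical pre-condition check, meant to run before any image library
// decodes the file. The limits are example values.
function passes_image_preconditions(string $path, int $maxBytes = 5242880, int $maxSide = 8000): bool
{
    $size = @filesize($path);
    if ($size === false || $size === 0 || $size > $maxBytes) {
        return false;
    }

    // getimagesize() reads the image header; it does not decode pixel data.
    $info = @getimagesize($path);
    if ($info === false) {
        return false;
    }

    [$width, $height, $type] = $info;
    return in_array($type, [IMAGETYPE_JPEG, IMAGETYPE_PNG, IMAGETYPE_GIF], true)
        && $width > 0 && $height > 0
        && $width <= $maxSide && $height <= $maxSide;
}
```

Passing this check only means the header looks sane. It says nothing about whether the rest of the file is well-formed, so it is a filter in front of the other steps, not a replacement for them.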

All of this depends on how far you want to go, but I think you get the idea. Create a process that works for you and be aware of where improvements can be added later, but first create an infrastructure that is modular enough to deal with error cases and that encapsulates the process well enough to cope with any outcome.

You could delegate critical parts to a system that you don't need to take care of yourself, e.g. by separating processing from hosting. Additionally, when you host the images, the web server must not be clever. The more stupid a system is, the less exploitable it is (normally).

If hosting is not part of your business, why not hand it over to Amazon S3 or a similar store? Your domain can be preserved via DNS settings.

Keep the libraries you use to verify images up to date. This implies that you know which libraries are used and in which versions; for example, the PHP exif extension makes use of mbstring, and so on, so track the whole dependency tree down. Make sure you are in a position to report flaws to the library maintainers in a useful way, e.g. through logging and by storing uploaded data so that problems can be reproduced.

Learn which image exploits have existed in the past and which systems, components, and libraries (example; see the disclaimer there) were affected.

Also read up on the common ways things get exploited, to get the basics together (I'm sure you are aware of them, but it's always good to re-read some material):

Secure file upload in PHP web applications (Alla Bezroutchko; June 13, 2007; PDF; scanit.be/uploads/php-file-upload.pdf)


栀子花开つ 2024-12-29 12:55:12

What you're describing basically comes down to an input validation problem; you don't trust what your application is reading in as input and processing.

To address this, download the resource in question and then try to determine its true file type. There are several ways to attempt this, but basically you will want to use either custom code or a library to parse the file and look for the telltale signs of a given type. There is a good SO discussion on how to do this in PHP: "How can I determine a file's true extension/type programatically?" I would check the second answer, which lists some PHP-specific functions for this. When your application receives a file, it should perform true file typing like this and then compare the result with the MIME type specified by the remote server; if they match, accept the file, and if they do not, drop it.

I would also suggest using a whitelist of allowed file types (a list of everything your service will support, and then ONLY accept files of those types). If you have a very general-purpose service, you should at least keep a blacklist of disallowed file types (a list of everything your service absolutely will not support) and drop those immediately based on the outcome of your MIME type comparison. Again, which of these you use depends entirely on your use cases.
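A minimal sketch of that compare-and-whitelist step, assuming PHP's fileinfo extension is available. The function name accept_file() and the contents of ALLOWED_TYPES are hypothetical; the whitelist would have to reflect what the service actually supports.

```php
<?php
// Hypothetical whitelist; adjust it to what the service is meant to support.
const ALLOWED_TYPES = ['image/jpeg', 'image/png', 'image/gif', 'application/pdf'];

// Accept a downloaded file only if its detected type is whitelisted AND
// matches the Content-Type the remote server declared.
function accept_file(string $path, string $declaredContentType): bool
{
    // Detect the type from the file contents, not from its name or headers.
    $detected = (new finfo(FILEINFO_MIME_TYPE))->file($path);
    if ($detected === false) {
        return false;
    }

    // Strip parameters such as "; charset=binary" from the remote header.
    $declared = strtolower(trim(explode(';', $declaredContentType)[0]));

    return in_array($detected, ALLOWED_TYPES, true) && $detected === $declared;
}
```

Whether a mismatch should really lead to a rejection is a design choice: the question itself notes that servers often send useless Content-Type headers, so a stricter variant would trust only the detected type and ignore the declared one.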

Once you've got a type, the concern becomes whether the remote server has sent you a bad file that targets your server (one that contains malicious code, a buffer overflow designed to make the GD2 library blow up and run arbitrary code, etc.). Basically, you are relying on the GD2 library not to contain bugs that would make such an exploit succeed. There's not much you can do here short of running a security audit on the library yourself, and I'm going to assume that's out of scope. So keep up with any reported security bugs in the library and patch as soon as you can; as a consumer of the library, you are really relying on the maintainers to find and remedy vulnerabilities like this.

Next, the concern is that the remote server has sent you a bad file that targets your users/clients (one that contains malicious code, buffer overflows, viruses, etc.). Here, if the image contains corrupted data that is really malware, it will most likely either (1) break or exploit GD2 when it is read (see above for that scenario) or (2) be eliminated by the resize operation, provided GD2 can process the file successfully. There is still a chance that it survives the processing, but there's not much you can do about that either. If you're really concerned about this, you can run a virus scan using an external product designed for that purpose; if you do, I would suggest scanning both (1) after the download and before GD2 processing and (2) on the manipulated file before you serve it. Personally, I don't think you gain much by doing this, but if you want to provide an additional check / warm fuzzies to your users, it cannot hurt.
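To illustrate case (2), here is a rough sketch of a resize that decodes the file with GD and writes a freshly encoded PNG. Only decoded pixel data ends up in the output, which is why data hidden elsewhere in the original file would usually not be carried over; this is not a guarantee. The helper name make_thumbnail() and the 200-pixel limit are hypothetical, and the GD extension is assumed to be installed.

```php
<?php
// Hypothetical helper: decode with GD, scale down, and re-encode as PNG.
function make_thumbnail(string $srcPath, string $dstPath, int $maxSide = 200): bool
{
    $data = file_get_contents($srcPath);
    if ($data === false) {
        return false;
    }

    // Returns false (with a warning, suppressed here) if GD cannot decode the data.
    $src = @imagecreatefromstring($data);
    if ($src === false) {
        return false;
    }

    // Scale so that the longer side is at most $maxSide; never scale up.
    $w = imagesx($src);
    $h = imagesy($src);
    $scale = min(1.0, $maxSide / max($w, $h));
    $thumb = imagescale(
        $src,
        max(1, (int) round($w * $scale)),
        max(1, (int) round($h * $scale))
    );
    if ($thumb === false) {
        return false;
    }

    // Write a newly encoded file instead of copying any original bytes.
    return imagepng($thumb, $dstPath);
}
```

The same re-encoding could also be applied when no scaling is needed, so that the original bytes are never served as an image; that is a suggestion of this sketch, not something the answer prescribes.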

To address the slow feeding of data to keep a connection open, put a timeout on every connection; unless you are dealing with a specific threat to your use case here, I do not think this is a huge concern.

神经大条 2024-12-29 12:55:12

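1) With blindly fetching a file from an untrusted domain, my main concern would be how to verify that the file is actually what you expect to get. Could the untrusted server trick your script into downloading a harmful file (such as a virus) or a script that might open a backdoor into your system?

2) I haven't read about any security issues with resizing images using the GD2 library. If the file isn't an image to begin with, the GD2 functions will raise an error. I don't think you need to worry much about this part.

3) I (personally) would never do this without checking every file my script downloads first. If you want to partially automate this, you could consider running a magic number test on all files as a pre-filter. But human review is the safest way to serve arbitrary files. When you finish the project, and before you let it go live, try as hard as you can to break/trick/hack it. Get some knowledgeable friends to help.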
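A magic number pre-filter of the kind mentioned in 3) could be sketched like this. The function name sniff_magic() and the choice of four formats are assumptions for the example; the byte signatures themselves are the standard ones for PNG, JPEG, GIF and PDF.

```php
<?php
// Hypothetical pre-filter: compare the first bytes of a file with a few
// well-known signatures and return a short type label, or null if none match.
function sniff_magic(string $path): ?string
{
    $signatures = [
        'png'  => "\x89PNG\r\n\x1a\n",
        'jpeg' => "\xFF\xD8\xFF",
        'gif'  => 'GIF8',   // covers both GIF87a and GIF89a
        'pdf'  => '%PDF-',
    ];

    // Read only the first 8 bytes; the longest signature above is 8 bytes.
    $head = @file_get_contents($path, false, null, 0, 8);
    if ($head === false) {
        return null;
    }

    foreach ($signatures as $type => $magic) {
        if (strncmp($head, $magic, strlen($magic)) === 0) {
            return $type;
        }
    }
    return null;
}
```

A matching signature only shows what the file claims to be. It does not show that the rest of the file is well-formed or harmless, which is why it works as a pre-filter and not as the final check.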
