Fetching a file on the server and resizing it with PHP GD2: security considerations
What are the security considerations when a server fetches a file from an untrusted domain?
What are the security considerations when resizing an untrusted image with PHP's GD2 library?
The file will be stored on the server machine, and will be offered for download. I know I can't trust the MIME-Type header. Is there anything else I should be aware of?
I have a webservice that looks like this:
input
An http-URL (or a String that is expected to be a URL)
output
A meta description of the file, or an error if there was one.
The meta description has one of two forms:
- It's an image + a URL to the image on my domain + a thumbnail of the image (generated on and hosted by my server)
- It's not an image + a URL to the file on my domain
update
Concerns that I can come up with:
The remote server is a malicious server that will send tiny bits of information, enough to keep the socket open, but doesn't do anything useful - like slowloris. I don't know how real of a threat this is. I suppose it could be easily avoided with timeout + progress check.
The remote server serves something that looks like an image (headers, mime-type) but causes PHP to crash when I load it with GD2.
The server sends a useless or bad MIME-type header, like text/plain for binary files.
The remote server serves an image with a virus in it. I assume that resizing the image will get rid of the virus, but I will serve the original image if there is no reason to scale.
The remote server serves a file with a virus in it. The file will not be treated as an image so my server will do nothing with it. Nothing will happen until the user downloads, and runs it.
Also, I assume I can trust the users of my service. This is a private application in a situation where users can be held accountable for bad behavior. I assume they won't intentionally try to break it.
Comments (4)
Neither the domain (host) nor the file is to be trusted. This breaks down into two points:
To transport the data safely, use a timeout and a size limit. Modern HTTP client libraries offer both. If the file cannot be fetched in time, drop the connection. If the file is too large, discard the data. Tell the user that there was a problem getting the file. Alternatively, let the user handle the transfer: use the user's browser and JavaScript to obtain the file from that server, then POST it to you. Set the POST size limit in your script.
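The timeout and size limit described above can be sketched roughly as follows. This is a minimal Python sketch rather than PHP, and the names `read_limited` and `FetchError` are made up for illustration. It assumes the response object offers a `read1()` method, as `io.BytesIO` and the response returned by `urllib.request.urlopen` do.

```python
import io
import time


class FetchError(Exception):
    """Raised when a transfer exceeds its size or time budget."""


def read_limited(stream, max_bytes, max_seconds, chunk_size=8192, clock=time.monotonic):
    """Read a stream to the end, enforcing a total size and a total time budget."""
    deadline = clock() + max_seconds
    chunks = []
    total = 0
    while True:
        # Checked between reads: a server that drips bytes slowly still runs out of time.
        if clock() > deadline:
            raise FetchError("transfer took too long")
        # read1() returns whatever one underlying read yields, up to chunk_size bytes.
        chunk = stream.read1(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise FetchError("file is too large")
        chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # An in-memory stream stands in for the HTTP response here.
    print(len(read_limited(io.BytesIO(b"x" * 1000), max_bytes=4096, max_seconds=5)))
```

With a real request, the `timeout` argument of `urllib.request.urlopen` bounds each blocking socket operation rather than the whole transfer, which is why the sketch keeps its own overall deadline as well.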
As long as the data is untrusted, you need to handle it with caution. That means implementing your own process that runs a series of security checks on the file before you mark it as "safe".
So do not pass untrusted data to the image library. See the step above: bring it into a safe state first.
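As one example of a check that could run before the image library sees the data, the sketch below reads the declared dimensions from a PNG header and rejects anything above a pixel budget. A decoder allocates memory in proportion to the pixel count, so a small file can still declare a huge canvas. This is a Python sketch under assumptions: it covers PNG only, the 25-megapixel limit is an arbitrary illustrative value, and the function names are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) from a PNG header, or None if it is not a plausible PNG."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", then width and height
    # as big-endian 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def safe_to_decode(data, max_pixels=25_000_000):
    """True only if the header parses and the declared canvas fits the pixel budget."""
    dims = png_dimensions(data)
    if dims is None:
        return False
    width, height = dims
    return width > 0 and height > 0 and width * height <= max_pixels


if __name__ == "__main__":
    header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 800, 600)
    print(png_dimensions(header), safe_to_decode(header))
```

A check like this only filters one class of problem; it does not make the decoder itself any safer.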
I think you're still at the point above: how to get from untrusted to safe. Sure, you can't trust the Content-Type header, but it's still good to understand it.
You want to protect against the Unrestricted File Upload vulnerability (OWASP).
How far you take all this depends on how much effort you want to invest, but I think you get the idea. Create a process that works for you and make a note of where improvements can be added later. First, though, build an infrastructure that is modular enough to deal with error cases and that encapsulates the process well enough to handle any outcome.
You could delegate critical parts to a system that you don't need to care about, e.g. to separate processing from hosting. Additionally, when you host the images the webserver must not be clever. The more stupid a system is, the less exploitable it is (normally).
If hosting is not part of your business, why not hand it over to Amazon S3 or a similar storage service? Your domain can be preserved via DNS settings.
Keep the libraries you use to verify images up to date. This implies you know which libraries are used and in which versions; e.g. the PHP exif extension makes use of mbstring, and so on, so track the whole dependency tree down. Make sure you're in a position to report flaws to the library maintainers in a useful way, e.g. with logging and by storing upload data so problems can be reproduced.
Learn which image exploits have existed in the past and which systems/components/libraries (example, see disclaimer there) were affected.
Also read up on the common ways of exploiting something, to get the basics together (I'm sure you are aware, but it's always good to re-read some material).
Some related questions, assorted:
What you're describing basically comes down to an input validation problem; you don't trust what your application is reading in as input and processing.
To address this, download the resource in question and then attempt to determine its true file type. There are multiple ways to attempt this, but basically you will want to use either custom code or a library to parse the file and look for the telltale signs of a certain type. There is a good SO discussion on how to do this in PHP here - How can I determine a file's true extension/type programatically? - I would check the second answer, which lists some PHP-specific functions to do this. When your application receives a file, it should perform true file typing like this and then compare the result to the MIME type specified by the remote server; if they match, accept the file, and if they do not, drop it.
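The type check described above can be sketched roughly as follows. This is a minimal Python sketch rather than PHP (in PHP, the fileinfo extension serves the same purpose). The signature table is deliberately tiny and illustrative, and `sniff_type` and `matches_declared` are hypothetical names.

```python
# Leading bytes ("magic numbers") of a few well-known formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]


def sniff_type(data):
    """Return the MIME type suggested by the file's leading bytes, or None if unknown."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


def matches_declared(data, content_type_header):
    """Compare the sniffed type with the Content-Type header sent by the remote server."""
    # Drop parameters such as "; charset=binary" before comparing.
    declared = content_type_header.split(";", 1)[0].strip().lower()
    sniffed = sniff_type(data)
    return sniffed is not None and sniffed == declared


if __name__ == "__main__":
    print(matches_declared(b"GIF89a...", "image/gif"))
    print(matches_declared(b"<?php echo 1;", "image/png"))
```

A matching signature only shows that the file starts like the claimed type; it says nothing about whether the rest of the file is well formed.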
I would also suggest using a whitelist of allowable filetypes (a list of everything your service will support and then ONLY accept files of those types). If you have a very general-purpose service, then you should at least do a blacklist of disallowed filetypes (a list of everything your service absolutely will not support and drop those immediately based on the outcome of your MIME type compares). Again, the use of these is entirely dependent on your use-cases.
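The two policies above differ mainly in what happens to types nobody thought about. A small Python sketch makes that visible. The type lists here are purely illustrative assumptions, not a recommendation, and the function names are hypothetical.

```python
# Illustrative lists only; a real service would derive these from its own use cases.
ALLOWED_TYPES = frozenset({"image/png", "image/jpeg", "image/gif", "application/pdf"})
DENIED_TYPES = frozenset({"text/html", "application/x-httpd-php", "application/x-msdownload"})


def accept_strict(detected_type):
    """Whitelist policy: only explicitly supported types pass; unknown types are dropped."""
    return detected_type in ALLOWED_TYPES


def accept_lenient(detected_type):
    """Blacklist policy: everything passes except explicitly denied types."""
    return detected_type not in DENIED_TYPES


if __name__ == "__main__":
    for kind in ("image/png", "application/zip", "text/html", None):
        print(kind, accept_strict(kind), accept_lenient(kind))
```

Note that an undetected type (`None`) is rejected by the whitelist but accepted by the blacklist, which is the main reason the whitelist is the safer default.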
Once you've got a type, the concern becomes whether what the remote server has sent you is a bad file that targets your server (contains malicious code, a buffer overflow designed to make the GD2 library blow up and run arbitrary code, etc). Basically, you are relying on the GD2 library not to contain bugs that would lead to such a successful exploit. There's not much you can do here, short of running a security audit on the library yourself, and I'm going to assume that's out of scope. So keep up with any reported security bugs in the library and patch as soon as you can; as a consumer of the library, you are really relying on the maintainers to find and remedy security vulnerabilities like this.
Next, the concern is that the remote server has sent you a bad file that targets your users/clients (contains malicious code, buffer overflows, viruses, etc). Here, if there is corrupted data in the image that is really malware, it will most likely either (1) break or exploit GD2 when it is read (see above for that scenario) or (2) be eliminated by the resize operation, if GD2 can successfully process the file. There is still a chance it will remain despite the processing, but there's not much you can do there either. If you're really concerned about this, you can apply a virus scan using an external product designed for that; if you do, I would suggest scanning both (1) after the download and before GD2 processing and (2) the manipulated file before you serve it out. Personally, I don't think you gain much by doing this, but if you want to provide an additional check / warm fuzzies to your users, it cannot hurt.
To address the slow-feeding of data to keep a connection open, put a timeout on any connection to deal with this problem; unless you are dealing with a specific threat to your use-case here, I do not think this is a huge concern.
1) My primary concern with blindly fetching a file from an untrusted domain would be how to verify that the file is, in fact, what you expected to get. Could the untrusted server trick your script into downloading a harmful file (like a virus) or possibly a script that would allow a backdoor into your system?
2) I haven't read any security issues with resizing an image with the GD2 library. If it's not an image to begin with, the GD2 functions would throw an error. I don't think you have much to worry about with this part.
3) I (personally) would not ever do this without reviewing every single file that my script downloaded first. If you want to partially automate this, you might consider running magic number tests on all the files as a pre-filter. But a human look is the safest way to serve random files. When you finish this project - before you make it live - try to break / trick / hack it as hard as you can. Get some knowledgeable friends involved to help.
When it is not an image, do you store the file anyway, regardless of what kind of file it is? So they could upload a PHP file and browse to it to execute PHP code on your server?
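One common way to defuse this concern is to never let the remote name or extension reach the filesystem, and to serve every stored file as an opaque download. The Python sketch below illustrates the idea. The function names are hypothetical, and it assumes the files live outside the web root and are delivered by a script that sets these headers.

```python
import secrets


def storage_name():
    """Random on-disk name with no extension, so the web server has nothing to interpret."""
    return secrets.token_hex(16)


def download_headers(original_name):
    """Headers that make the browser save the file instead of rendering or sniffing it."""
    # Keep a conservative ASCII subset only; this drops quotes, slashes and line breaks.
    kept = "".join(c for c in original_name if c.isascii() and (c.isalnum() or c in "._-"))
    safe = kept.lstrip(".") or "download"
    return {
        "Content-Type": "application/octet-stream",
        "Content-Disposition": f'attachment; filename="{safe}"',
        "X-Content-Type-Options": "nosniff",
    }


if __name__ == "__main__":
    print(storage_name())
    print(download_headers("../../evil.php"))
```

The original name survives only as a sanitized download hint in the response header; the mapping from the random storage name to that hint would live in the service's own records.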