Original question
So the project I'm working on is deathly paranoid about file uploads.
In the scope of this question, I'm not using that term in regards to payloads; I'm talking confidentiality.
Programs can always crash and leave temporary files loafing around in the filesystem. That's normal. The slightly confidentiality-paranoid can write a cronjob that hits the temporary file folder every few minutes and deletes anything older than a few seconds prior to the cronjob call (not everything, simply because otherwise it might catch a file in process of being uploaded).
...unfortunately, we take this paranoia a step further:
Ideally, we'd love to never see temporary files from file uploads anywhere but in process-associated RAM.
Is there a way to teach PHP to look for temporary files as blobs in memory rather than in the filesystem? We use PHP-FPM as a CGI handler and Apache as our webserver, in case that makes it any easier. (Note also: 'Filesystem' is the keyword here, rather than 'disc', since there are of course ways to map the filesystem to RAM, but that doesn't fix the accessibility and automatic post-crash-clean-up issue.)
Alternatively, is there a way these temporary files can be encrypted immediately when they're being written to disc, so that they're never held in the file system without encryption?
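For reference, the cron-style cleanup described in the question could look roughly like the following. This is a minimal sketch in Python rather than PHP, kept self-contained for illustration; the function name `remove_stale_files` and the age threshold are assumptions, not something from the project in question.

```python
import os
import time


def remove_stale_files(directory, max_age_seconds, now=None):
    """Delete regular files in `directory` whose mtime is older than the cutoff.

    Files modified within the last `max_age_seconds` are left alone, so an
    upload that is still being written is not removed.
    Returns the names of the files that were deleted.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in os.scandir(directory):
        # Skip directories and symlinks; only plain files are candidates.
        if not entry.is_file(follow_symlinks=False):
            continue
        try:
            age = now - entry.stat(follow_symlinks=False).st_mtime
            if age > max_age_seconds:
                os.unlink(entry.path)
                removed.append(entry.name)
        except FileNotFoundError:
            # The owning process removed the file in the meantime.
            pass
    return removed
```

A cron entry would call this every few minutes with the upload temp directory and a threshold of a few seconds. It only shortens the lifetime of orphaned files; it does nothing about who can read them while they exist.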
Thread overview
I can unfortunately only accept one answer - but to anyone reading this, the entire thread is extremely valuable and contains the collective insights of many people. Depending on what you are hoping to achieve, the accepted answer may not be interesting to you. If you've come here through a search engine, please take a moment to read the whole thread.
Here is a compilation of use cases as I see them, for quick reference:
Re: PHP's temporary files
- RAM instead of disc (e.g. due to I/O concerns) → RAMdisk/comparable (plasmid87, Joe Hopfgartner)
- Immediate (per-filesystem-user) encryption → encFS (ADW) (+ a gotcha as per Sander Marechal)
- Secure file permissions → restrictive native Linux permissions (optionally per vhost) (Gilles) or SELinux (see various comments)
- Process-attached memory instead of filesystem (so a process crash removes the files) (originally intended by the question)
  - don't let the file data reach PHP directly → reverse-proxy (Cal)
  - disable PHP writing to the filesystem → see PHP bug link in this answer (Stephan B) or run PHP in CGI mode (Phil Lello)
  - write-only files → /dev/null filesystem (Phil Lello) (this is useful if you additionally have access to the data as a stream but cannot turn off the file-writing functionality that runs in parallel; whether PHP allows this is unclear)
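Re: your files, post-upload
- Store in a database rather than on disc → file encryption in database HowTo (Rook): https://stackoverflow.com/questions/5701508/storing-php-php-fpm-apaches-temporary-from-upload-files-in-ram-rather-than-the/5862044#5862044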
Have you considered putting a layer between the user and the web server? Using something like perlbal with some custom code in front of the web server would allow you to intercept uploaded files before they are written anywhere, encrypt them, write them to a local ramdisk and then proxy the request on to the web server proper (with the filename and decryption key to the files).
If the PHP process crashes, the encrypted file is left around but can't be decrypted. No unencrypted data gets written to (ram)disk.
CGI to the rescue!
If you create a cgi-bin directory, and configure appropriately, you'll get the message via stdin (as far as I can tell, files aren't written to disk at all this way).
So, in your apache config add
Then write a CGI-mode PHP script to parse the post data. From my (limited) testing, no local files appear to be created. The sample dumps what it reads from stdin as well as the environment variables, to give you an idea of what's there to work with.
Sample script installed as /var/www//cgi-bin/test
Sample output
This is the output (source) when I upload a plain-text file:
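The listings from this answer are not reproduced above. As a rough stand-in, and explicitly not the author's script, here is a minimal Python sketch of what a CGI program does with an upload: the web server hands it the request body on stdin and describes it through environment variables such as CONTENT_LENGTH and CONTENT_TYPE. The function name `read_request_body` and the chunk size are assumptions made for the example.

```python
import io


def read_request_body(environ, stream, chunk_size=65536):
    """Read the CGI request body from `stream` into memory.

    `environ` is the CGI environment (for a real CGI program, os.environ) and
    `stream` is a binary file object (for a real CGI program, sys.stdin.buffer).
    Nothing is written to the filesystem; the bytes live only in this process.
    """
    remaining = int(environ.get("CONTENT_LENGTH") or 0)
    body = io.BytesIO()
    while remaining > 0:
        chunk = stream.read(min(remaining, chunk_size))
        if not chunk:
            # The client closed the connection before sending everything.
            break
        body.write(chunk)
        remaining -= len(chunk)
    return body.getvalue()
```

A real CGI entry point would call `read_request_body(os.environ, sys.stdin.buffer)`, then print a `Content-Type` header, a blank line, and its response. For a multipart/form-data upload the returned bytes are still the raw multipart body, which the script has to parse itself.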
I had a flash of inspiration on this: black-hole filesystems.
Essentially, this is a fake filesystem, where data never gets written, but all files exist and have no content.
There's a discussion on unix.se about these, and one answer involves a FUSE implementation of just this (quoted here):
I haven't had a chance to test this, but if you set upload_tmp_dir to a black-hole location, the upload (would|should) never be written to disk, but still be available in $HTTP_RAW_POST_DATA (or php://input). If it works, it's better than patching PHP.
I'm not familiar with PHP, so my answer won't map directly into a how-to, but I think you're laboring under some misconceptions about what protection various system features provide, which have led you to reject valid solutions in favor of solutions that have exactly the same security properties. From your comments, I gather you're running Linux; most of my answer applies to other unices, but not to other systems such as Windows.
As far as I can see, you're concerned about three attack scenarios:
The first kind of attacker can read everything that's unencrypted on the disk, and nothing that's encrypted with a key she doesn't have.
What the second kind of attacker can do depends on whether she can run code as the same user that's running your CGI scripts.
If she can only run code as other users, then the tool to protect the files is permissions. You should have a directory that's mode 700 (= drwx------), i.e. accessible only by its owner, and owned by the user running the CGI scripts. Other users won't be able to access the files under this directory. You don't need any additional encryption or other protection.
If she can run code as the CGI user (which, of course, includes running code as root), then you've lost already. You can see the memory of another process if you're running code as the same user; debuggers do it all the time! Under Linux, you can easily see it for yourself by exploring /proc/$pid/mem. Compared with reading a file, reading a process's memory is a little more technically challenging, but security-wise, there's no difference.
Thus having the data in files is not in itself a security problem.
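The mode-700 directory can be sketched as follows. This is an illustration in Python, not PHP, and the helper names `make_private_dir` and `is_private` are assumptions for the example. It creates a directory that only its owner can enter, then checks that no group or other permission bits are set.

```python
import os
import stat
import tempfile


def make_private_dir(parent):
    """Create a new directory under `parent` that only the current user can access."""
    # mkdtemp creates the directory readable, writable and searchable
    # only by the creating user.
    path = tempfile.mkdtemp(dir=parent)
    # Set the mode explicitly so the result does not depend on that default.
    os.chmod(path, 0o700)
    return path


def is_private(path):
    """Return True if `path` is owned by us and has no group/other permissions."""
    info = os.stat(path)
    no_group_or_other = stat.S_IMODE(info.st_mode) & 0o077 == 0
    return no_group_or_other and info.st_uid == os.geteuid()
```

Running `stat.filemode(os.stat(path).st_mode)` on the result shows the `drwx------` string mentioned above. As the answer says, this protects against other users only, not against code running as the same user or as root.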
Let's now examine the third concern. The worry is that a bug in the CGI allows the attacker to snoop on files but not to run arbitrary code. This is related to a reliability issue: if the CGI process dies, it may leave temporary files behind. But it's more general: the file might be read by a concurrently-running script.
The best way to protect against this is indeed to avoid having the data stored in a file at all. This should be done at the level of PHP or its libraries, and I can't help with that. If it's not possible, then nullfs as suggested by Phil Lello is a reasonable workaround: the PHP process will be thinking it's writing data to a file, but the file will never actually contain any data.
There's another common unix trick which might be useful here: once you've created a file, you can unlink (remove) it and continue working with it. As soon as it's unlinked, the file can't be accessed by its former name, but the data remains in the filesystem as long as the file is open in at least one process. However, this is mostly useful for reliability, to get the OS to remove the data when the process dies for any reason. An attacker who can open arbitrary files with the process's permissions can access the data through /proc/$pid/fd/$fd. And an attacker who can open files at any time has a small window between the creation of the file and its unlinking: if she can open the file then, she can watch data that's added to it subsequently. This may nonetheless be useful protection, as it turns the attack into one that is time-sensitive and may require many concurrent connections, so it could be countered or at least made much more difficult by a connection rate limiter.
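The unlink trick can be sketched like this. It is a minimal Python illustration for a POSIX system, not PHP, and the helper name `open_unlinked` is an assumption for the example. The file is created, its name is removed at once, and the process keeps reading and writing through the open descriptor.

```python
import os
import tempfile


def open_unlinked(directory):
    """Create a temp file in `directory`, unlink it, and return (file object, old path).

    After the unlink the data is reachable only through the open descriptor;
    the kernel frees it once the last descriptor is closed, including when
    the process dies.
    """
    fd, path = tempfile.mkstemp(dir=directory)  # created with mode 0600
    os.unlink(path)  # the name is gone; the open descriptor stays valid
    return os.fdopen(fd, "w+b"), path
```

Usage is the same as for any file object: write, seek back, read, close. Python's own `tempfile.TemporaryFile` behaves this way on Unix. The short window between creation and unlinking described above still exists in this sketch.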
Have you looked into using FUSE to create an encrypted directory which can only be accessed by a specific user?
http://www.arg0.net/encfs
The memory won't be associated with a specific process, but the files will only be accessible to a specific user (which, to be useful, has to be the one your web server runs as). That might be enough?
PHP will store uploaded files to the filesystem before your script has the chance to intercept the data; php://input and $HTTP_RAW_POST_DATA will be empty in this case (even when you set file_uploads = Off).
For small files, you could try to set
<form enctype="application/x-www-form-urlencoded" ...
but I did not succeed using this. I suggest you recompile PHP and comment out the part handling file uploads, as in this bug report (comment by [email protected]).
Pro: No file uploads at all; data in php://input
Con: Recompile, no vendor support
Your concerns are valid. There are a few solutions to this problem. One is to store the file in a database. NoSQL databases like MongoDB or CouchDB are built to store files efficiently. MySQL is another option and has advantages over NoSQL. A relational database like MySQL makes it very easy to implement access control because you can relate the files and users tables by a primary key.
In MySQL you can use the longblob datatype, which holds up to 2^32 - 1 bytes (about 4 GB). You can create a table that is memory resident by using the MEMORY engine: CREATE TABLE files ENGINE=MEMORY ... (note, however, that MEMORY tables cannot contain BLOB or TEXT columns, so the two do not combine directly). Furthermore, MySQL does have encryption in the form of aes_encrypt() and des_encrypt(), but they both use ECB mode, which is garbage.
Just select out the file and use it just like you would $sensi_file. Keep in mind that you are escaping the input to obtain the character literals and thus storing the raw binary.
You can create a tmpfs and mount it with a proper umask (for tmpfs, the mode=, uid= and gid= mount options). That way, the only processes that can read files from it are those running as the user who created them. And, because this is a tmpfs, nothing is stored on disk (unless pages get swapped out).
I would advise against ADW's encfs solution. Encfs does not encrypt volumes; it encrypts on a file-by-file basis, which still leaves a lot of metadata exposed.
The most obvious approach would be:
I don't see any problems with that. Just be sure you allocate enough space, though.
You can encrypt in real time with LUKS or TrueCrypt.
Edit:
After your comment I think I now understand your problem.
Apache/PHP doesn't support this.
You can, however, write your own daemon, open a socket connection to listen on, and handle incoming data in whatever way you want. Basically, you'd be writing your own web server in PHP.
It shouldn't be too much work. There are also some nice classes available; Zend has some server libraries that facilitate HTTP handling.
But you could do this far more easily in Perl. You can just read the POST data chunk by chunk into process-associated memory. PHP just has a different workflow.
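To make the "own daemon" idea concrete, here is a minimal sketch using Python's standard `http.server` module instead of PHP or Perl. The class name `InMemoryUploadHandler`, the helper `start_server`, and the choice to reply with a SHA-256 digest are all assumptions for the example. The handler reads the POST body chunk by chunk into a buffer that lives only in the process's memory.

```python
import hashlib
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

CHUNK = 64 * 1024  # read size per iteration; an arbitrary choice


class InMemoryUploadHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        remaining = int(self.headers.get("Content-Length") or 0)
        buf = bytearray()  # the upload only ever lives in this buffer
        while remaining > 0:
            chunk = self.rfile.read(min(CHUNK, remaining))
            if not chunk:
                break  # client went away early
            buf += chunk
            remaining -= len(chunk)
        # Stand-in for real processing: reply with a digest of what arrived.
        reply = hashlib.sha256(buf).hexdigest().encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def start_server(host="127.0.0.1", port=0):
    """Start the server on a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), InMemoryUploadHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

If the process crashes, the buffer goes with it, which is the property the question asks for. The sketch does not parse multipart bodies, limit upload size, or handle TLS, all of which a real deployment would need.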
Have you considered creating a RAMdisk under Linux?
http://www.vanemery.com/Linux/Ramdisk/ramdisk.html
Since this will appear as a native file-system location, you need only point your PHP instance (assuming it has the correct permissions) at this location.
I'm not sure what the ramifications are if the disk fails; I would imagine the location would become unwritable or absent. There may be further ramifications surrounding performance and large files, but as this is not my area of expertise I cannot tell you much about that.
Hope this is of some help.