Uploading PDF or .doc files and security

Posted on 2024-09-04 00:37:17

I have a script that lets the user upload text files (PDF or doc) to the server, then the plan is to convert them to raw text. But until the file is converted, it's in its raw format, which makes me worried about viruses and all kinds of nasty things.

Any ideas on what I need to do to minimize the risk from these unknown files? How can I check that a file is clean, that it is really the format it claims to be, and that it won't crash the server?

Comments (6)

浅笑依然 2024-09-11 00:37:17

I posted this as a comment to Aerik, but it's really the answer to the question.

If you have PHP >= 5.3 use finfo_file(). If you have an older version of PHP you can use mime_content_type() (less reliable) or load the Fileinfo extension from PECL.

Both of these functions return the MIME type of the file (by looking at the type of data inside it). For PDF it should be

application/pdf

For a Word doc it could be a few things. Generally it should be

application/msword

If your server is running *nix, make sure the files you save aren't executable. Even better: save them to a folder outside the web root, so the web server doesn't serve them directly. Your code can still read the files, but someone requesting a URL won't be able to reach them at all.
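A minimal sketch of the MIME check described above, assuming the Fileinfo extension is loaded. The function name `detect_allowed_type()`, the allowlist, and the upload field name `document` are made up for illustration; they are not from the original answer.

```php
<?php
// Sketch: accept an upload only if its detected MIME type is on an allowlist.
// detect_allowed_type() and $allowed are hypothetical names.

function detect_allowed_type(string $path, array $allowed): ?string
{
    $finfo = new finfo(FILEINFO_MIME_TYPE); // requires the Fileinfo extension
    $type  = $finfo->file($path);           // e.g. "application/pdf", or false on failure

    return ($type !== false && in_array($type, $allowed, true)) ? $type : null;
}

$allowed = ['application/pdf', 'application/msword'];

// Usage with a real upload (the field name "document" is an assumption):
// $type = detect_allowed_type($_FILES['document']['tmp_name'], $allowed);
// if ($type === null) { /* reject the upload */ }
```

The type is detected from the file's contents, not from the name or the Content-Type the browser sent, so it is harder to fake. It still only tells you what the file looks like, not that it is harmless.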

年华零落成诗 2024-09-11 00:37:17

If you've ever opened or executed any user-uploaded file on the server, you should expect that your server is now compromised.

Even a JPG can contain executable PHP. If you include or require the file in any way in your script, that can also compromise your server. Take an image you stumble upon on the web that is served like so...

header('Content-type: image/jpeg');
header('Content-Disposition: inline; filename="test.jpg"');

echo file_get_contents('/some_image.jpg');
echo '<?php phpinfo(); ?>';

... which you save and re-host on your own server like so...

$q = $_GET['q']; // pretend this is sanitized for the moment
header('Content-type: '.mime_content_type($q));
header('Content-Disposition: inline; filename="'.$_GET['q'].'"');

include $q;

...will execute phpinfo() on your server. Your site users can then simply save the image to their desktop and open it with notepad to see your server settings. Simply converting the file to another format will discard that script, and should not trigger any actual virus attached to the file.
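For contrast, here is a rough sketch of serving the same file as plain data. readfile() copies the bytes to the output without passing them to the PHP interpreter, so an embedded `<?php ... ?>` block is sent as-is and never runs. The helper name `send_file_as_data()` is an assumption for illustration.

```php
<?php
// Sketch: serve an uploaded file as data instead of including it.
// send_file_as_data() is a hypothetical helper name.

function send_file_as_data(string $path, string $contentType): bool
{
    if (!is_file($path)) {
        return false;
    }
    if (!headers_sent()) {
        header('Content-Type: ' . $contentType);
        header('X-Content-Type-Options: nosniff'); // ask browsers not to guess the type
    }
    // readfile() returns the number of bytes sent, or false on failure.
    return readfile($path) !== false;
}
```

The path passed in should still come from your own code or be validated first; this only removes the code execution problem, not the problem of reading arbitrary files.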

It might also be best to run a virus scan on upload. You should be able to run a scanner as a system command and parse its output to see whether it found anything. Your site users should be checking the files they download anyway.
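One way the scan-on-upload idea might look, assuming ClamAV's `clamscan` is installed on the server. The answer does not name a scanner, so that choice is an assumption, as is the function name `scan_upload()`. clamscan exits with 0 when nothing is found and 1 when a virus is found; any other code is treated here as an error.

```php
<?php
// Sketch: run a command-line scanner on an uploaded file and map its exit
// code to a result. Assumes clamscan-style exit codes (0 = clean,
// 1 = infected, anything else = error). scan_upload() is a hypothetical name.

function scan_upload(string $path, string $scanner = 'clamscan'): string
{
    $cmd = escapeshellcmd($scanner) . ' --no-summary ' . escapeshellarg($path) . ' 2>&1';
    exec($cmd, $output, $exitCode);

    if ($exitCode === 0) {
        return 'clean';
    }
    if ($exitCode === 1) {
        return 'infected';
    }
    return 'error'; // scanner missing, unreadable file, and so on
}

// Usage (the path is illustrative):
// if (scan_upload('/path/to/upload.tmp') !== 'clean') { /* reject or quarantine */ }
```

Treating 'error' the same as 'infected' is the cautious choice: if the scanner could not run, the file has not been checked.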

Otherwise, even a virus-laden user-uploaded file just sitting there on your server shouldn't harm anything... as far as I know.

鹤舞 2024-09-11 00:37:17

Hmm, IMHO you shouldn't really have to worry about the document type; if you use a good converter to produce the raw text, it ought to do those checks itself without crashing the server.

Just like your client computer, a server should always be protected against viruses and attacks, so a newly uploaded file should be checked before it is processed.

I've never seen a web app do that kind of check itself. Have you?

无声情话 2024-09-11 00:37:17

If you're viewing the PDF, there is nothing you can do besides getting antivirus software and praying that it catches a maliciously formed PDF.

Conversion software normally isn't targeted though, so if you just convert it and view the text format output, you should be somewhat safer.


Oh, you are worried about the server. Just don't execute the uploaded files...

江挽川 2024-09-11 00:37:17

IMHO, until something tries to execute it, it's just a file. However, you can definitely check (but do not rely upon, as clarified below) the file extension, and could also research the file formats to see if there are any characteristic sequences of bytes in the header of the file that you could verify.
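As a rough illustration of the header-byte check: PDF files begin with the ASCII bytes `%PDF-`, and legacy .doc files are OLE2 compound files, which begin with the bytes D0 CF 11 E0 A1 B1 1A E1. The function name `sniff_signature()` is hypothetical.

```php
<?php
// Sketch: identify a file by its leading "magic" bytes.
// sniff_signature() is a hypothetical name.

function sniff_signature(string $path): ?string
{
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }
    $head = (string) fread($handle, 8); // the longest signature checked is 8 bytes
    fclose($handle);

    if (strncmp($head, '%PDF-', 5) === 0) {
        return 'pdf';
    }
    if ($head === "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1") {
        return 'ole2'; // container used by .doc, but also by .xls and .ppt
    }
    return null;
}
```

A matching signature only shows that the file starts like a PDF or an OLE2 document. It does not prove the rest of the file is well formed or harmless.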

你另情深 2024-09-11 00:37:17

There are three ways to secure uploaded files:
best: put the files on a separate server, which is the most secure option
better: put them outside your WWW folder, so nobody can reach them by URL and you must use readfile() or file_get_contents() to read and output them
last: put the files inside the WWW folder and use an .htaccess file in that folder to stop others from executing the files or adding unknown files
This is what I do with uploaded files:
I put them outside the web root and rename them. I save a fake name in the database and derive the file's real name from it with an algorithm.

After uploading a file outside the web root, you can access it the way I do here. This is the content of a file called getfile.php:

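    <?php
    // getfile.php: serve a file stored outside the web root.

    define('DS', DIRECTORY_SEPARATOR);

    // Fake (public) name of the file, its extension, and an optional subfolder.
    $uniqueid = isset($_GET['uniqueid']) ? (string) $_GET['uniqueid'] : '';
    $ext      = isset($_GET['ext']) ? strtolower((string) $_GET['ext']) : '';
    $dir      = isset($_GET['dir']) ? (string) $_GET['dir'] : '';

    // Only extensions in this list are served; I just use it for images.
    $content_types = array(
        'jpg' => 'image/jpeg',
        'png' => 'image/png',
        'gif' => 'image/gif',
    );

    // Reject bad input: wrong id length, unknown extension, or a subfolder
    // containing anything other than letters, digits, '_', '-' and '/'
    // (no dots are allowed, so '..' cannot appear).
    if (strlen($uniqueid) !== 32
        || !isset($content_types[$ext])
        || ($dir !== '' && !preg_match('#^[A-Za-z0-9_/-]+$#', $dir))) {
        header('HTTP/1.1 404 Not Found');
        exit;
    }

    $baseaddress = '..' . DS . 'foldername outside of web root';
    $path = $baseaddress . DS . ($dir !== '' ? $dir . DS : '');
    // The real name on disk is derived from the fake name by an algorithm.
    $path .= md5($uniqueid . $uniqueid . $uniqueid . $ext . '*#$%^&') . '.' . $ext;

    if (!is_file($path)) {
        header('HTTP/1.1 404 Not Found');
        exit;
    }

    header('Content-Type: ' . $content_types[$ext]);
    readfile($path);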

In an img src, or wherever you need to show the file, do this (this is for my images):

<img src="/getfile.php?uniqueid=put fake file name here&ext=put extension here&dir=put rest of file address here" >

Hope it helps. Don't hesitate to ask more questions.
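The upload side of this scheme is described above but not shown. A minimal sketch of the naming part might look like the following, assuming PHP 7 or later for random_bytes(). The function names are hypothetical, and the salt is simply the literal string that appears in getfile.php.

```php
<?php
// Sketch of the naming scheme: a random public ("fake") id is stored in the
// database, and the on-disk name is derived from it. new_unique_id() and
// real_file_name() are hypothetical names.

function new_unique_id(): string
{
    return bin2hex(random_bytes(16)); // 32 hex characters
}

function real_file_name(string $uniqueid, string $ext, string $salt = '*#$%^&'): string
{
    // Same derivation as in getfile.php, so both sides agree on the name.
    return md5($uniqueid . $uniqueid . $uniqueid . $ext . $salt) . '.' . $ext;
}

$id = new_unique_id();
echo real_file_name($id, 'jpg'), "\n"; // 32 hex characters followed by ".jpg"
```

The upload handler would store `$id` in the database and save the file under the derived name, for example with move_uploaded_file(), in the folder outside the web root.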
