Adobe online reader can't read all PDF files?

Posted 2024-12-27 06:08:48, 949 characters, 6 views, 0 comments

As the title says, I wrote a script to serve PDF files. Only certain files can be opened: every file last modified up to 29-09-2008 opens fine, and every file modified after that date does not.

Here is my code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Stienser Omroeper</title>
</head>

<body>

<?php
$file = 'E:/Omrop/'.$_GET['y'].'/'.$_GET['f'];
$filename = $_GET['f'];

header('Content-type: application/pdf');
header('Content-Disposition: inline; filename="' . $filename . '"');
header('Content-Transfer-Encoding: binary');
header('Content-Length: ' . filesize($file));
header('Accept-Ranges: bytes');
@readfile($file);
?>
</body>
</html>

$_GET contains y (the year, used for the folder structure) and f (the filename). If I echo $file afterwards and open that path via Run on my PC, it works perfectly. In the browser I get the message "This file is broken and can't be repaired."

Any ideas?

红颜悴 2025-01-03 06:08:48

This code contains a filesystem traversal vulnerability. You are performing no validation of the arguments that lead to the file. Files on disk are blindly opened and fed to the client.

What if you were on a Unix system? What would happen if someone submitted ?y=&f=../../../etc/passwd?

That doesn't even touch the fact that you aren't doing any sort of sanitization on the user's desired filename for the file. The user could submit entirely bogus data there and get an entirely bogus filename.

This code performs no error checking, and even expressly turns errors off when throwing the file at the user using readfile. This is the root of your problem. Nobody has any idea what's going wrong.

So, we can fix this.

First things first, you're going to want to do some validation on y and f. You mentioned that y is a year, so

$year = (int)$_GET['y'];

should do the trick. By forcing it into an integer, you remove any horribleness there.
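
One caveat with the plain cast: (int) turns non-numeric input into 0 and quietly accepts a string such as "2008abc" as 2008. A stricter sketch follows, assuming the year folders are named with exactly four digits. The helper name parseYear and the bounds 1990 and 2100 are made up for illustration and are not from the original script.

```php
<?php
// Hypothetical helper: returns the year as an int, or null when the input
// is not exactly four digits inside the assumed range.
function parseYear($raw, $min = 1990, $max = 2100)
{
    // The D modifier stops "$" from matching before a trailing newline.
    if (!is_string($raw) || !preg_match('/^\d{4}$/D', $raw)) {
        return null;
    }
    $year = (int) $raw;
    return ($year >= $min && $year <= $max) ? $year : null;
}

// $year is null when ?y= is missing or malformed; treat that as a 404.
$year = parseYear(isset($_GET['y']) ? $_GET['y'] : null);
```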

f is going to be a bit more tricky. You haven't given us an idea about what the files are named. You're going to want to add some pattern matching validation to ensure that only valid filenames are looked for. For example, if all the PDFs are named "report_something_0000.pdf", then you'd want to validate against, say

$file = null;
if(preg_match('/^report_something_\d{4}\.pdf$/', $_GET['f'])) {
    $file = $_GET['f'];
}
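
If the files don't all follow one fixed pattern, a looser whitelist can still keep path separators out. This is only a sketch, assuming the names consist of letters, digits, dots, underscores and hyphens and end in .pdf; isSafePdfName is a hypothetical helper, not something from the original code.

```php
<?php
// Hypothetical helper: accepts only a bare filename that ends in .pdf.
function isSafePdfName($name)
{
    if (!is_string($name)) {
        return false;
    }
    // Character whitelist: slashes and backslashes can never match.
    // D anchors "$" at the very end, i makes the extension case-insensitive.
    if (!preg_match('/^[A-Za-z0-9][A-Za-z0-9._-]*\.pdf$/Di', $name)) {
        return false;
    }
    // Defence in depth: refuse any ".." sequence as well.
    return strpos($name, '..') === false;
}
```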

Now that we've got a valid filename and a valid year directory, the next step is making sure the file exists.

$path = 'E:/Omrop/' . $year . '/' . $file;
if(!$file || !file_exists($path) || !is_readable($path)) {
    header('HTTP/1.0 404 File Not Found', true, 404);
    header('Content-type: text/html');
    echo "<h1>404 File Not Found</h1>";
    exit;
}

If $file ended up not being set because the pattern match failed, or if the resulting file path wasn't found, then the script will bail with an error message.
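
As an extra safeguard that does not depend on the pattern checks, the resolved path can be compared against the base directory. The sketch below uses realpath() for that; the function name resolveInside is hypothetical.

```php
<?php
// Hypothetical helper: returns the canonical path of $relative when it is a
// readable regular file inside $baseDir, or null otherwise.
function resolveInside($baseDir, $relative)
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null; // the base directory itself is missing
    }
    // realpath() resolves ".." and symlinks and returns false
    // when the target does not exist.
    $real = realpath($base . DIRECTORY_SEPARATOR . $relative);
    if ($real === false || !is_file($real) || !is_readable($real)) {
        return null;
    }
    // The resolved path must still begin with "<base><separator>".
    $prefix = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($real, $prefix, strlen($prefix)) === 0 ? $real : null;
}
```

With the layout from the question, a call could look like resolveInside('E:/Omrop', $year . '/' . $file), with a null result handled the same way as the 404 case.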

I'm going to guess that your problems opening the newer PDFs (the ones modified after 29-09-2008) are caused by the files not existing or having bad permissions. You're feeding Adobe Reader the right headers and then no data.

You'll also want to perform the same kind of sanity checking on the user-supplied desired filename. Again, I don't know your requirements here, but make sure that nothing bogus can sneak in.
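
For the name that goes into Content-Disposition, one option is to reduce it to a conservative character set before it reaches the header. A sketch, where safeDispositionName is a hypothetical helper and the fallback name document.pdf is an assumption:

```php
<?php
// Hypothetical helper: strips everything that could break out of the
// quoted filename in a Content-Disposition header.
function safeDispositionName($name, $fallback = 'document.pdf')
{
    // Keep only the last path component, then drop every character
    // outside a small whitelist (quotes, CR/LF, spaces, slashes all go).
    $clean = preg_replace('/[^A-Za-z0-9._-]/', '', basename((string) $name));
    return ($clean === null || $clean === '') ? $fallback : $clean;
}
```

The result can then be used in place of the raw $_GET['f'] value when building the header line.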

Next, get rid of the @ in front of readfile. It's suppressing any actual errors, and you're going to want to see them. Because you probably don't want to see them in the output, make sure to set up an error log instead.
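
A minimal sketch of that setup, placed at the very top of the script. The log file location is an assumption; any path the web server can write to would do, and the same settings can live in php.ini instead.

```php
<?php
// Keep error text out of the response body, where it would corrupt the PDF.
ini_set('display_errors', '0');
// Record errors in a log file instead.
ini_set('log_errors', '1');
// Assumed location; pick a path the web server user can write to.
ini_set('error_log', sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'pdf_viewer_errors.log');
// Report everything while debugging.
error_reporting(E_ALL);
```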

Finally... how is this code even working? You're emitting headers in the middle of HTML! Not only that, you're giving an explicit Content-Length while doing so. You should be getting a hell of a lot of errors from this. Are you sure that you didn't accidentally copy/paste some code wrong here? Maybe you forgot a section at the top where you're calling ob_start()? Regardless, ditch everything before the opening <?php tag and everything after the closing ?> tag, because any HTML around the PDF bytes ends up in the response and corrupts the file.
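
A stripped-down version of the script could look like the sketch below: no HTML around the PHP, and headers sent only once the file is known to be readable. The helper names pdfHeaders and sendPdf are made up, and $path and $downloadName are assumed to have been validated already. Accept-Ranges is left out because this sketch does not implement range requests.

```php
<?php
// Builds the response headers for an inline PDF as a list of strings.
function pdfHeaders($downloadName, $size)
{
    return [
        'Content-Type: application/pdf',
        'Content-Disposition: inline; filename="' . $downloadName . '"',
        'Content-Length: ' . (int) $size,
    ];
}

// Sends the file; returns false without sending headers or content
// when the file is unreadable or output has already started.
function sendPdf($path, $downloadName)
{
    if (!is_file($path) || !is_readable($path) || headers_sent()) {
        return false;
    }
    foreach (pdfHeaders($downloadName, filesize($path)) as $line) {
        header($line);
    }
    // No @ here: a failure should show up in the error log.
    return readfile($path) !== false;
}
```

The calling script would invoke sendPdf($path, $downloadName) as its only output and fall back to the 404 response when it returns false.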
