Preventing directory traversal in PHP but allowing paths

Posted on 2024-10-03 07:28:44

I have a base path /whatever/foo/, and $_GET['path'] should be relative to it.

However, how do I accomplish this (reading the directory) without allowing directory traversal?

For example, a pattern such as

/\.\.|\.\./

will not filter properly.

Comments (9)

迷离° 2024-10-10 07:28:44

Well, one option would be to compare the real paths:

$basepath = '/foo/bar/baz/';
$realBase = realpath($basepath);

$userpath = $basepath . $_GET['path'];
$realUserPath = realpath($userpath);

if ($realUserPath === false || strpos($realUserPath, $realBase) !== 0) {
    //Directory Traversal!
} else {
    //Good path!
}

Basically, realpath() resolves the provided path to an actual physical path (resolving symlinks, .., ., /, //, etc.). So if the real user path does not start with the real base path, the request is attempting a traversal. Note that the output of realpath() will not contain any "virtual directories" such as . or ..

乱了心跳 2024-10-10 07:28:44

ircmaxell's answer wasn't fully correct. I've seen that solution in several snippets but it has a bug which is related to the output of realpath(). The realpath() function removes the trailing directory separator, so imagine two contiguous directories such as:

/foo/bar/baz/

/foo/bar/baz_baz/

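Because realpath() removes the trailing directory separator, that method would report "good path" if $_GET['path'] were "../baz_baz", since the check would amount to

strpos("/foo/bar/baz_baz", "/foo/bar/baz")

which returns 0. Accepting only the base path itself, or a path that starts with the base path plus a directory separator, avoids this. Maybe: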

$basepath = '/foo/bar/baz/';
$realBase = realpath($basepath);

$userpath = $basepath . $_GET['path'];
$realUserPath = realpath($userpath);

// Reject unless the path is the base itself or lies below "base + separator".
if ($realUserPath === false || ($realUserPath !== $realBase && strpos($realUserPath, $realBase . DIRECTORY_SEPARATOR) !== 0)) {
    //Directory Traversal!
} else {
    //Good path!
}
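
The separator-aware comparison can also be wrapped in a small helper. This is only a sketch, not the answerer's code: the function name isInsideBase is hypothetical, and it assumes that both the base directory and the requested path already exist, since realpath() returns false otherwise.

```php
<?php
/**
 * Hypothetical helper: returns true only if $userPath, taken relative to
 * $basePath, resolves to $basePath itself or to something inside it.
 */
function isInsideBase(string $basePath, string $userPath): bool
{
    $realBase = realpath($basePath);
    $realUser = realpath($basePath . DIRECTORY_SEPARATOR . $userPath);

    // realpath() returns false for paths that do not exist.
    if ($realBase === false || $realUser === false) {
        return false;
    }

    // Strip any trailing separator (e.g. when the base is "/") so that
    // exactly one separator is appended for the prefix comparison.
    $prefix = rtrim($realBase, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;

    return $realUser === $realBase || strpos($realUser, $prefix) === 0;
}
```

With a base of /foo/bar/baz, a request for ../baz_baz resolves to /foo/bar/baz_baz, which does not start with /foo/bar/baz/ and is therefore rejected.
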
温柔少女心 2024-10-10 07:28:44

It is not sufficient to check for patterns like ../. Take "../", for instance, which URI-encodes to "%2e%2e%2f". If your pattern check happens before the decode, you will miss this traversal attempt. There are other tricks attackers can use to circumvent a pattern checker, especially with encoded strings.

I've had the most success stopping these by canonicalizing any path string to its absolute path using something like realpath(), as ircmaxell suggests. Only then do I check for traversal attacks by matching the result against a base path I've predefined.
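
To illustrate the ordering problem, here is a small sketch. Both function names are hypothetical. Note that PHP already decodes $_GET values once, so the issue shows up mainly with raw input (for example $_SERVER['QUERY_STRING']) or with double-encoded values.

```php
<?php
// Hypothetical naive filter: looks for "../" in the string exactly as given.
function naiveHasTraversal(string $raw): bool
{
    return strpos($raw, '../') !== false;
}

// Hypothetical variant: decode until the string stops changing, then check.
function decodedHasTraversal(string $raw): bool
{
    $prev = null;
    $cur = $raw;
    // Repeated decoding handles double encoding such as "%252e%252e%252f".
    while ($cur !== $prev) {
        $prev = $cur;
        $cur = rawurldecode($cur);
    }
    return strpos($cur, '../') !== false;
}
```

Even the decoding variant is only a coarse filter; comparing canonical paths, as described above, remains the actual guard.
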

往事风中埋 2024-10-10 07:28:44

When looking into the creation of new files or folders, I figured I could use a two-stage approach:

First, check for traversal attempts using a custom implementation of a realpath()-like function that works for arbitrary paths, not just existing files. There's a good starting point here. Extend it with urldecode() and whatever else you think may be worth checking.

With this crude method you can filter out some traversal attempts, but you may miss some hackish combination of special characters, symlinks, escape sequences, etc. However, since you know for sure that the target file does not exist (check using file_exists()), no one can overwrite anything. The worst case is that someone gets your code to create a file or folder somewhere, which may be an acceptable risk in most cases, provided your code does not allow them to write into that file or folder straight away.

Finally, once the file or folder has been created, the path points to an existing location, so you can do the proper check with realpath() using the methods suggested above. If at this point it turns out that a traversal has happened, you are still more or less safe, as long as you prevent any attempt to write into the target path. You can also delete the target file or directory right then and report a traversal attempt.

I'm not saying it cannot be hacked, since it may still allow illegitimate changes to the file system. It is still better than doing only custom checks, which cannot use realpath(). The window for abuse left open by briefly creating an empty file or folder somewhere is smaller than the one left open by making it permanent and writable, which is what would happen with a custom check alone that misses some edge case.

Please correct me if I'm wrong!
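
As a sketch of the first stage, the function below normalizes a path purely as a string, so it also works for files that do not exist yet. It is not the implementation the answer links to: the name normalizePath and the decision to return null when .. would climb above the starting point are assumptions made for illustration. It does not resolve symlinks, which is why the later realpath() check is still needed.

```php
<?php
/**
 * Hypothetical lexical normalizer: resolves "." and ".." segments without
 * touching the file system. Returns null if ".." would escape the start.
 */
function normalizePath(string $path): ?string
{
    $path = str_replace('\\', '/', $path);
    $isAbsolute = $path !== '' && $path[0] === '/';
    $stack = [];

    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue; // skip empty segments ("//") and "."
        }
        if ($segment === '..') {
            if ($stack === []) {
                return null; // would climb above the starting point
            }
            array_pop($stack);
            continue;
        }
        $stack[] = $segment;
    }

    return ($isAbsolute ? '/' : '') . implode('/', $stack);
}
```

A caller could normalize the user-supplied relative path first, reject it when the result is null, and only then append it to the base path.
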

倚栏听风 2024-10-10 07:28:44

You may be tempted to use a regex to remove every ../, but PHP has built-in functions that do a much better job:

$page = basename(realpath($_GET['path']));

basename() strips all directory information from the path, e.g. ../pages/about.php becomes about.php.

realpath() returns the full path to the file, e.g. about.php becomes /home/www/pages/about.php, but only if the file exists.

Combined, they return just the file's name, but only if the file exists.
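
A possible way to use these two functions against a fixed directory is sketched below. The function name resolvePage and its parameters are hypothetical. Because basename() discards every directory component, this variant only suits a flat directory of files and does not allow sub-paths.

```php
<?php
/**
 * Hypothetical helper: maps a requested name to a regular file directly
 * inside $baseDir, or returns null if there is no such file.
 */
function resolvePage(string $baseDir, string $requested): ?string
{
    // Keep only the last path component, e.g. "../pages/about.php" -> "about.php".
    $name = basename($requested);

    // realpath() returns false when the file does not exist.
    $full = realpath($baseDir . DIRECTORY_SEPARATOR . $name);

    // is_file() also rejects directories such as "." or "..".
    return ($full !== false && is_file($full)) ? $full : null;
}
```

Note that a symlink placed inside the base directory would still be followed, so this does not replace the base-path comparison when symlinks are a concern.
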

起风了 2024-10-10 07:28:44

I have written a function to check for traversal:

function isTraversal($basePath, $fileName)
{
    if (strpos(urldecode($fileName), '..') !== false)
        return true;
    $realBase = realpath($basePath);
    $userPath = $basePath.$fileName;
    $realUserPath = realpath($userPath);
    while ($realUserPath === false)
    {
        $userPath = dirname($userPath);
        $realUserPath = realpath($userPath);
    }
    return strpos($realUserPath, $realBase) !== 0;
}

This line alone, if (strpos(urldecode($fileName), '..') !== false), should be enough to prevent traversal. However, there are many different ways hackers can traverse directories, so it's better to also make sure the resolved path starts with the real base path.

Just checking that the path starts with the real base path is not enough, because a hacker could traverse to the current directory and discover the directory structure.

The while loop allows the code to work when $fileName does not exist.

慕巷 2024-10-10 07:28:44

In my version, I replaced $_GET['file'] with $_SERVER['REQUEST_URI'] to get the requested URI from the server variables. I then used parse_url() with the PHP_URL_PATH constant to extract the path component from the URI, excluding any query parameters.

The rest of the script performs path normalization, validating the file path against the base directory, checking file existence, and serving the file to the user.

By using $_SERVER['REQUEST_URI'], you can handle the file path extraction from the URL even when the file parameter is not explicitly set as a query parameter.

$baseDirectory = '/path/to/file/'; // Define the base directory where the file(s) is/are located

if ($_SERVER['HTTP_HOST'] == 'localhost') { 
    $baseDirectory = 'C:\xampp\htdocs'; // For localhost use
}

// Check if the 'REQUEST_URI' is set in the server variables
if (!isset($_SERVER['REQUEST_URI'])) {
    $this->logger->logEvent('REQUEST_URI NOT SET: '.__LINE__.' '.__FILE__);
    die('Invalid file request');
}

// Get the requested URI from the server variables
$requestedUri = $_SERVER['REQUEST_URI'];

// Extract the file path from the requested URI
$requestedFile = parse_url($requestedUri, PHP_URL_PATH);

// Normalize both paths to remove any relative components.
// realpath() returns false when the path does not exist.
$realBase = realpath($baseDirectory);
$requestedFile = realpath($baseDirectory . $requestedFile);

// Check if the requested file exists
if ($realBase === false || $requestedFile === false) {
    $this->logger->logEvent($requestedUri. ' not found: '.__LINE__.' '.__FILE__);
    die('File not found');
}

// Check if the normalized file path is inside the base directory
if (strpos($requestedFile, $realBase . DIRECTORY_SEPARATOR) !== 0) {
    $this->logger->logEvent('Directory Traversal: '.$requestedUri);
    die('Invalid file path');
}
// Serve the file to the user
无风消散 2024-10-10 07:28:44

Use this or other solutions at your own risk. You should check the expectations in the test code and then modify the path traversal code to fit your needs. This may also miss some edge cases I didn't think to test.

See the edit history for the previous version, which was way over-engineered.

Remove path traversal

<?php
/**
 * Remove path traversal from a string path. Path expected to be from a url, so it is urldecoded
 */
function remove_path_traversal(string $path): string {
    $path = str_replace("\\","/",$path);
    $has_lead = substr($path,0,1)=='/';
    $has_trail = substr($path,-1)=='/';
    $path = '/'.$path.'/';
    while (strpos($path, '/../')!==false || strpos($path, '/./')!==false){
        $path = '/'.str_replace(['/../','/./'],'/',$path);
    }
    $path = str_replace(['////','///','//'],'/',$path);
    if (!$has_trail&&substr($path,-1)=='/')$path = substr($path, 0,-1);
    if (!$has_lead)$path = substr($path,1);
    return $path;
}

Test the function

Create a PHP file, for example traversaltest.php, and run it with php traversaltest.php

<?php
require(__DIR__.'/../src/functions.php'); // or wherever you defined the function

$urls = [
    "/" => "/",
    "/test" => "/test",
    "/test.html" => "/test.html",
    "/test/./" => "/test/",
    "/test/./some-file.txt" => "/test/some-file.txt",
    "/test/../some-file.txt" => "/test/some-file.txt",
    "//../test/../some-file.txt" => "/test/some-file.txt",
    "/dir/../../../../test.html" => "/dir/test.html",
    "/dir/./../.../..../...../....../test.html" => "/dir/.../..../...../....../test.html",
    "../abc/def/" => "abc/def/",
    "/abc/def/.." => "/abc/def",
    "abc/def/.." => "abc/def",
];

echo "\n\n";
foreach ($urls as $url => $target_file){
    $safe_path = \Bear\remove_path_traversal($url);

    if ($safe_path == $target_file)$status = "pass";
    else $status = "fail";
    echo "\n($status) Input($url), output($safe_path), expected($target_file)";
}

echo "\n\n";
哥,最终变帅啦 2024-10-10 07:28:44

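1. Put an empty index.htm in the directory to block directory indexing.

2. Filter the query string on start:

// Path Traversal Attack
// strpos() returns 0 for a match at the very start, so compare strictly with false.
if (strpos($_SERVER["QUERY_STRING"], "../") !== false) {
    exit("P.T.A. B-(");
}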
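
One detail worth illustrating: strpos() returns the integer 0 when the match is at the very start of the string, and 0 is falsy in PHP, so the result has to be compared strictly against false. A minimal sketch, with a hypothetical helper name:

```php
<?php
// Hypothetical helper: true if the raw query string contains "../" anywhere,
// including at position 0.
function queryHasTraversal(string $queryString): bool
{
    return strpos($queryString, '../') !== false;
}
```

As other answers in this thread point out, a substring check like this misses encoded variants such as %2e%2e%2f, so it is best treated as an extra filter rather than the only protection.
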