What is a good Perl regex to untaint an absolute path?

Posted 2024-08-07 20:42:06

Well, I tried and failed so, here I am again.

I need to match my absolute path pattern:

 /public_html/mystuff/10000001/001/10/01.cnt

I am running in taint mode, with this setup:

#!/usr/bin/perl -Tw
use CGI::Carp qw(fatalsToBrowser);
use strict;
use warnings;
$ENV{PATH} = "bin:/usr/bin";
delete ($ENV{qw(IFS CDPATH BASH_ENV ENV)});

I need to open the same file a couple times or more and taint forces me to untaint the file name every time. Although I may be doing something else wrong, I still need help constructing this pattern for future reference.

my $file = "$var[5]";
if ($file =~ /(\w{1}[\w-\/]*)/) {
$under = "/$1\.cnt";
} else {
ErroR();
}

You can see by my beginner attempt that I am close to clueless.

I had to add the forward slash and extension to $1 due to my poorly constructed, but working, regex.

So, I need help learning how to fix my expression so $1 represents /public_html/mystuff/10000001/001/10/01.cnt

Could someone hold my hand here and show me how to make:

$file =~ /(\w{1}[\w-\/]*)/ match my absolute path /public_html/mystuff/10000001/001/10/01.cnt ?

Thanks for any assistance.

Comments (2)

╄→承喏 2024-08-14 20:42:06

Edit: Using $ in the pattern (as I did before) is not advisable here because it can match \n at the end of the filename. Use \z instead because it unambiguously matches the end of the string.
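
To illustrate the difference, here is a minimal sketch that tests the same pattern body with both anchors against a filename that has a trailing newline. The variable names ($body, $dollar_ok, $z_ok) are made up for this example.

```perl
use strict;
use warnings;

# Shared pattern body, without any end-of-string anchor.
my $body = qr!/public_html/mystuff/[0-9]{8}/[0-9]{3}/[0-9]{2}/[0-9]{2}\.cnt!;

# A filename with a trailing newline, e.g. read from input without chomp.
my $fn = "/public_html/mystuff/10000001/001/10/01.cnt\n";

# $ may match just before a final newline, so this test accepts the string.
my $dollar_ok = ( $fn =~ /^$body$/ ) ? 1 : 0;

# \z matches only at the very end of the string, so this test rejects it.
my $z_ok = ( $fn =~ /^$body\z/ ) ? 1 : 0;

printf "dollar anchor: %d, z anchor: %d\n", $dollar_ok, $z_ok;
# prints "dollar anchor: 1, z anchor: 0"
```

With $ the newline would survive into the untainted value, which is rarely what you want in a filename.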

Be as specific as possible in what you are matching:

my $fn = '/public_html/mystuff/10000001/001/10/01.cnt';

if ( $fn =~ m!
    ^(
        /public_html
        /mystuff
        /[0-9]{8}
        /[0-9]{3}
        /[0-9]{2}
        /[0-9]{2}\.cnt
     )\z!x ) {
     print $1, "\n";
 }

Alternatively, you can reduce the vertical space the code takes up by putting what I assume is a common prefix, '/public_html/mystuff', in a variable, combining the various components in a qr// construct (see perldoc perlop), and then using the conditional operator ?: to assign the result:

#!/usr/bin/perl

use strict;
use warnings;

my $fn = '/public_html/mystuff/10000001/001/10/01.cnt';
my $prefix = '/public_html/mystuff';
my $re = qr!^($prefix/[0-9]{8}/[0-9]{3}/[0-9]{2}/[0-9]{2}\.cnt)\z!;

$fn = $fn =~ $re ? $1 : undef;

die "Filename did not match the requirements" unless defined $fn;
print $fn, "\n";

Also, I cannot reconcile using a relative path as you do in

$ENV{PATH} = "bin:/usr/bin";

with using taint mode. Did you mean the following?

$ENV{PATH} = "/bin:/usr/bin";
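
A related point about the setup in the question: delete ($ENV{qw(...)}) uses the scalar sigil, so it addresses a single hash element rather than the four variables listed. A sketch of the environment cleanup in the form perlsec recommends, assuming /bin and /usr/bin are the directories you actually want on the path:

```perl
use strict;
use warnings;

# Absolute directories only; taint mode treats a relative entry
# such as "bin" as an insecure directory.
$ENV{PATH} = '/bin:/usr/bin';

# Hash slice: the @ sigil makes this remove all four keys at once.
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
```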

你在看孤独的风景 2024-08-14 20:42:06

You talk about untainting the file path every time. That's probably because you aren't compartmentalizing your program steps.

In general, I break up these sorts of programs into stages. One of the earlier stages is data validation. Before I let the program continue, I validate all the data that I can. If any of it doesn't fit what I expect, I don't let the program continue. I don't want to get halfway through something important (like inserting stuff into a database) only to discover something is wrong.

So, when you get the data, untaint all of it and store the values in a new data structure. Don't use the original data or the CGI functions after that. The CGI module is just there to hand data to your program. After that, the rest of the program should know as little about CGI as possible.
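
As a rough sketch of that idea, the validation stage below checks the raw value once and returns a hash of captured (and therefore untainted) copies. The names validate_input, $PATH_RE, and the file key are hypothetical, and the pattern is borrowed from the other answer; a real program would pass in its CGI parameters instead of the literal used here.

```perl
use strict;
use warnings;

# Hypothetical pattern, borrowed from the other answer; adjust to the real layout.
my $PATH_RE = qr!^(/public_html/mystuff/[0-9]{8}/[0-9]{3}/[0-9]{2}/[0-9]{2}\.cnt)\z!;

# Validation stage: check every raw value once and return only the captured
# copies. Dies before any real work has started if something is wrong.
sub validate_input {
    my (%raw) = @_;
    my %clean;

    ( defined( $raw{file} ) && $raw{file} =~ $PATH_RE )
        or die "Invalid file path\n";
    $clean{file} = $1;    # capture groups are untainted

    return \%clean;
}

# Stand-in for data that would normally come from the CGI parameters.
my $clean = validate_input( file => '/public_html/mystuff/10000001/001/10/01.cnt' );

# Later stages reuse $clean->{file} as often as needed, with no further untainting.
print "Validated: $clean->{file}\n";
```

The rest of the program then only ever sees $clean, so the file can be opened several times without repeating the regex.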

I don't know what you are doing, but it's almost always a design smell to take actual filenames as input.
