In which languages is it a security hole to use user-supplied regular expressions?

Posted 2024-10-04 20:15:44

Edit: tchrist has informed me that my original accusations about Perl's insecurity are unfounded. However, the question still stands.

I know that in Perl, you can embed arbitrary code in a regular expression, so obviously accepting a user-supplied regex and matching it allows arbitrary code execution and is a clear security hole. But is this true for all languages that use regular expressions? Is it true for all languages that use "Perl-compatible" regular expressions? In which languages are user-supplied regexes safe to use, and in which languages do they allow arbitrary code execution or other security holes?

Comments (8)

攀登最高峰 2024-10-11 20:15:44

In most languages, allowing users to supply regular expressions means that you open yourself up to a denial-of-service attack.

Some types of regular expressions are extremely CPU-intensive to execute. So in general it's a bad idea to allow users to enter regular expressions that will be executed on a remote system.

For more info, read this page: http://www.regular-expressions.info/catastrophic.html
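
One common way to limit the damage is to run the match under a time limit. The following is only a minimal sketch in Python, not something proposed in this thread: the helper name search_with_timeout, the exit-code convention, and the one-second default are all assumptions made for illustration. It runs the match in a child interpreter so that a runaway pattern can be killed.

```python
import subprocess
import sys

# Child program (hypothetical convention): compile argv[1], search argv[2].
# Exit code 0 = match, 1 = no match, 2 = pattern failed to compile.
_CHILD = r"""
import re, sys
try:
    rx = re.compile(sys.argv[1])
except re.error:
    sys.exit(2)
sys.exit(0 if rx.search(sys.argv[2]) else 1)
"""


def search_with_timeout(pattern, text, timeout_s=1.0):
    """Return True/False for match/no match, or None if the pattern is
    invalid or the search did not finish within timeout_s seconds."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            timeout=timeout_s,  # the child is killed when this expires
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode == 0:
        return True
    if proc.returncode == 1:
        return False
    return None  # invalid pattern or unexpected failure in the child
```

Starting a process per match is slow, and passing the text on the command line limits its size, so this is better read as an illustration of the idea than as a production design.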

带上头具痛哭 2024-10-11 20:15:44

This is not true: you cannot execute code callbacks in Perl by sneaking them in an evaluated regex. This is forbidden. You have to specifically override that with a lexically scoped

use re "eval";

if you expect to have both interpolation and code escapes happening in the same pattern.

Watch:

% perl -le '$x = "(?{ die 'naughty' })"; "aaa" =~ /$x/'
Eval-group not allowed at runtime, use re 'eval' in regex m/(?{ die naughty })/ at -e line 1.
Exit 255

% perl -Mre=eval -le '$x = "(?{ die 'naughty' })"; "aaa" =~ /$x/'
naughty at (re_eval 1) line 1.
Exit 255

旧话新听 2024-10-11 20:15:44

It's generally dynamic languages with an eval facility that tend to have the ability to execute code from regular expressions. In static languages (i.e. those requiring a separate compilation step) there is generally no way to execute code that wasn't compiled, so evaluating code from within a regex is impossible.

Without a way to embed code in a regex, the worst a user can do is write a regex that takes a long time to evaluate.

相思故 2024-10-11 20:15:44

1) Vulnerabilities are found in regex libraries, such as this buffer overflow that affects WebKit and allows any attacker to gain remote code execution by accessing the regex library from JavaScript.

2) It is a DoS condition in C#.

3) User-supplied regexes can be dangerous in PHP because of modifiers. Adding the /e modifier makes preg_replace evaluate the replacement string as PHP code for each match. In this case system() will be eval()'ed. (The /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0.)

preg_replace("/.*/e", "system('echo /etc/passwd')", $subject);  // $subject: any string being searched (hypothetical name)

Or in the form of a vulnerability:

preg_replace($_GET['regex'], $_GET['check'], $subject);

凤舞天涯 2024-10-11 20:15:44

User-supplied regex, or in general, user input, should never be treated as safe - regardless of the programming language. If your program fails to do so, it is vulnerable to attacks by deliberately crafted inputs.

In the case of regex, the risk is ReDoS: Regular expression Denial of Service. Basically, it is a regex that consumes an excessive amount of CPU and memory to process.

For example, if you try to evaluate this regex

^(([a-z])+.)+[A-Z]([a-z])+$

on this input:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!

you'll notice it may hang. This is called catastrophic backtracking. See it for yourself here: https://regex101.com/r/Qhn3Vb/1

Read more about Regex DoS: https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS
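
To get a feel for the growth without hanging anything, the same pattern can be timed on short inputs. This is a rough sketch in Python (the function name time_failed_match and the chosen lengths are assumptions for illustration); the exact timings depend on the machine and the regex engine, so none are quoted here.

```python
import re
import time

# Pattern and input shape copied from the example above.
EVIL = re.compile(r"^(([a-z])+.)+[A-Z]([a-z])+$")


def time_failed_match(n):
    """Match n letters 'a' followed by '!' and return (matched, seconds)."""
    text = "a" * n + "!"
    start = time.perf_counter()
    matched = EVIL.match(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Lengths kept small on purpose; larger values can take very long.
    for n in (10, 15, 20, 25):
        matched, seconds = time_failed_match(n)
        print(f"n={n:2d} matched={matched} seconds={seconds:.4f}")
```

The match always fails because the input contains no uppercase letter, and a backtracking engine tries many ways of splitting the run of letters before giving up, which is why the time tends to climb steeply as n grows.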


Bottom line: never assume user input is safe!

梦里寻她 2024-10-11 20:15:44

Regular expressions are a programming language. I don't think they're quite Turing-complete, but they're close enough that allowing your users to enter them into your web site IS allowing other people to run code on your server. QED, yes, it's a security hole.

You might be able to get away with allowing a subset of whatever regexp language you want to use, whitelist a particular set of constructs to make it a not-big-enough-to-sweat-over hole... other people have already mentioned the possible dooms of nesting and * . How much you're willing to let people load down your server is up to you. Personally, I'd be comfortable with letting 'em have one SQL "CONTAINS" statement and maybe a "BETWEEN()". :)
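
As a rough illustration of the whitelisting idea, here is a minimal sketch in Python. The policy itself is an assumption made up for this example (literal letters, digits, spaces, '.', and the quantifiers '*' and '?', with small limits on length and on the number of '*'), as are the names is_allowed_pattern and compile_user_pattern. It is not a guarantee of safety, only a way to shrink what users can express.

```python
import re

# Hypothetical policy: only these characters are allowed, so there are no
# groups, alternation, character classes, anchors, or backreferences.
_ALLOWED = re.compile(r"[A-Za-z0-9 .*?]*")
MAX_PATTERN_LENGTH = 64  # assumed limit
MAX_STARS = 2            # assumed limit on '*' per pattern


def is_allowed_pattern(pattern):
    """Return True if the pattern stays inside the whitelisted subset."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        return False
    if _ALLOWED.fullmatch(pattern) is None:
        return False
    if pattern[:1] in ("*", "?"):
        return False  # quantifier with nothing to repeat
    if re.search(r"[*?]{2}", pattern):
        return False  # stacked quantifiers such as '**' or '*?'
    return pattern.count("*") <= MAX_STARS


def compile_user_pattern(pattern):
    """Compile a user pattern, refusing anything outside the subset."""
    if not is_allowed_pattern(pattern):
        raise ValueError("pattern uses constructs outside the allowed subset")
    return re.compile(pattern)
```

For example, compile_user_pattern("colou?r") is accepted, while "(a+)+$" is rejected because parentheses, '+', and '$' are not in the allowed set.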

爱,才寂寞 2024-10-11 20:15:44

I suspect Ruby would allow /#{system("rm -rf really_important_directory")}/ - is that the kind of thing you're worried about?

天赋异禀 2024-10-11 20:15:44

AFAIK, you can do it safely in C#: you can supply the regex string to the Regex constructor, and if it fails to parse it'll throw. I'm not sure about others.
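
The same compile-and-catch approach can be sketched in Python rather than C#. This is an analogue under assumptions, not the .NET API: the helper name try_compile is made up for illustration, and it relies on Python's re.compile raising re.error for malformed patterns.

```python
import re


def try_compile(pattern):
    """Return a compiled pattern, or None if `pattern` cannot be compiled."""
    try:
        return re.compile(pattern)
    except (re.error, OverflowError):
        # re.error: malformed syntax; OverflowError: oversized repeat count.
        return None
```

Note that this only checks syntax. A pattern that compiles cleanly can still be very slow to match.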
