In which languages is using a user-supplied regular expression a security hole?
Edit: tchrist has informed me that my original accusations about Perl's insecurity are unfounded. However, the question still stands.
I know that in Perl, you can embed arbitrary code in a regular expression, so obviously accepting a user-supplied regex and matching it allows arbitrary code execution and is a clear security hole. But is this true for all languages that use regular expressions? Is it true for all languages that use "Perl-compatible" regular expressions? In which languages are user-supplied regexes safe to use, and in which languages do they allow arbitrary code execution or other security holes?
Comments (8)
In most languages, allowing users to supply regular expressions means that you allow for a denial-of-service attack.
Some types of regular expressions are extremely CPU-intensive to execute. So in general it's a bad idea to allow users to enter regular expressions that will be executed on a remote system.
For more info, read this page: http://www.regular-expressions.info/catastrophic.html
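If you do have to run a user-supplied pattern, one common mitigation is to give the match a time budget. Here is a rough Python sketch of that idea, not a hardened implementation: it runs the search in a child interpreter and gives up when the budget runs out. The helper name `search_with_timeout` and the one-second default are my own assumptions, and starting a process per match is slow, so treat it as an illustration only.

```python
import json
import subprocess
import sys

# Child script: reads {"pattern": ..., "text": ...} as JSON on stdin and
# prints 1 if the pattern is found anywhere in the text, else 0.
_CHILD_SCRIPT = (
    "import json, re, sys\n"
    "job = json.load(sys.stdin)\n"
    "print(1 if re.search(job['pattern'], job['text']) else 0)\n"
)


def search_with_timeout(pattern, text, timeout=1.0):
    """Hypothetical helper: True/False for a match, or None if the search
    exceeded the time budget or the pattern could not be compiled."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD_SCRIPT],
            input=json.dumps({"pattern": pattern, "text": text}),
            capture_output=True,
            text=True,
            timeout=timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return None
    if done.returncode != 0:
        return None  # e.g. the child raised re.error on an invalid pattern
    return done.stdout.strip() == "1"
```

A caller would treat `None` as "rejected" and report an error to the user instead of retrying.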
This is not true: you cannot execute code callbacks in Perl by sneaking them into an evaluated regex. This is forbidden. You have to specifically override that with a lexically scoped
use re 'eval';
if you expect to have both interpolation and code escapes happening in the same pattern.
It's generally dynamic languages with an
eval
facility that tend to have the ability to execute code from regular expressions. In static languages (i.e. those requiring a separate compilation step) there is generally no way to execute code that wasn't compiled, so evaluating code from within a regex is impossible. Without a way to embed code in a regex, the worst a user can do is write a regex that takes a long time to evaluate.
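1) Vulnerabilities are found in regex libraries, such as a buffer overflow that affects WebKit and allows any attacker to gain remote code execution by accessing the regex library from JavaScript.
2) User-supplied regexes can cause a DoS condition in C#.
3) User-supplied regexes can be a vulnerability in PHP because of modifiers. Adding the /e modifier makes preg_replace() evaluate the replacement string as PHP code. In this case system() will be eval()'ed. (The /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0.)
// $subject stands for any input string; preg_replace() needs it as a third argument
preg_replace("/.*/e", "system('echo /etc/passwd')", $subject);
Or in the form of a vulnerability:
preg_replace($_GET['regex'], $_GET['check'], $subject);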
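For contrast, here is a small Python sketch of the same shape of call. As far as I know, Python's `re.sub` treats a string replacement as a template for backreferences only and never evaluates it as code, so the payload below ends up in the output as plain text. The wrapper name `replace_all` and the sample payload are mine, purely for illustration.

```python
import re


def replace_all(pattern, replacement, subject):
    """Hypothetical wrapper: substitute every match of pattern in subject.

    The replacement string is only scanned for backreferences such as \\1;
    it is not executed.
    """
    return re.sub(pattern, replacement, subject)


# A string that would be dangerous if it were ever passed to eval().
payload = "__import__('os').system('echo pwned')"
result = replace_all(r"\d+", payload, "order 42")
print(result)  # the payload appears verbatim; nothing is run
```

This says nothing about cost: a hostile pattern can still take a very long time to match.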
User-supplied regex, or in general, user input, should never be treated as safe - regardless of the programming language. If your program fails to do so, it is vulnerable to attacks by deliberately crafted inputs.
In the case of regex, the attack can be ReDoS: Regex Denial of Service. Basically, a regex that consumes an excessive amount of CPU and memory to process. For example, evaluating a pathological regex against a crafted input may hang; this is called catastrophic backtracking. See it for yourself here: https://regex101.com/r/Qhn3Vb/1
Read more about Regex DoS: https://www.owasp.org/index.php/Regular_expression_Denial_of_Service_-_ReDoS
Bottom line: never assume user input is safe!
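To get a feel for catastrophic backtracking locally, here is a minimal Python sketch. It uses the well-known pattern `(a+)+$` against a run of `a` characters followed by `!`. That pattern is a stand-in I picked, not necessarily the one behind the regex101 link, and `time_failed_match` is a made-up helper name. With a backtracking engine such as Python's `re`, the time to report "no match" should grow very quickly as the input gets longer.

```python
import re
import time


def time_failed_match(n):
    """Seconds taken to find that (a+)+$ does not match 'a' * n + '!'."""
    pattern = re.compile(r"(a+)+$")
    subject = "a" * n + "!"
    start = time.perf_counter()
    result = pattern.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing '!' makes a match impossible
    return elapsed


if __name__ == "__main__":
    # Keep n small: each extra character makes the failed match much slower.
    for n in (8, 12, 16, 20):
        print(n, round(time_failed_match(n), 4))
```

Exact timings depend on the machine and the Python version, so none are quoted here.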
Regular expressions are a programming language. I don't think they're quite Turing-complete, but they're close enough that allowing your users to enter them into your web site IS allowing other people to run code on your server. QED, yes, it's a security hole.
You might be able to get away with allowing a subset of whatever regexp language you want to use, whitelisting a particular set of constructs to make it a not-big-enough-to-sweat-over hole... other people have already mentioned the possible dooms of nesting and *. How much you're willing to let people load down your server is up to you. Personally, I'd be comfortable with letting 'em have one SQL "CONTAINS" statement and maybe a "BETWEEN()". :)
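One way to read the "subset" idea is to stop accepting regex syntax at all and accept only a tiny wildcard language that you translate yourself. The Python sketch below does that under my own assumptions: `wildcard_to_regex` and both limits are hypothetical. Only `*` and `?` are special, and everything else is escaped, so nested quantifiers can never reach the engine. This bounds the cost of a match; it does not make it zero.

```python
import re

MAX_PATTERN_LENGTH = 64  # hypothetical limit, tune for your application
MAX_WILDCARDS = 3        # hypothetical limit on '*' after collapsing runs


def wildcard_to_regex(user_pattern):
    """Translate a restricted wildcard pattern into a compiled regex.

    Only '*' (any run of characters) and '?' (any single character) are
    special; every other character is escaped and matched literally.
    """
    if len(user_pattern) > MAX_PATTERN_LENGTH:
        raise ValueError("pattern too long")
    # Collapse runs of '*' so that "a***b" costs no more than "a*b".
    collapsed = re.sub(r"\*+", "*", user_pattern)
    if collapsed.count("*") > MAX_WILDCARDS:
        raise ValueError("too many wildcards")
    parts = []
    for ch in collapsed:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)
```

For instance, `wildcard_to_regex("(a+)+$")` only matches the literal text `(a+)+$`, because the parentheses, plus signs and dollar sign are all escaped.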
I suspect Ruby would allow
/#{system("rm -rf really_important_directory")}/
- is that the kind of thing you're worried about?
AFAIK, you can do it safely in C#: you can supply the regex string to the Regex constructor, and if it fails to parse it'll throw. I'm not sure about others.
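The same "compile it and see whether it throws" check can be sketched in Python, where `re.compile` raises `re.error` on a malformed pattern. The helper name `compile_user_regex` and the length cap are my own assumptions. Note that this only rejects patterns that fail to parse; a pattern that parses fine can still be very slow to match.

```python
import re

MAX_SOURCE_LENGTH = 200  # hypothetical cap on the pattern text


def compile_user_regex(source):
    """Hypothetical helper: return a compiled pattern, or None if the
    source is too long or is not a valid regular expression."""
    if len(source) > MAX_SOURCE_LENGTH:
        return None
    try:
        return re.compile(source)
    except re.error:
        return None
```

A caller would check for `None` and show the user a validation error rather than letting the exception escape.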