How can I safely use regular expressions from user input?

Posted on 2024-08-20 01:21:04

My (Perl-based) application needs to let users input regular expressions, to match various strings behind the scenes. My plan so far has been to take the string and wrap it in something like

my $regex = eval { qr/$text/ };
if (my $error = $@) {
    # mangle $error to extract user-facing message
}

($text having been stripped of newlines ahead of time, since it's actually multiple regular expressions in a multi-line text-field that I split).

Are there any potential security risks with doing this, such as some weird input that could lead to arbitrary code execution? (Besides buffer overflow vulnerabilities in the regular expression engine itself, like CVE-2007-5116.) If so, are there ways to mitigate them?

Is there a better way to do this? Any Perl modules which help abstract the operations of turning user input into regular expressions (such as extracting error messages ... or providing modifiers like /i, which I don't strictly need here, but would be nice)? I searched CPAN and didn't find much that was promising, but entertain the possibility that I missed something.

留蓝 2024-08-27 01:21:04

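Using untrusted input as a regular expression creates a denial-of-service vulnerability, as described in perlsec:

Regular Expressions - Perl's regular expression engine is a so-called NFA (Non-deterministic Finite Automaton), which among other things
means that it can rather easily consume large amounts of both time and space if the regular expression may match in several ways.
Careful crafting of the regular expressions can help, but quite often there really isn't much one can do (the book "Mastering Regular
Expressions" is required reading, see perlfaq2). Running out of space manifests itself by Perl running out of memory.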

平安喜乐 2024-08-27 01:21:04

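With the (?{ code }) construct, user input could be used to execute arbitrary code, provided use re 'eval' is in effect; without that pragma, Perl refuses to compile a pattern that is interpolated at run time and contains a code block. See the example in perlre#code and where it says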

local $cnt = $cnt + 1,

replace it with the expression

system("rm -rf /home/fennec"); print "Ha ha.\n";

(Actually, don't do that.)
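
Whether such a code block runs depends on the re 'eval' pragma. The sketch below is one way to compile untrusted pattern text with that pragma switched off explicitly, and to trim the compile error into something that can be shown to a user. The helper name compile_user_regex and its optional case-insensitivity argument are made up for illustration, and the error trimming assumes the usual " at FILE line N." suffix on Perl's messages.

```perl
use strict;
use warnings;

# Hypothetical helper: compile untrusted pattern text.
# Returns ($regex, undef) on success or (undef, $message) on failure.
sub compile_user_regex {
    my ($text, $ignore_case) = @_;

    my $regex = eval {
        no re 'eval';    # the default, stated explicitly: no code blocks at run time
        $ignore_case ? qr/$text/i : qr/$text/;
    };
    return ($regex, undef) if defined $regex;

    my $error = $@;
    # Drop the trailing " at FILE line N." (the greedy prefix keeps any earlier " at ").
    $error =~ s/\A(.*) at .+ line \d+\.?\s*\z/$1/s;
    return (undef, $error);
}

# A pattern carrying a code block is rejected instead of being executed.
my ($re, $err) = compile_user_regex('(?{ print "pwned\n" })');
print defined $re ? "compiled\n" : "rejected: $err\n";
```

The same helper covers the error-message extraction asked about in the question: an unbalanced pattern such as "(" comes back as an error string without the file and line suffix.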

温折酒 2024-08-27 01:21:04

There's some discussion of this over at the Monastery (PerlMonks).

TLDR: use re::engine::RE2 -strict => 1;

Make sure to add -strict => 1 to your use statement, or re::engine::RE2 will fall back to Perl's own re.

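Here is a quote from Paul Wankadia (junyer), the owner of the project on GitHub:

An explicit goal of RE2's design and implementation is being able to handle regular expressions from untrusted users without risk. One of its primary guarantees is that the match time is linear in the length of the input string. It was also written with production concerns in mind: the parser, the compiler and the execution engines limit their memory usage by working within a configurable budget (failing gracefully when exhausted), and they avoid stack overflow by eschewing recursion.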

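To summarize the important points:

It's secure from arbitrary code execution by default, but add "no re 'eval';" to prevent PERL5OPT or anything else(?) from setting it on you. I'm not sure whether doing so blocks everything, though.
Use a sub-process (fork) with BSD::Resource (even on Linux) to limit memory, and kill the child after some timeout.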
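
The sub-process idea in the second point could look roughly like the sketch below. It assumes a Unix-like system with a real fork; the helper name match_in_child and its exit-code convention are invented for illustration. It handles only the timeout and marks in a comment where a memory limit from BSD::Resource would go.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: run one match in a child process with a time limit.
# Returns 1 (match), 0 (no match), or undef (invalid pattern, timeout,
# or any other abnormal child exit).
sub match_in_child {
    my ($pattern, $subject, $timeout) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: leave SIGALRM at its default action so the kernel terminates
        # the process outright. A Perl-level handler is deferred until the
        # current operation finishes, so it could not interrupt a long match.
        $SIG{ALRM} = 'DEFAULT';
        alarm $timeout;

        # A memory cap would go here, e.g. setrlimit from BSD::Resource.

        my $re = eval { no re 'eval'; qr/$pattern/ };
        POSIX::_exit(2) unless defined $re;          # pattern did not compile
        POSIX::_exit($subject =~ $re ? 0 : 1);
    }

    waitpid($pid, 0);
    return undef if $? & 127;                        # child was killed by a signal
    my $status = $? >> 8;
    return $status == 0 ? 1 : $status == 1 ? 0 : undef;
}

my $result = match_in_child('^a+$', 'aaaa', 2);
print defined $result ? "result: $result\n" : "gave up\n";
```

Forking once per match is expensive, so a real application would more likely hand a whole batch of patterns and strings to a single child; the one-match version just keeps the sketch short.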

埋情葬爱 2024-08-27 01:21:04

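The best approach is not to give users too much power. Provide an interface that is just enough for users to do what they want (like an ATM, which only has buttons for the various options and needs no keyboard input). Of course, if you need the user to type input, provide a text box and then handle the request on the back end with Perl (sanitizing it, for example). The motivation for letting users enter a regex is to search for a string pattern, right? In that case, the simplest and safest way is to tell them to enter only the string. Then, on the back end, you use Perl's regex to search for it. Is there any other compelling reason to let users enter the regex themselves?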
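
If plain strings are enough, the back-end search described above might be sketched as follows, using \Q...\E (quotemeta) so that every character the user typed is matched literally. The helper name literal_match is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: treat user input as a literal string, not as a pattern.
sub literal_match {
    my ($needle, $haystack) = @_;

    # An empty pattern has a special meaning in Perl's match operator
    # (it reuses the last successful pattern), so handle it up front.
    return 1 if $needle eq '';

    # \Q...\E escapes regex metacharacters in the interpolated text.
    return $haystack =~ /\Q$needle\E/ ? 1 : 0;
}

print literal_match('a.c', 'abc')  ? "match\n" : "no match\n";
print literal_match('a.c', 'xa.c') ? "match\n" : "no match\n";
```

Here the dot in "a.c" matches only a literal dot, so "abc" is not a match while "xa.c" is.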
