正则表达式中大十进制数发生灾难性回溯的原因是什么?

发布于 2025-01-19 10:46:25 字数 814 浏览 0 评论 0原文

我是正则表达式的新手,我一直在使用Python的RE模块进行一个项目,并且需要使用嵌套的数学表达式。

到目前为止,这一直相对较好,但是我注意到使用Re.Search()工作时,带有9个或更多小数点(作为字符串)的浮动数字会引起灾难性的回溯。

例如,这一行不会引起任何问题。

match = re.search(r"\(((?:[^()]+|R)+)\)", "(1.00000000+cos(1.0-3.0))")

但是,这一行确实会引起问题。

match = re.search(r"\(((?:[^()]+|R)+)\)", "(1.000000000+cos(1.0-3.0))")

对于任何想知道为什么我不删除第一个括号以删除问题的人,请注意这是一组孤立的行:在程序中,您可能会遇到这样的事情:

expression = "cos(tan(1)+sin(2)+tan(3)+sin(4)+tan(5))"
# during parsing, this becomes "cos(1.5574077246549023+0.9092974268256817+tan(3)+sin(4)+tan(5))"
# re.search(r"\(((?:[^()]+|R)+)\)", expression) would then cause a catastrophic backtrack

I'm new to regular expressions and I've been working on a project using Python's re module and have needed to work with nested mathematical expressions.

This has been going relatively fine up until now, but I've noticed that floating numbers with nine or more decimal places (as strings) cause catastrophic backtracking when working with re.search().

For example, this line causes no issue.

match = re.search(r"\(((?:[^()]+|R)+)\)", "(1.00000000+cos(1.0-3.0))")

However, this line does cause an issue.

match = re.search(r"\(((?:[^()]+|R)+)\)", "(1.000000000+cos(1.0-3.0))")

For anyone wondering why I don't remove the first parentheses to remove the issue, note that this is an isolated set of lines: in the program, you may come across something like this:

expression = "cos(tan(1)+sin(2)+tan(3)+sin(4)+tan(5))"
# during parsing, this becomes "cos(1.5574077246549023+0.9092974268256817+tan(3)+sin(4)+tan(5))"
# re.search(r"\(((?:[^()]+|R)+)\)", expression) would then cause a catastrophic backtrack

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

总攻大人 2025-01-26 10:46:25

python re不支持递归。但是,在您的模式下,您正在使用r char中的嵌套量词。

如果模式无法匹配,它将在导致灾难性回溯之前尝试所有排列。

如果您可以使用 python pyphon pypi模块您可以使用(( ?R)

\(((?:[^()]+|(?R))+)\)

See a regex demo with pcre selected that does support recursion.

Python re does not support recursion. But in your pattern you are using nested quantifiers with an R char in the alternation.

If the pattern can not match, it will try all permutations before failing causing catastrophic backtracking.

If you can make use of the Python pypi module you could recurse the whole pattern using (?R)

\(((?:[^()]+|(?R))+)\)

See a regex demo with pcre selected that does support recursion.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文