Worst-case analysis of regular expressions
Are there any tools that will take a particular regular expression and return the worst-case number of operations required to match it against a string of a given length?
So for example, given (f|a)oo.*[ ]baz, how many steps might the engine possibly go through to match 100 characters?
I would also be interested if there is a tool that can take a bunch of text samples and show the average operations for each run.
I realize this will depend a lot on the engine used and the implementation -- but I am ignorant as to how common this is. So if it is common for many languages (making my question too vague) I would be particularly interested in Perl and Python.
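Neither Perl nor Python exposes a per-match operation counter through a public API, but wall-clock time over inputs of growing length is a rough proxy for the work the engine does. A minimal sketch in Python, using the example pattern from the question (the subject-string construction is an assumption chosen so the pattern matches):

```python
import re
import timeit

PATTERN = r'(f|a)oo.*[ ]baz'

def time_match(n, repeats=100):
    """Average seconds to match PATTERN against an n-character string."""
    rx = re.compile(PATTERN)
    # n characters total: 'foo' + filler + ' baz' (assumed matching input)
    subject = 'foo' + 'x' * (n - 7) + ' baz'
    return timeit.timeit(lambda: rx.search(subject), number=repeats) / repeats

for n in (100, 1000, 10000):
    print(f'{n:>6} chars: {time_match(n):.2e} s per match')
```

This only measures time, not engine steps, so results vary by machine and Python version, but it answers the "average operations per run over text samples" part of the question in spirit.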
3 Answers
Regexbuddy's debugger shows how many steps the engine takes to conclude a match or a non-match on a given sample. See also the material on catastrophic backtracking and debugging regular expressions.
PS: It is not free, but they offer a 3-month money-back guarantee.
Note that it depends on the engine. While regex theory is based on straight automata theory, most engines are not strict translations of that theory. For this reason, some engines run in exponential time on inputs where strict NFA processing would not.
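A classic illustration in Python: a nested quantifier such as (a+)+ forces the backtracking engine into exponentially many attempts on a non-matching input, even though an equivalent strict NFA/DFA construction would reject it in linear time. A small sketch (the input sizes are arbitrary, chosen to keep the demo quick):

```python
import re
import time

# '(a+)+$' can never match 'aaa...ab', but Python's backtracking engine
# tries roughly 2**(n-1) ways to split the run of a's before giving up.
pattern = re.compile(r'(a+)+$')

def time_mismatch(n):
    subject = 'a' * n + 'b'
    start = time.perf_counter()
    assert pattern.match(subject) is None
    return time.perf_counter() - start

t_small, t_large = time_mismatch(14), time_mismatch(22)
# Adding 8 characters makes the failure dramatically slower.
print(f'n=14: {t_small:.5f}s  n=22: {t_large:.5f}s')
```

Engines built on Thompson NFA simulation (e.g. RE2, or Perl/Python alternatives built on that idea) reject the same input in time linear in its length.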
You might get something like what you're looking for by using re.compile with re.DEBUG. See this excellent answer from the Python Hidden Features community wiki for an extensive explanation.