What is the complexity of a regular expression?
What is the complexity, with respect to the length of the string, of matching a regular expression against that string?
Comments (4)
The answer depends on what exactly you mean by "regular expressions." Classic regexes can be compiled into deterministic finite automata (DFAs) that can match a string of length N in O(N) time. Certain extensions to the regex language change that for the worse.
You may find the following document of interest: Regular Expression Matching Can Be Simple And Fast.
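To make the O(N) claim concrete, here is a minimal sketch of DFA matching in Python. The pattern `(a|b)*abb`, the hand-written transition table, and the name `dfa_match` are all illustrative assumptions rather than anything from the answer above. A real engine would build the table from the pattern automatically.

```python
# Hand-built DFA for the textbook pattern (a|b)*abb.
# States are 0..3; state 3 means "the input read so far ends in abb".
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
ACCEPTING = {3}


def dfa_match(text):
    """Return True if the whole of text matches (a|b)*abb."""
    state = 0
    for ch in text:
        # Exactly one table lookup per input character, so the loop is O(N).
        state = TRANSITIONS.get((state, ch))
        if state is None:
            # Character outside the alphabet {a, b}: no match is possible.
            return False
    return state in ACCEPTING


print(dfa_match("aababb"))  # True
print(dfa_match("abba"))    # False
```

The work per character is constant and there is no backtracking, so the running time grows linearly with the input length no matter what the pattern looks like. The cost that this sketch hides is building the table, which can be large for some patterns.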
Unbounded: you can create a regular expression that never terminates, even on an empty input string.
If you use regular expressions in the theoretical computer science sense (concatenation, alternation, and Kleene star, with no backreferences) and the regexp is already compiled, then matching is O(n).
If you're looking for tight asymptotic bounds on regex matching in general (without respect to the expression itself), then there aren't any. As Alex points out, you can create a regex that is O(1) or a regex that is Omega(infinity). Treated as a purely mathematical algorithm, a regular expression engine is far too complicated for any sort of formal asymptotic analysis (aside from the fact that such an analysis would be basically worthless).
The growth rate of a particular expression (since that, really, is what constitutes an algorithm anyway) would be far more meaningful, though not necessarily any easier to analyze.
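As a rough illustration of analyzing one particular expression, the sketch below counts the work a naive backtracking matcher does for the pattern `(a|aa)*b`. Both the pattern and the hand-written `backtrack_steps` function are assumptions chosen for the example. It is a stand-in for a backtracking engine, not the implementation of any real one.

```python
def backtrack_steps(text):
    """Match text against (a|aa)*b by naive backtracking.

    Returns (matched, calls), where calls counts how many times the
    matcher tried to continue from some position in the input.
    """
    calls = 0

    def match_from(i):
        nonlocal calls
        calls += 1
        # Try another loop iteration using the alternative 'a' ...
        if text.startswith("a", i) and match_from(i + 1):
            return True
        # ... then back up and try the alternative 'aa'.
        if text.startswith("aa", i) and match_from(i + 2):
            return True
        # Otherwise leave the loop: the rest must be exactly 'b'.
        return text.startswith("b", i) and i + 1 == len(text)

    matched = match_from(0)
    return matched, calls


for n in (4, 8, 16, 20):
    # A run of 'a' with no final 'b' forces every split to be explored.
    print(n, backtrack_steps("a" * n))
```

On inputs made only of `a`, the call count obeys calls(n) = 1 + calls(n-1) + calls(n-2), so it grows like the Fibonacci numbers, which is exponential in n. The same language is recognized by a DFA in linear time, so the growth here comes from this matching strategy applied to this expression rather than from regular expressions as such.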