BNF 到正则表达式
如何
A → AA | ( A ) | ε
使用正则表达式描述生成的语言?
How can I describe the language
A → AA | ( A ) | ε
generates using regular expressions?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
正则表达式接受来自正则语言的字符串。 FSM 也可以接受常规语言。
您的语言中可能有无限多个括号需要匹配。 这意味着您需要一个无限状态,这在任何有限状态机中显然是不可能的。 因此,您的语言不是常规的,无法与正则表达式匹配。
Regular expressions accept strings from regular languages. Regular languages can also be accepted by an FSM.
There's an potentially infinite number of brackets in your language that you have to match up. That means you need an infinite state, obviously impossible in any Finite State Machine. Therefore, your language isn't regular and can't be matched with a regex.
正则表达式不能匹配嵌套括号。
Regular expressions cannot match nesting brackets.
我不确定如何标记该语言,但该语言不是常规语言,可以使用常规语言的泵引理来显示(因此,不能通过正则表达式来标记)。 一个直观的解释是,接受该语言中的单词将要求 FDA “记住”每次开始读取右括号时刚刚读取的左括号的数量,而这对他们来说是不可能的,因为他们没有“记忆” '。 另一方面,下推自动机...
该语言是否可以记为 {(n)n}*,对于任何n?
I'm not sure about how you could notate that language, but that language isn't regular, as can be shown using the pumping lemma for regular languages (and thus, it can't be noted by a regex). An intuitive explanation is that accepting words from that language would require the FDA to 'remember' the number of opening parenthesis that it just read every time it begins to read closing parenthesis, and that isn't possible for them as they have no 'memory'. A push-down automaton, on the other hand...
Could that language be noted as {(n)n}*, for any n?
你不能 - 正则表达式只能识别可能语言的一小部分。 特别是,非正式地,任何需要无限量内存才能识别的语言都不可被 RE 识别。
在这里,您需要无限量的内存来记住您看到过多少个左括号,以确保右括号的数量相同。
您需要某种能够解析上下文无关语法的机制,以便能够识别一般由 BNF 描述的语言。 现代解析器非常擅长这一点!
You can't - regular expressions can only recognise a small subset of possible languages. In particular, informally, any language that requires an unbounded amount of memory potentially to recognise is not RE recognisable.
Here, you'd need an unbounded amount of memory to remember how many opening parentheses you've seen in order to make sure the number of closing parentheses are the same.
You'll need some mechanism that is capable of parsing Context-Free Grammars to be able to recognise languages described by BNF in general. Modern parsers are very good at this!
正如其他人所说,您不能使用单个正则表达式来执行此操作,但是您可以使用两个“\(”和“\)”对其进行标记。 看到你的语言只能有括号,尽管我不确定这是否非常有用。
注意:您还需要一个传递器来确保括号正确配对。 所以“(()()”会标记但不会解析。
As others have said you cannot do this with a single regular expression, however you can tokenize it with two "\(" and "\)". Seeing that your language can only have brackets in it though I'm not sure this is going to be very useful.
Note: You will also need a passer to ensure the brackets are paired up correctly. So “(()()” will tokenize but will not parse.