Scala parser combinators: how to parse "if(x)" if x can contain a ")"

Posted on 2024-09-02 14:29:14

I'm trying to get this to work:

def emptyCond: Parser[Cond] = ("if" ~ "(") ~> regularStr <~ ")" ^^ { case s => Cond("",Nil,Nil) }

where regularStr is defined to accept a number of things, including ")". Of course, I want this to be an acceptable input: if(foo()). But for any if(x) it is taking the ")" as part of the regularStr and so this parser never succeeds.

What am I missing?

Edit:

regularStr is not a regular expression. It is defined thus:

  def regularStr = rep(ident | numericLit | decimalLit | stringLit | stmtSymbol) ^^ { case s => s.mkString(" ") }

and the symbols are:

  val stmtSymbol = "*" | "&" | "." | "::" | "(" | ")" | "*" | ">=" | "<=" | "=" | 
               "<" | ">" | "|" | "-" | "," | "^" | "[" | "]" | "?" | ":" | "+" |
               "-=" | "+=" | "*=" | "/=" | "&&" | "||" | "&=" | "|="

I don't need an exhaustive language check, just the control structures. So I don't really care what's inside the "()" in if(); I want to accept any sequence of identifiers, symbols, etc. For my purposes, even if())) should be valid, where "))" is the if's "condition".

Comments (2)

故事和酒 2024-09-09 14:29:14

A regular expression cannot recognize a language that has nested, balanced constructs such as (...), [...], {...}, etc. So you're going to need to use further context-free productions (not regular expressions) to match the regularStr portions.
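
For readers who want to reproduce the failure, here is a minimal sketch. It assumes the scala-parser-combinators module is on the classpath and uses RegexParsers with hand-written token regexes in place of the question's token parsers (ident, numericLit, and so on), so the names GreedyDemo, ident, number and symbol are illustrative only. It shows that rep is greedy and that the sequencing combinators do not backtrack into it, which is why the trailing ")" is never left over for `<~ ")"`.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical miniature of the question's grammar; the token regexes are assumptions.
object GreedyDemo extends RegexParsers {
  def ident: Parser[String]  = """[A-Za-z_]\w*""".r
  def number: Parser[String] = """\d+""".r
  // A tiny symbol set that, like the question's stmtSymbol, includes "(" and ")".
  def symbol: Parser[String] = "(" | ")" | "." | ","

  // rep keeps consuming while a token matches, so it also swallows the final ")".
  def regularStr: Parser[String] = rep(ident | number | symbol) ^^ (_.mkString(" "))

  // Same shape as the question's emptyCond: the closing ")" is already gone by now.
  def emptyCond: Parser[String] = ("if" ~ "(") ~> regularStr <~ ")"

  def main(args: Array[String]): Unit = {
    println(parseAll(emptyCond, "if(x)").successful)     // false
    println(parseAll(emptyCond, "if(foo())").successful) // false
  }
}
```

Both inputs are rejected for the same reason: once rep has consumed every token up to the end of the input, nothing is left for the closing ")" and the parser does not retry rep with fewer repetitions.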

烟沫凡尘 2024-09-09 14:29:14

OK, accepting if())) was not really a requirement, just an example of what I would be willing to accept in order to make my parsing as cheap as possible, to just worry about capturing control structures.

However, it appears I can't be that cheap and still have it work. So, since the if() construct has parentheses, all I have to do is expect what's inside to have well-balanced parentheses. A closing ")" where one isn't expected cannot be part of the condition.

I did this:

  val regularNoParens = ident | numericLit | decimalLit | stringLit | stmtSymbol 
  def regularParens: Parser[String] = "(" ~ rep(regularNoParens | regularParens) ~ ")" ^^ { case l ~ s ~ r => l + s.mkString(" ") + r } 
  def regularStr = rep(regularNoParens | regularParens) ^^ { case s => s.mkString(" ") }

And I took out "(" and ")" from stmtSymbol. Works!

Edit: it didn't support nesting, fixed it.
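
A self-contained sketch of the same idea, for anyone who wants to run it. It is not the original setup: it assumes the scala-parser-combinators module and uses RegexParsers with made-up token regexes and a reduced symbol list, and the names BalancedDemo, number, symbol and cond are hypothetical. The three regular* definitions follow the ones above.

```scala
import scala.util.parsing.combinator.RegexParsers

// Sketch only: token regexes and the symbol subset are assumptions, not the original lexer.
object BalancedDemo extends RegexParsers {
  def ident: Parser[String]  = """[A-Za-z_]\w*""".r
  def number: Parser[String] = """\d+(\.\d+)?""".r
  // "(" and ")" are deliberately absent; longer symbols come first because
  // string literals in RegexParsers are tried in order.
  def symbol: Parser[String] =
    "&&" | "||" | ">=" | "<=" | "." | "," | "<" | ">" | "=" | "+" | "-" | "*"

  def regularNoParens: Parser[String] = ident | number | symbol

  // A parenthesized group holds plain tokens or further groups, so nesting works.
  def regularParens: Parser[String] =
    "(" ~ rep(regularNoParens | regularParens) ~ ")" ^^ {
      case l ~ s ~ r => l + s.mkString(" ") + r
    }

  def regularStr: Parser[String] =
    rep(regularNoParens | regularParens) ^^ (_.mkString(" "))

  // An unmatched ")" can no longer be consumed by regularStr, so it closes the if.
  def cond: Parser[String] = ("if" ~ "(") ~> regularStr <~ ")"

  def main(args: Array[String]): Unit = {
    println(parseAll(cond, "if(foo())").get)       // foo ()
    println(parseAll(cond, "if((a+b)*c)").get)     // (a + b) * c
    println(parseAll(cond, "if())").successful)    // false
  }
}
```

The trade-off is the one described above: if())) is no longer accepted, because a ")" without a matching "(" always ends the condition.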
