解析 scala 中的递归结构

发布于 2024-08-12 17:22:26 字数 599 浏览 7 评论 0原文

我正在尝试在 scala 中构建一个解析器,它可以解析简单的类似 SQL 的字符串。我已经掌握了基础知识,可以解析类似的内容:

select id from users where name = "peter" and age = 30 order by lastname

但现在我想知道如何解析嵌套和类,即

select name from users where name = "peter" and (age = 29 or age = 30)

我的“combinedPredicate”的当前产生方式如下:

def combinedPredicate = predicate ~ ("and"|"or") ~ predicate ^^ {
  case l ~ "and" ~ r => And(l,r)
  case l ~ "or" ~ r => Or(l,r)
}

我尝试递归地引用自身内的combinedPredicate产生方式,但这会导致堆栈溢出。

顺便说一句,我只是在这里进行实验......没有实现整个 ansi-99 规范;)

I'm trying to contruct a parser in scala which can parse simple SQL-like strings. I've got the basics working and can parse something like:

select id from users where name = "peter" and age = 30 order by lastname

But now I wondered how to parse nested and classes, i.e.

select name from users where name = "peter" and (age = 29 or age = 30)

The current production of my 'combinedPredicate' looks like this:

def combinedPredicate = predicate ~ ("and"|"or") ~ predicate ^^ {
  case l ~ "and" ~ r => And(l,r)
  case l ~ "or" ~ r => Or(l,r)
}

I tried recursively referencing the combinedPredicate production within itself but that results in a stackoverflow.

btw, I'm just experimenting here... not implementing the entire ansi-99 spec ;)

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

野鹿林 2024-08-19 17:22:26

好吧,必须以某种方式界定递归。在这种情况下,您可以这样做:

def combinedPredicate = predicate ~ rep( ("and" | "or" ) ~ predicate )
def predicate = "(" ~ combinedPredicate ~ ")" | simplePredicate
def simplePredicate = ...

因此它永远不会堆栈溢出,因为要递归,它首先必须接受一个字符。这是重要的部分 - 如果您始终确保在没有首先接受字符的情况下不会发生递归,那么您将永远不会陷入无限递归。当然,除非你有无限的输入。 :-)

Well, recursion has to be delimited somehow. In this case, you could do this:

def combinedPredicate = predicate ~ rep( ("and" | "or" ) ~ predicate )
def predicate = "(" ~ combinedPredicate ~ ")" | simplePredicate
def simplePredicate = ...

So it will never stack overflow because, to recurse, it first has to accept a character. This is the important part -- if you always ensure recursion won't happen without first accepting a character, you'll never get into an infinite recursion. Unless, of course, you have infinite input. :-)

预谋 2024-08-19 17:22:26

您遇到的堆栈溢出可能是左递归语言的结果:

def combinedPredicate = predicate ~ ...
def predicate = combinedPrediacate | ...

Scala 2.7 中的解析器组合器是递归下降解析器。递归下降解析器在这些方面存在问题,因为它们不知道第一次遇到终止符时它是什么样的。其他类型的解析器(例如 Scala 2.8 的 Packrat 解析器组合器)对此没有问题,尽管您需要使用 Lazy val 而不是方法来定义语法,如下所示:

lazy val combinedPredicate = predicate ~ ...
lazy val predicate = combinedPrediacate | ...

或者,您可以重构语言以避免左递归。从你给我的例子来看,在这种语言中需要括号可以有效地解决问题。

def combinedPredicate = predicate ~ ...
def predicate = "(" ~> combinedPrediacate <~ ")" | ...

现在,每个更深层次的递归都对应于另一个解析的括号。您知道当您用完括号时不必更深层次地递归。

The stack overflow you're experiencing is probably the result of a left-recursive language:

def combinedPredicate = predicate ~ ...
def predicate = combinedPrediacate | ...

The parser combinators in Scala 2.7 are recursive descent parsers. Recursive descent parsers have problems with these because they have no idea how the terminal symbol is when they first encounter it. Other kinds of parsers like Scala 2.8's packrat parser combinators will have no problem with this, though you'll need to define the grammar using lazy vals instead of methods, like so:

lazy val combinedPredicate = predicate ~ ...
lazy val predicate = combinedPrediacate | ...

Alternatively, you could refactor the language to avoid left recursion. From the example you're giving me, requiring parentheses in this language could solve the problem effectively.

def combinedPredicate = predicate ~ ...
def predicate = "(" ~> combinedPrediacate <~ ")" | ...

Now each deeper level of recursion corresponds to another parentheses parsed. You know you don't have to recurse deeper when you run out of parentheses.

月下凄凉 2024-08-19 17:22:26

在阅读了有关运算符优先级的解决方案并提出以下内容后:

  def clause:Parser[Clause] = (predicate|parens) * (
            "and" ^^^ { (a:Clause, b:Clause) => And(a,b) } |
            "or" ^^^ { (a:Clause, b:Clause) => Or(a,b) } )

  def parens:Parser[Clause] = "(" ~> clause  <~ ")"

Wich 可能只是@Daniel 所写内容的另一种方式;)

After reading about solutions for operator precedence and came up with the following:

  def clause:Parser[Clause] = (predicate|parens) * (
            "and" ^^^ { (a:Clause, b:Clause) => And(a,b) } |
            "or" ^^^ { (a:Clause, b:Clause) => Or(a,b) } )

  def parens:Parser[Clause] = "(" ~> clause  <~ ")"

Wich is probably just another way writing what @Daniel wrote ;)

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文