Scala Parsers - Message Length

Posted on 2024-11-15 03:22:00

I'm toying with Scala's Parser library. I am trying to write a parser for a format where a length is specified followed by a message of that length. For example:

x.parseAll(x.message, "5helloworld") // result: "hello", remaining: "world"

I'm not sure how to do this using combinators. My mind first goes to:

def message = length ~ body

But obviously body depends on length, and I don't know how to do that :p

Alternatively, you could define message as a single Parser (not a combination of Parsers), and I think that is doable (although I haven't checked whether a single Parser can consume several elems).

Anyways, I'm a scala noob, I just find this awesome :)

Comments (3)

荭秂 2024-11-22 03:22:00

You should use into for that, or its abbreviation, >>:

scala> object T extends RegexParsers {
     |   def length: Parser[String] = """\d+""".r
     |   def message: Parser[String] = length >> { length => """\w{%d}""".format(length.toInt).r }
     | }
defined module T

scala> T.parseAll(T.message, "5helloworld")
res0: T.ParseResult[String] =
[1.7] failure: string matching regex `\z' expected but `w' found

5helloworld
      ^

scala> T.parse(T.message, "5helloworld")
res1: T.ParseResult[String] = [1.7] parsed: hello

Be careful with precedence when using it. If you add an "~ remainder" after the function above, for instance, Scala will interpret it as length >> ({ length => ...} ~ remainder) instead of (length >> { length => ...}) ~ remainder.
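A minimal sketch of the parenthesized form, assuming the scala-parser-combinators library is on the classpath. The names PrecedenceSketch, body, remainder and messageThenRest are hypothetical and only serve the illustration; length here yields an Int rather than a String to keep the example short.

```scala
import scala.util.parsing.combinator.RegexParsers

object PrecedenceSketch extends RegexParsers {
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Hypothetical helper: exactly n word characters.
  def body(n: Int): Parser[String] = ("""\w{""" + n + "}").r

  // Hypothetical parser for whatever follows the message body.
  def remainder: Parser[String] = """\w*""".r

  // The explicit parentheses keep `~ remainder` outside the function passed to >>.
  def messageThenRest: Parser[String ~ String] =
    (length >> (n => body(n))) ~ remainder
}
```

With the parentheses in place, parseAll can consume the whole input, because remainder picks up everything after the length-delimited body.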

匿名。 2024-11-22 03:22:00

This does not sound like a context-free language, so you will need to use flatMap:

def message = length.flatMap(n => bodyWithLength(n))

where length is of type Parser[Int] and bodyWithLength(n) would be based on repN, such as

def bodyWithLength(n: Int): Parser[String] =
  repN(n, elem("any", _ => true)) ^^ { _.mkString }
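Put together, the flatMap approach might look like the sketch below. It assumes the scala-parser-combinators library; the object name LengthPrefixed and the choice to turn off whitespace skipping are assumptions made for this example, not part of the answer above.

```scala
import scala.util.parsing.combinator.RegexParsers

object LengthPrefixed extends RegexParsers {
  // Assumption: whitespace counts as part of the body, so do not skip it.
  override val skipWhitespace = false

  // The decimal length prefix, converted to an Int.
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Accept exactly n characters of any kind and join them into a String.
  def bodyWithLength(n: Int): Parser[String] =
    repN(n, elem("any", _ => true)) ^^ (_.mkString)

  // The body parser is built from the length that was just parsed.
  def message: Parser[String] = length.flatMap(n => bodyWithLength(n))
}
```

Calling LengthPrefixed.parse(LengthPrefixed.message, "5helloworld") should then succeed with "hello" and leave "world" unconsumed.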

我不会写诗 2024-11-22 03:22:00

I wouldn't use parser combinators for this purpose. But if you have to, or if the problem becomes more complex, you could try this:

// Matches the literal string `what` exactly x times in a row (x must be >= 1).
// Must live inside a class or object that extends RegexParsers.
def times(x: Long, what: String): Parser[Any] = x match {
  case 1 => what
  case _ => what ~ times(x - 1, what)
}

Don't use parseAll if you want some input to remain; use parse.
You could parse the length, store the result in a mutable field x (ugly, I know, but useful here), and parse the body x times. You then get the parsed String, and the rest remains in the parser.
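The point about parse versus parseAll can be illustrated with a small sketch that returns both the message and the unconsumed input. It assumes the scala-parser-combinators library and uses >> instead of a mutable field; RemainderSketch, body and split are hypothetical names introduced for this example.

```scala
import scala.util.parsing.combinator.RegexParsers

object RemainderSketch extends RegexParsers {
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Hypothetical helper: exactly n word characters.
  def body(n: Int): Parser[String] = ("""\w{""" + n + "}").r

  def message: Parser[String] = length >> (n => body(n))

  // Returns the parsed message and the unconsumed input, or None on failure.
  def split(input: String): Option[(String, String)] =
    parse(message, input) match {
      case Success(msg, rest) =>
        val src = rest.source
        Some((msg, src.subSequence(rest.offset, src.length()).toString))
      case _ => None
    }
}
```

Because parse does not require the parser to reach the end of the input, the reader returned with a successful result points at the first character after the message.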
