Scala parser - message length
I'm toying with Scala's Parser library. I am trying to write a parser for a format where a length is specified followed by a message of that length. For example:
x.parseAll(x.message, "5helloworld") // result: "hello", remaining: "world"
I'm not sure how to do this using combinators. My mind first goes to:
def message = length ~ body
But obviously body depends on length, and I don't know how to do that :p
Instead, you could define the message Parser as a single Parser (not a combination of Parsers), and I think that is doable (although I haven't checked whether a single Parser can consume several elements).
Anyway, I'm a Scala noob, and I just find this awesome :)
Comments (3)
You should use into for that, or its abbreviation, >>. It passes the result of the first parser to a function that returns the parser to run next, so the body parser can be built from the parsed length.
Be careful with precedence when using it. If you add "~ remainder" after the function passed to >>, for instance, Scala will interpret it as
length >> ({ length => ...} ~ remainder)
instead of
(length >> { length => ...}) ~ remainder
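The code for this answer did not survive on this page, so here is a minimal sketch of the >> approach, assuming the scala-parser-combinators module is on the classpath. The object name PrefixedParser and the helpers body, remainder, and messageAndRest are hypothetical names chosen for illustration; only RegexParsers, >>, ~, and ^^ come from the library.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical object and parser names, for illustration only.
object PrefixedParser extends RegexParsers {
  // Treat whitespace as ordinary input instead of skipping it.
  override val skipWhitespace = false

  // Decimal length prefix, converted to an Int.
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Exactly n characters; (?s) lets '.' match line breaks as well.
  def body(n: Int): Parser[String] = s"(?s).{$n}".r

  // `>>` hands the parsed length to a function that builds the body parser.
  def message: Parser[String] = length >> { n => body(n) }

  // Whatever is left after the message.
  def remainder: Parser[String] = "(?s).*".r

  // `~` binds tighter than `>>`, so the parentheses around the `>>`
  // expression are needed to keep `~ remainder` outside the function.
  def messageAndRest: Parser[(String, String)] =
    (length >> { n => body(n) }) ~ remainder ^^ { case b ~ r => (b, r) }
}
```

With these definitions, PrefixedParser.parse(PrefixedParser.message, "5helloworld") should succeed with "hello" and leave "world" unconsumed.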
This does not sound like a context-free language, so you will need to use flatMap on the length parser. Here length is of type Parser[Int], and bodyOfLength(n), the parser that flatMap continues with, would be based on repN.
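The original snippets for this answer are missing here, so the following is only a sketch of what flatMap plus repN might look like, assuming scala-parser-combinators is available. The names length and bodyOfLength follow the answer; the object name FlatMapParser and the exact body of bodyOfLength are assumptions.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical object name; `length` and `bodyOfLength` follow the answer's naming.
object FlatMapParser extends RegexParsers {
  // Treat whitespace as ordinary input instead of skipping it.
  override val skipWhitespace = false

  // Decimal length prefix as a Parser[Int].
  def length: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Accept any single character n times, then join the characters into a String.
  def bodyOfLength(n: Int): Parser[String] =
    repN(n, elem("any char", _ => true)) ^^ (_.mkString)

  // flatMap makes the second parser depend on the first parser's result.
  def message: Parser[String] = length.flatMap(n => bodyOfLength(n))
}
```

Since >> is the abbreviation of into, which sequences parsers the same way, length >> bodyOfLength would be an equivalent way to write message.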
I wouldn't use parser combinators for this purpose. But if you have to, or if the problem becomes more complex, you could try the following approach.
Don't use parseAll if you want some input to remain; use parse.
You could parse the length, store the result in a mutable field x (ugly, I know, but useful here), and parse the body x times. You then get the parsed String, and the rest remains in the parser.
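As a rough sketch of that idea (the answer's own code is not preserved on this page), the example below stores the length in a mutable field x and uses parse rather than parseAll so that trailing input is allowed. It assumes scala-parser-combinators; the object name MutableLengthParser and the helper parsePrefix are hypothetical.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical object and helper names; only the field name `x` comes from the answer.
object MutableLengthParser extends RegexParsers {
  // Treat whitespace as ordinary input instead of skipping it.
  override val skipWhitespace = false

  // Mutable field holding the most recently parsed length (not thread-safe).
  private var x = 0

  // Parse the decimal prefix and remember it in x as a side effect.
  def length: Parser[Int] = """\d+""".r ^^ { s => x = s.toInt; x }

  // Read x only when the parser actually runs, i.e. after `length` has succeeded.
  def body: Parser[String] =
    Parser { in => repN(x, elem("any char", _ => true))(in) } ^^ (_.mkString)

  def message: Parser[String] = length ~> body

  // `parse` accepts a prefix of the input; the unconsumed part is read from `next`.
  def parsePrefix(input: String): Option[(String, String)] =
    parse(message, input) match {
      case Success(result, next) =>
        val rest = next.source.subSequence(next.offset, next.source.length()).toString
        Some((result, rest))
      case _ => None
    }
}
```

Here parseAll(message, "5helloworld") would report a failure because "world" is left over, while parsePrefix("5helloworld") should return the message together with the remaining text. The flatMap or >> style avoids the shared mutable state and is probably the safer choice when parsers are reused.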