How to do conditional checks using parser combinators
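I am trying to write a simple HTML template engine (for fun), and want to parse a structure like this:

A. Normal lines are HTML.
B. If a line starts with $, treat it as a line of Java code:
$ if (isSuper) {
<span>Are you wearing red underwear?</span>
$ }
C. If ${} wraps multiple lines, everything inside it should be Java code.
D. If a line starts with $include, do some trick on the line (call another template):
$include anotherTemplate(id, name)
This will create a new instance of anotherTemplate and call its render() method.
E. There will be more "commands" besides $include, such as $def and $val.

How can I express this in parser combinators? In effect it is a conditional fork. For the $ line and ${...} block cases (B. and C.), I got something like this:
'$' ~> ( '{' ~> upto('}') <~ '}' | not('{') <~ newline )
where upto is borrowed from the Scalate Scaml parser (which I have just started reading and can't quite understand).
I used not('{') to distinguish a $.... code line from a ${...} block. But this is cumbersome, and won't extend to other "commands".
So how can I do this?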
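For readers who, like the asker, find upto opaque: the following is a minimal sketch of what an upto-style combinator might do, assuming the scala-parser-combinators library (RegexParsers). It is not Scalate's actual implementation; the object name UptoSketch and the block parser are hypothetical, introduced only for illustration. The idea is to collect characters one at a time for as long as the stop parser would not match, without consuming the stop marker itself.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical illustration; not taken from Scalate.
object UptoSketch extends RegexParsers {
  // Whitespace and newlines are significant inside a template.
  override def skipWhitespace: Boolean = false

  // Collect every character up to (but not including) the point
  // where `stop` would succeed. `not(stop)` consumes no input.
  def upto(stop: Parser[Any]): Parser[String] =
    rep(not(stop) ~> elem("any character", _ => true)) ^^ (_.mkString)

  // A ${ ... } block: everything between "${" and the first '}'.
  def block: Parser[String] = "${" ~> upto('}') <~ '}'
}
```

Because upto never consumes the closing brace, the trailing <~ '}' is what actually checks that the block was closed; an unclosed block fails at end of input.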
Comments (1)
Your use of not is redundant. The | method implements ordered choice; the second alternative is tried only if the first has failed. This should do the trick:
def directive: Parser[Directive] =
  ( '$' ~> ( '{' ~> javaStuff <~ '}'
           | "include" ~> includeDirective
           | "def" ~> defDirective
           | "val" ~> valDirective
           | javaDirective
           )
  | htmlDirective
  )

def templateFile: Parser[List[Directive]] = (directive <~ '\n').*

For faster parsing and better error messages, you should "commit" your parsers as often as possible. I think this is what you were trying to get at when you used not('{').

Right now, if the above parser sees a '$' followed by a '{' and then doesn't see javaStuff, it'll backtrack and consider each of the four remaining '$'-alternatives in order (include, def, val, and finally javaDirective), and then backtrack to before the '$' to try htmlDirective, before failing with a baffling error message. But if we see a '{', we know that none of the other alternatives could possibly succeed, so why should we check them? Likewise, a line that starts with '$' can never be an htmlDirective.

We want things like '{' to be points of no backtrack; if the after-'{' parser fails and wants to backtrack, we should stop it in its tracks and propagate the backtrack-causing failure directly to the user as an error.

The way to do this is with commit. This function/combinator, when applied to a parser p, looks at the ParseResult coming out of p and changes it to an Error (the give-up-entirely signal) if it was originally a Failure (the backtrack signal), leaving it unchanged otherwise. With appropriate use of commit, the directive parser can stop backtracking at exactly these points.

When I first learned to use the parsing library, I found it really helpful to look at the source code for Parsers; it makes some of this stuff a bit more clear.

(Some other tips: The purpose of append and ParseResult#append is to decide which failure from a sequence of parse-alternatives should be propagated to the user. Just ignore those for now. Also, I wouldn't worry too much about >> / flatMap / into until you've gotten some more practice; when it's time, read Daniel Sobral's explanation. Finally, I've never had to use |||, and you probably won't either. Happy parsing!)

Hope this helps.
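As an illustration of the commit advice above, here is a minimal runnable sketch of a committed directive parser, assuming the scala-parser-combinators library (RegexParsers). Only directive, templateFile and the sub-parser names come from the answer; the Directive case classes, the regex-based bodies of the sub-parsers, and the parseTemplate helper are hypothetical placeholders, and the placement of the two commit calls (after '$' and after '{') is one reading of the explanation rather than the author's exact code.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical AST; the answer only names the type Directive.
sealed trait Directive
case class JavaBlock(code: String) extends Directive
case class Include(call: String) extends Directive
case class Def(body: String) extends Directive
case class Val(body: String) extends Directive
case class JavaLine(code: String) extends Directive
case class Html(text: String) extends Directive

object TemplateParser extends RegexParsers {
  // The format is line-oriented, so whitespace must not be skipped.
  override def skipWhitespace: Boolean = false

  // Placeholder sub-parsers (assumptions, deliberately simplistic).
  def restOfLine: Parser[String] = """[^\n]*""".r
  def javaStuff: Parser[String] = """[^}]*""".r // no nested braces
  def includeDirective: Parser[Directive] = restOfLine ^^ (s => Include(s.trim))
  def defDirective: Parser[Directive] = restOfLine ^^ (s => Def(s.trim))
  def valDirective: Parser[Directive] = restOfLine ^^ (s => Val(s.trim))
  def javaDirective: Parser[Directive] = restOfLine ^^ (s => JavaLine(s.trim))
  def htmlDirective: Parser[Directive] = restOfLine ^^ (s => Html(s))

  def directive: Parser[Directive] =
    ( '$' ~> commit( // after '$' this can no longer be HTML
          '{' ~> commit(javaStuff <~ '}') ^^ (s => JavaBlock(s.trim))
        | "include" ~> includeDirective
        | "def" ~> defDirective
        | "val" ~> valDirective
        | javaDirective
      )
    | htmlDirective
    )

  def templateFile: Parser[List[Directive]] = (directive <~ '\n').*

  // Hypothetical helper: distinguishes Failure (backtrackable) from Error.
  def parseTemplate(src: String): Either[String, List[Directive]] =
    parseAll(templateFile, src) match {
      case Success(ds, _)  => Right(ds)
      case Failure(msg, _) => Left("failure: " + msg)
      case Error(msg, _)   => Left("error: " + msg)
    }
}
```

With this sketch, an unclosed block such as "${ int x = 1;\n" is reported as an Error from inside the committed '{' branch. Without the two commit calls, the same input would backtrack through the remaining alternatives and be accepted as an ordinary $ code line, which is the kind of silent misparse the answer warns about.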