Parsing CSS 2.1 in ANTLR using the correct CSS parsing conventions
The CSS2.1 grammar comes with a strong advisory not to build a parser directly from it, "since it does not express the parsing conventions, only the CSS 2.1 syntax."
Indeed, any parser that ignores these parsing conventions (as we have tried to do) runs into problems when dealing with pages containing errors or unknown constructs.
Therefore, we'd like our CSS2.1 ANTLR parser, which does not currently follow the forward-compatible and error-handling parsing conventions, to somehow use the parse tree generated by the basic grammar that does incorporate those conventions. (That parse tree could perhaps be generated by another ANTLR parser.)
Is this a reasonable approach? Are there well understood techniques for doing this?
To reiterate, the goal is to produce a robust CSS2.1 parser that can handle errors and new constructs gracefully, in accordance with the CSS parsing conventions.
Comments (1)
We went with the general approach described in the question, which we thought might work; it did.
Briefly, we have two ANTLR parsers: one for the core CSS grammar, and another for the CSS2.1 grammar. The CSS2.1 parser can be executed independently of the core CSS parser. However, that is not how it is actually used.
The core CSS parser is used to construct a basic parse tree. The rule actions re-parse the text using the appropriate entry points of the CSS 2.1 grammar, to produce the same C# objects that the CSS2.1 grammar would have produced when executed standalone. For example, the ruleset action in the core CSS parser re-parses the matched text using the ruleset entry point in the CSS 2.1 grammar, and adds the resulting objects to its result.
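The two-pass idea can be illustrated with a minimal sketch in plain Python. This is not the ANTLR/C# implementation described here: `split_statements` stands in for the core-grammar pass, `parse_ruleset_css21` is a hypothetical, deliberately strict stand-in for the CSS2.1 ruleset entry point, and all names are invented for illustration. The sketch ignores strings and comments, and it drops a whole ruleset when any declaration is malformed, which is coarser than what the CSS parsing conventions ask for.

```python
import re
from dataclasses import dataclass, field


@dataclass
class RuleSet:
    selector: str
    declarations: dict = field(default_factory=dict)


class Css21Error(Exception):
    """Raised when text is rejected by the (toy) strict grammar."""


# Hypothetical strict patterns standing in for the CSS2.1 grammar.
_RULESET = re.compile(r"\s*([^{}@;]+?)\s*\{([^{}]*)\}\s*\Z")
_DECL = re.compile(r"\s*([a-zA-Z-]+)\s*:\s*([^:;{}]+?)\s*\Z")


def parse_ruleset_css21(text):
    """Strict pass: parse one ruleset or raise Css21Error."""
    m = _RULESET.match(text)
    if m is None:
        raise Css21Error(text)
    ruleset = RuleSet(selector=m.group(1))
    for part in m.group(2).split(";"):
        if not part.strip():
            continue  # tolerate empty declarations such as a trailing ';'
        d = _DECL.match(part)
        if d is None:
            raise Css21Error(part)
        ruleset.declarations[d.group(1)] = d.group(2)
    return ruleset


def split_statements(css):
    """Core pass: yield (kind, text) per top-level statement.

    Only brace nesting is tracked, so unknown constructs are skipped
    as a unit instead of derailing the rest of the stylesheet.
    """
    i, n = 0, len(css)
    while i < n:
        if css[i].isspace():
            i += 1
            continue
        start, depth = i, 0
        is_at = css[i] == "@"
        while i < n:
            c = css[i]
            i += 1
            if c == "{":
                depth += 1
            elif c == "}":
                depth -= 1
                if depth <= 0:
                    break  # end of the statement's block
            elif c == ";" and depth == 0 and is_at:
                break  # block-less at-rule such as @import
        yield ("at-rule" if is_at else "ruleset", css[start:i])


def parse_stylesheet(css):
    """Re-parse each ruleset found by the core pass with the strict parser."""
    rules = []
    for kind, text in split_statements(css):
        if kind == "at-rule":
            continue  # at-rules are simply skipped in this sketch
        try:
            rules.append(parse_ruleset_css21(text))
        except Css21Error:
            pass  # invalid under the strict grammar: ignore it
    return rules
```

Calling `parse_stylesheet` on a stylesheet that mixes valid rulesets with an unknown at-rule and a malformed ruleset returns only the valid rulesets, because each failure is contained within the statement boundaries found by the core pass.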
A couple of important points that took us a lot of time to figure out:
ANTLR parser rules that are called from external code handle EOF differently from rules that are called by other rules.
The core CSS grammar needs to be augmented, without violating the parsing conventions, depending on which level of CSS is actually being targeted. One example is the @media at-rule, whose block contains rulesets that need to be parsed as far as possible using the parsing conventions before being handed over to the CSS2.1 parser.
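The @media case can be sketched roughly as follows, again in plain Python rather than ANTLR. `split_media_rule` plays the role of the augmented core pass, and `is_css21_ruleset` is a hypothetical strict check standing in for the CSS2.1 parser; both names and the regular expression are assumptions made for illustration, and strings and comments are not handled.

```python
import re

# Hypothetical strict check standing in for the CSS2.1 ruleset entry point.
_STRICT = re.compile(r"[^{}@;]+\{\s*([a-zA-Z-]+\s*:\s*[^:;{}]+;?\s*)*\}\Z")


def is_css21_ruleset(text):
    return _STRICT.match(text) is not None


def split_media_rule(text):
    """Core pass over '@media <list> { ... }'.

    Returns (media_list, inner_ruleset_texts). Inner rulesets are found
    by brace matching only, so a malformed one cannot swallow its siblings.
    """
    head, sep, rest = text.partition("{")
    head = head.strip()
    if not head.startswith("@media") or not sep:
        raise ValueError("not an @media rule with a block")
    media = [m.strip() for m in head[len("@media"):].split(",") if m.strip()]

    inner, depth, start = [], 0, None
    for i, c in enumerate(rest):
        if start is None and not c.isspace() and c != "}":
            start = i  # first character of the next inner statement
        if c == "{":
            depth += 1
        elif c == "}":
            if depth == 0:
                break  # this brace closes the @media block itself
            depth -= 1
            if depth == 0:
                inner.append(rest[start:i + 1])
                start = None
    return media, inner
```

A caller would then keep only the inner rulesets the strict parser accepts, for example `[r for r in inner if is_css21_ruleset(r)]`, so that one malformed ruleset inside the @media block does not invalidate the others.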
Hope this is helpful to others looking to do the same thing.