Need help with LALR(1) parsing

Posted on 2024-11-16 19:43:48


I am trying to parse a context-free language called Context Free Art. I have written its parser in JavaScript using JSCC, a YACC-like LALR(1) parser generator for JavaScript.

Take the following CFA (Context Free Art) code as an example. It is valid CFA.

startshape A
rule A { CIRCLE { s 1} }

Notice the A and the s above. s is a command that scales the CIRCLE, whereas A is just the name of this rule. In the language's grammar I have defined s as the token SCALE, while A falls under the token STRING (I have a regular expression that matches strings, and it sits at the bottom of all the token definitions).

This works fine, but it breaks in the case below.

startshape s
rule s { CIRCLE { s 1} }

This too is perfectly valid code, but because my parser marks the s after rule as a SCALE token, it errors out saying that it was expecting STRING.

Now my question: is there any way to rewrite the production rules of the parser to account for this? The relevant production rule is:
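To illustrate why the second example fails, here is a minimal sketch of a context-blind tokenizer. It is not JSCC's lexer: the `tokenize` function, the `TOKEN_PATTERNS` table, and the STARTSHAPE and PUNCT token names are assumptions made up for this example. It only shows that when keyword patterns are tried before the catch-all STRING pattern, every standalone s becomes SCALE, regardless of where it appears.

```javascript
// Hypothetical token table, ordered like the grammar described above:
// keywords first, the catch-all STRING pattern near the bottom.
// STARTSHAPE and PUNCT are invented names for this sketch.
const TOKEN_PATTERNS = [
  ["STARTSHAPE", /^startshape\b/],
  ["RULE", /^rule\b/],
  ["SCALE", /^s\b/],
  ["RATIONAL", /^\d+(\.\d+)?/],
  ["STRING", /^[A-Za-z_][A-Za-z0-9_]*/],
  ["PUNCT", /^[{}]/],
];

// Splits the source into { type, text } tokens. The first pattern that
// matches wins, and the tokenizer has no idea what the parser expects next.
function tokenize(source) {
  const tokens = [];
  let rest = source.trimStart();
  while (rest.length > 0) {
    const hit = TOKEN_PATTERNS
      .map(([type, re]) => [type, re.exec(rest)])
      .find(([, match]) => match !== null);
    if (!hit) throw new Error("Unexpected input: " + rest.slice(0, 10));
    tokens.push({ type: hit[0], text: hit[1][0] });
    rest = rest.slice(hit[1][0].length).trimStart();
  }
  return tokens;
}

// The rule name is STRING in the first example but SCALE in the second.
console.log(tokenize("rule A { CIRCLE { s 1} }").map(t => t.type).join(" "));
console.log(tokenize("rule s { CIRCLE { s 1} }").map(t => t.type).join(" "));
```

In the second call the token right after RULE has type SCALE, which is exactly the token the production `RULE STRING '{' ...` cannot accept.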

rule:
    RULE STRING '{' buncha_replacements '}'    [* rule(%2, 1) *]
    |
    RULE STRING RATIONAL '{' buncha_replacements '}'  [* rule(%2, 1*%3) *]
    ;

One simple solution I can think of is to create a copy of the above rule with STRING replaced by SCALE, but this is just one of many similar rules that would need such fixing. Furthermore, there are many other terminals whose text could also match STRING. That would mean far too many rules!


放低过去 2024-11-23 19:43:48


Yes! The solution to my problem finally hit me. All I need to do is change the production above to:

rule:
    RULE user_string '{' buncha_replacements '}'    [* rule(%2, 1) *]
    |
    RULE user_string RATIONAL '{' buncha_replacements '}'  [* rule(%2, 1*%3) *]
    ;

user_string:
    STRING | SCALE ;

This is a pretty elegant solution compared to the one I mentioned in the question. If anybody has a better solution, please comment.
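For readers who want to see what the `user_string` non-terminal buys, here is a rough hand-written equivalent in JavaScript. It is only a sketch of the idea, not code generated by JSCC: `parseRuleHeader`, `USER_STRING_TYPES`, and the `{ type, text }` token shape are assumptions for illustration. The point is that the set of token types allowed as a user-chosen name lives in one place, so covering another keyword token means adding one alternative rather than duplicating every rule.

```javascript
// Token types accepted wherever a user-chosen name is expected,
// mirroring `user_string: STRING | SCALE`. Extend this set (or the
// grammar alternative) for other keyword tokens that may double as names.
const USER_STRING_TYPES = new Set(["STRING", "SCALE"]);

// Parses `rule <name> [weight] {` from an array of { type, text } tokens
// (a hypothetical token shape) and returns { name, weight }.
function parseRuleHeader(tokens) {
  let i = 0;
  const expect = (accepts, what) => {
    const tok = tokens[i++];
    if (!tok || !accepts(tok)) {
      throw new Error("Expected " + what + " but got " + (tok ? tok.type : "end of input"));
    }
    return tok;
  };

  expect(t => t.type === "RULE", "RULE");
  const name = expect(t => USER_STRING_TYPES.has(t.type), "user_string").text;

  // Optional weight, defaulting to 1 as in the first grammar alternative.
  let weight = 1;
  if (tokens[i] && tokens[i].type === "RATIONAL") {
    weight = 1 * tokens[i++].text; // same numeric coercion as 1*%3
  }

  expect(t => t.text === "{", "'{'");
  return { name, weight };
}

// `rule s {` is now accepted even though s was tokenized as SCALE.
console.log(parseRuleHeader([
  { type: "RULE", text: "rule" },
  { type: "SCALE", text: "s" },
  { type: "PUNCT", text: "{" },
]));
```

Whether the grammar version stays free of LALR(1) conflicts depends on the rest of the grammar, which is not shown here. If SCALE and a user-chosen name can both legally appear at the same point, the parser generator may report a conflict that needs to be resolved separately.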
