ANTLR parsing problem

Posted on 2024-08-14 10:10:32

I need to match a certain opening string ('[', then any number of equals signs, possibly none, then '['), and then, after some other match rules, the matching close bracket (']', then the same number of equals signs, then ']'). (The other match rules are (options{greedy=false;}:.)* if you must know.) I have no clue how to do this in ANTLR. How can I do it?

An example: I need to match [===[whatever arbitrary text ]===] but not [===[whatever arbitrary text ]==].

I need to do this for an arbitrary number of equals signs, and therein lies the problem: how do I get it to match the same number of equals signs in the close as in the open? The parser rules supplied so far don't seem to help.

Comments (2)

哀由 2024-08-21 10:10:33

You can't easily write a lexer for it; you need parser rules. Two rules should be sufficient: one responsible for matching the braces, and one for matching the equals signs.

Something like this:

braces : '[' ']'
       | '[' equals ']'
       ;

equals : '=' equals '='
       | '=' braces '='
       ;

This should cover the use case you described. I'm not absolutely sure, but you may have to use a predicate in the first alternative of 'equals' to avoid ambiguous interpretations.

Edit:

It is hard to integrate your non-greedy rule and at the same time avoid a lexer context switch or something similar (which is hard in ANTLR). But if you are willing to embed a little bit of Java in your grammar, you can write a lexer rule.

The following example grammar shows how:

grammar TestLexer;

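SPECIAL
    :   '[' { int counter = 0; }          // number of '=' seen in the opener
        ( '=' { counter++; } )+
        '['
        ( options{greedy=false;} : . )*   // body, consumed non-greedily
        ']'
        ( '=' { counter--; } )+
        // reject the token if the opener and closer counts differ
        { if (counter != 0) throw new RecognitionException(input); }
        ']'
    ;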

ID  :   ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
    ;

WS  :   ( ' '
        | '\t'
        | '\r'
        | '\n'
        ) {$channel=HIDDEN;}
    ;

rule    :   ID
    |   SPECIAL
    ;
最偏执的依靠 2024-08-21 10:10:33

Your tags mention lexing, but your question itself doesn't. What you're trying to do is non-regular, so I don't think it can be done as part of lexing (though I don't remember if ANTLR's lexer is strictly regular -- it's been a couple of years since I last used ANTLR).

What you describe should be possible in parsing, however. Here's the grammar for what you described:

thingy : LBRACKET middle RBRACKET;
middle : EQUAL middle EQUAL
       | LBRACKET RBRACKET;
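
To see why this recursion forces the counts to agree, here is a minimal recursive-descent sketch of the same two rules in plain Java. The class name ThingyRecognizer and its methods are hypothetical, and it works on raw characters rather than ANTLR tokens. Like the grammar as written, it allows nothing between the inner brackets, so it accepts [==[]==] but has no place for arbitrary text.

```java
// Hypothetical recognizer mirroring the two parser rules:
//   thingy : '[' middle ']'
//   middle : '=' middle '=' | '[' ']'
final class ThingyRecognizer {
    private final String input;
    private int pos = 0;

    private ThingyRecognizer(String input) {
        this.input = input;
    }

    /** True if the whole string is derived from the rule `thingy`. */
    static boolean recognizes(String s) {
        ThingyRecognizer r = new ThingyRecognizer(s);
        return r.thingy() && r.pos == s.length();
    }

    private boolean thingy() {
        return eat('[') && middle() && eat(']');
    }

    private boolean middle() {
        // Every '=' consumed on the way in must be paired with one on the
        // way out, so the counts on both sides are forced to be equal.
        if (eat('=')) {
            return middle() && eat('=');
        }
        return eat('[') && eat(']');
    }

    /** Consumes c if it is the next character; otherwise leaves pos alone. */
    private boolean eat(char c) {
        if (pos < input.length() && input.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }
}
```

Each call to middle that consumes an opening '=' only succeeds if it can also consume a closing '=' after the nested call returns, which is the pairing a regular expression cannot express.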