String literal token produces MismatchedTokenException once escape sequences are added

Posted 2024-12-10 02:28:29

I am currently trying to implement an ANTLR parser.
Once I add escape sequence support, I get a strange MismatchedTokenException from the token that matches string literals.

The following ANTLR grammar causes the issue:

rule: STRING_LITERAL ;

STRING_LITERAL
  :
  '"' STRING_GUTS '"'
  ;

fragment
STRING_GUTS
  :
  ( ESC | ~('\\' | '"') )*
  ;

ESC
  :
  '\\'
  ( '\\' | '"' )
  ;

Do you see any issue in this code?

Note that if I remove ESC from STRING_GUTS, string parsing works fine.

Answer by 别低头，皇冠会掉, 2024-12-17 02:28:29

You'll have to post the input that triggers this error, the ANTLR version you're using, and the way you're running your tests, because I see no problem with that grammar, as the following test shows:

T.g

grammar T;

rule
  :  STRING_LITERAL {System.out.println("parsed : " + $STRING_LITERAL.text);}
  ;

STRING_LITERAL 
  :  '"' STRING_GUTS '"'
  ;

fragment
STRING_GUTS
  :  (ESC | ~('\\' | '"'))*
  ;

// also a fragment rule perhaps?
ESC
  :  '\\' ('\\' | '"')
  ;

Main.java

import org.antlr.runtime.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String src = "\"a\\\"b\\\\c\"";
    TLexer lexer = new TLexer(new ANTLRStringStream(src));
    TParser parser = new TParser(new CommonTokenStream(lexer));
    System.out.println("src    : " + src);
    parser.rule();
  }
}

If I generate a lexer and parser from your grammar (1), compile all Java source files (2) and run the Main class (3):

java -cp antlr-3.3.jar org.antlr.Tool T.g    # 1
javac -cp antlr-3.3.jar *.java               # 2
java -cp .;antlr-3.3.jar Main                # 3 (Windows; on Linux/macOS use ':' instead of ';')

The following is printed to the console:

src    : "a\"b\\c"
parsed : "a\"b\\c"

That is, the input src is parsed as expected.

If you're encountering problems with ANTLRWorks' interpreter: don't use it, it's a bit buggy. Either use ANTLRWorks' debugger, or use a custom class as I did above.
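
To check which inputs the STRING_LITERAL rule is meant to accept without involving ANTLR at all, the rule can be approximated with a plain java.util.regex pattern. This is only a sketch: the class name StringLiteralCheck and the method isStringLiteral are made up for illustration, and the regex is a hand translation of the grammar, not something ANTLR generates.

```java
import java.util.regex.Pattern;

public class StringLiteralCheck {
    // Hand-written regex approximation of the lexer rules (an assumption, not ANTLR output):
    //   STRING_LITERAL : '"' ( ESC | ~('\\' | '"') )* '"'
    //   ESC            : '\\' ( '\\' | '"' )
    private static final Pattern STRING_LITERAL =
        Pattern.compile("\"(?:\\\\[\\\\\"]|[^\\\\\"])*\"");

    // Returns true if the whole input is a single string literal under the rules above.
    public static boolean isStringLiteral(String input) {
        return STRING_LITERAL.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Same input as in Main.java: "a\"b\\c"
        String src = "\"a\\\"b\\\\c\"";
        System.out.println(src + " -> " + isStringLiteral(src));

        // A backslash followed by 'n' is not covered by ESC, so this is rejected.
        String other = "\"a\\nb\"";
        System.out.println(other + " -> " + isStringLiteral(other));
    }
}
```

The first input matches and the second does not, because ESC only defines the escapes \\ and \". If the input that fails for you contains any other escape sequence, that would be one thing to check before suspecting the grammar itself.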
