What is the reason this ANTLR grammar does not match this input?

Posted on 2024-10-18 17:45:58


I apologize in advance for asking such a similar question, but I'm rather frustrated, and I can probably explain the problem better in a new question.

I'm trying to rewrite parts of a structured file and thought of using ANTLR.
These files consist of lines of {X} tokens.
There is a character I'm looking ahead for, so that I can rewrite parts of the file differently if I find it.
This character ('#') can occur in many parts of the file. However, if it appears in the 4th field, as {#}, it determines whether I need to rewrite part of the next {X} one way, another way, or not at all (if there is nothing there).

Typical input:

{ 1 }{ Where to? # }{ Where to? }{ # }{ }{ G.Cabbie_Line = 1 }{ }{ }{ }{ }{ }{ }{ }
{ 2 }{ Just drive. }{ Just drive. }{ 0 }{ }{ npc.WorldMap( G.WorldMap_State ) }{ }{ }{ }{ }{ }{ }{ Not here. }

(I added the first # so you can see that it can appear there.)
My grammar (ANTLR 3.3) reports "line 1:31 no viable alternative at input '{ # }'" and "line 2:35 no viable alternative at input '{ 0 }'":

grammar VampireDialog;

options
{
output=AST;
ASTLabelType=CommonTree;
language=Java;
} 
tokens
{
REWRITE;
}

@parser::header {
import java.util.LinkedList;
import java.io.File;
}

@members {
//the lookahead type i'm using ( ()=> ) wraps everything after in a if, including the actions ( {} )
//that i need to use to prepare the arguments for the replace rules. Declare them global.
    String condition, command, wrappedCommand; boolean isEmpty, alreadyProcessed;

    public static void main(String[] args) throws Exception {
        File vampireDir = new File(System.getProperty("user.home"), "Desktop/Vampire the Masquerade - Bloodlines/Vampire the Masquerade - Bloodlines/Vampire/dlg/dummy");
        
        List<File> files = new LinkedList<File>();
        getFiles(256, new File[]{vampireDir}, files, new LinkedList<File>());
        for (File f : files) {
            if (f.getName().endsWith(".dlg")) {
                System.out.println(f.getName());
                VampireDialogLexer lex = new VampireDialogLexer(new ANTLRFileStream(f.getAbsolutePath(), "Windows-1252"));
                TokenRewriteStream tokens = new TokenRewriteStream(lex);
                VampireDialogParser parser = new VampireDialogParser(tokens);
                    Tree t = (Tree) parser.dialog().getTree();
                    System.out.println(t.toStringTree());
            }
        }
    }

    public static void getFiles(int levels, File[] search, List<File> files, List<File> directories) {
        for (File f : search) {
            if (!f.exists()) {
                throw new AssertionError("Search file array has non-existing files");
            }
        }
        getFilesAux(levels, search, files, directories);
    }

    private static void getFilesAux(int levels, File[] startFiles, List<File> files, List<File> directories) {
        List<File[]> subFilesList = new ArrayList<File[]>(50);
        for (File f : startFiles) {
            File[] subFiles = f.listFiles();
            if (subFiles == null) {
                files.add(f);
            } else {
                directories.add(f);
                subFilesList.add(subFiles);
            }
        }

        if (levels > 0) {
            for (File[] subFiles : subFilesList) {
                getFilesAux(levels - 1, subFiles, files, directories);
            }
        }
    }
}




/*------------------------------------------------------------------
 * PARSER RULES
 *------------------------------------------------------------------*/
dialog : (ANY ANY ANY npc_or_pc ANY* NL*)*;

npc_or_pc : (ANY ANY) =>
         pc_marker  pc_condition
|        npc_marker npc_condition;


pc_marker  :  t=ANY {!t.getText().trim().isEmpty() && !t.getText().contains("#")}?;
npc_marker :  t=ANY {!t.getText().trim().isEmpty() &&  t.getText().contains("#")}?;

pc_condition : '{' condition_text '}'
   { 
     condition = $condition_text.tree.toStringTree();
     isEmpty = condition.trim().isEmpty();
     command = "npc.Count()";
     wrappedCommand  =  "("+condition+") and "+ command;
     alreadyProcessed = condition.endsWith(command);
   }
   -> {alreadyProcessed}?   '{' condition_text '}'
   -> {isEmpty}?            '{' REWRITE[command] '}'
   ->                       '{' REWRITE[wrappedCommand] '}';

npc_condition : '{' condition_text '}'
   { 
     condition = $condition_text.tree.toStringTree();
     isEmpty = condition.trim().isEmpty();
     command = "npc.Reset()";
     wrappedCommand  =  "("+condition+") and "+ command;
     alreadyProcessed = condition.endsWith(command);
   }
   -> {alreadyProcessed}?   '{' condition_text '}'
   -> {isEmpty}?            '{' REWRITE[command] '}'
   ->                       '{' REWRITE[wrappedCommand] '}';

marker_text :    TEXT;
condition_text : TEXT;


/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/
//in the parser ~('#') means: "match any token except the token that matches '#'" 
//and in lexer rules ~('#') means: "match any character except '#'"


TEXT : ~('{'|NL|'}')*;
ANY : '{' TEXT '}';
NL : ( '\r' | '\n'| '\u000C');

我恋#小黄人 2024-10-25 17:45:58

There are 3 things going wrong in your grammar:

Problems

#1

In your npc_or_pc rule:

npc_or_pc 
  :  (ANY ANY)=> pc_marker  pc_condition 
  |              npc_marker npc_condition
  ;

you shouldn't be looking ahead for ANY ANY, because that would satisfy both pc_marker and npc_marker. You should look ahead for pc_marker followed by ANY (or pc_condition).
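
To see why a type-only lookahead cannot tell the two markers apart, the two semantic predicates can be restated as plain Java. This is only a sketch: the class and method names are made up for illustration, and the methods take the token text as a String rather than an ANTLR Token. Note that, as far as I can tell, the text of an ANY token always includes its braces, so the trim().isEmpty() part of the predicate never fires for such tokens; the sketch keeps the predicates exactly as written in the grammar.

```java
// Plain-Java restatement of the pc_marker / npc_marker predicates.
// Hypothetical helper class, not part of the generated parser.
class MarkerPredicateSketch {

    // Mirrors: {!t.getText().trim().isEmpty() && !t.getText().contains("#")}?
    static boolean isPcMarker(String tokenText) {
        return !tokenText.trim().isEmpty() && !tokenText.contains("#");
    }

    // Mirrors: {!t.getText().trim().isEmpty() && t.getText().contains("#")}?
    static boolean isNpcMarker(String tokenText) {
        return !tokenText.trim().isEmpty() && tokenText.contains("#");
    }

    public static void main(String[] args) {
        // The 4th field of each sample line. Both are ANY tokens, so a
        // lookahead that only checks token types sees no difference.
        String[] fourthFields = {"{ # }", "{ 0 }"};
        for (String field : fourthFields) {
            System.out.println(field
                    + "  pc=" + isPcMarker(field)
                    + "  npc=" + isNpcMarker(field));
        }
    }
}
```

Only the text of the token separates the two cases, which is why the lookahead has to go through pc_marker (and its predicate) instead of a bare ANY.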

#2

In both your pc_condition and npc_condition rules:

pc_condition 
  :  '{' condition_text '}'
  ;

npc_condition 
  :  '{' condition_text '}'
  ;

you're using the tokens { and }, but the lexer will never create such tokens. As soon as the lexer sees a {, it goes on to match the rest of the ANY rule (TEXT '}'), so the only tokens the lexer produces are of type ANY and NL. Those are the only tokens available to the parser, which brings us to problem 3:
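
The effect can be illustrated with a small hand-written tokenizer that imitates the ANY and NL lexer rules. This is a sketch only, not the code ANTLR generates: the class name and the "ANY:..." string format are invented for the example, and it assumes the input consists solely of brace-delimited fields and line breaks.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written imitation of the lexer rules ANY and NL (illustrative only).
class BraceTokenizerSketch {

    // NL : ( '\r' | '\n' | '\u000C' );
    static boolean isNewline(char c) {
        return c == '\r' || c == '\n' || c == '\f';
    }

    /** Returns token descriptions such as "ANY:{ # }" or "NL". */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (isNewline(c)) {
                tokens.add("NL");
                i++;
            } else if (c == '{') {
                // ANY : '{' TEXT '}' : braces and text end up in ONE token.
                int j = i + 1;
                while (j < input.length() && input.charAt(j) != '{'
                        && input.charAt(j) != '}' && !isNewline(input.charAt(j))) {
                    j++;
                }
                if (j == input.length() || input.charAt(j) != '}') {
                    throw new IllegalArgumentException("unterminated field at index " + i);
                }
                tokens.add("ANY:" + input.substring(i, j + 1));
                i = j + 1;
            } else {
                throw new IllegalArgumentException("unexpected character at index " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("{ 1 }{ Where to? # }{ # }\n")) {
            System.out.println(token);
        }
    }
}
```

Every brace is swallowed by an ANY entry, so a parser rule that asks for a standalone '{', '}' or TEXT token has nothing to match.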

#3

In your rules marker_text and condition_text:

marker_text    : TEXT;
condition_text : TEXT;

you're using the token TEXT, which will never be a part of the token stream (see #2).

Solutions

#1

Change the look ahead to look for pc_marker instead:

npc_or_pc 
  :  (pc_marker ... )=> pc_marker  ...
  |                     npc_marker ...
  ;

#2

Remove both the pc_condition and npc_condition rules and replace them by ANY tokens:

npc_or_pc 
  :  (pc_marker ANY)=> pc_marker  ANY
  |                    npc_marker ANY
  ;

#3

Remove both the marker_text and condition_text rules: you no longer need them, since pc_condition and npc_condition are already gone.

Demo

Here's your modified grammar:

grammar VampireDialog;

dialog 
  :  (line {System.out.print($line.text);})* EOF
  ;

line
  :  ANY ANY ANY npc_or_pc ANY* NL+
  ;

npc_or_pc 
  :  (pc_marker ANY)=> pc_marker  ANY {System.out.print("PC  :: ");}
  |                    npc_marker ANY {System.out.print("NPC :: ");}
  ;


pc_marker  
  :  t=ANY {!t.getText().trim().isEmpty() && !t.getText().contains("#")}?
  ;

npc_marker 
  :  t=ANY {!t.getText().trim().isEmpty() &&  t.getText().contains("#")}?
  ;

TEXT : ~('{'|NL|'}')*;
ANY  : '{' TEXT '}';
NL   : ( '\r' | '\n'| '\u000C');

or even the slightly shorter equivalent:

grammar VampireDialog;

dialog 
  :  (line {System.out.print($line.text);})* EOF
  ;

line
  :  ANY ANY ANY npc_or_pc ANY+ NL+
  ;

npc_or_pc 
  :  (pc_marker ANY)=> pc_marker {System.out.print("PC  :: ");}
  |                    ANY       {System.out.print("NPC :: ");}
  ;

pc_marker  
  :  t=ANY {!t.getText().trim().isEmpty() && !t.getText().contains("#")}?
  ;

ANY  : '{' ~('{'|NL|'}')* '}';
NL   : ( '\r' | '\n'| '\u000C');

which can be tested with:

import org.antlr.runtime.*;

public class Main {
    public static void main(String[] args) throws Exception {
        String source = 
                "{ 1 }{ Where to? }{ Where to? }{ # }{ }{ G.Cabbie_Line = 1 }{ }{ }{ }{ }{ }{ }{ }\n" + 
                "{ 2 }{ Just drive. }{ Just drive. }{ 0 }{ }{ npc.WorldMap( G.WorldMap_State ) }{ }{ }{ }{ }{ }{ }{ Not here. }\n";
        ANTLRStringStream in = new ANTLRStringStream(source);
        VampireDialogLexer lexer = new VampireDialogLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        VampireDialogParser parser = new VampireDialogParser(tokens);
        parser.dialog();
    }
}

which will print the following to the console:

NPC :: { 1 }{ Where to? }{ Where to? }{ # }{ }{ G.Cabbie_Line = 1 }{ }{ }{ }{ }{ }{ }{ }
PC  :: { 2 }{ Just drive. }{ Just drive. }{ 0 }{ }{ npc.WorldMap( G.WorldMap_State ) }{ }{ }{ }{ }{ }{ }{ Not here. }
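
The demo only classifies each line; it does not perform the rewrite that the actions in the question's pc_condition and npc_condition rules were aiming for. A rough plain-Java sketch of that step, applied to the text of the ANY token that follows the marker, might look like this. The class and method names are hypothetical, the field is assumed to have the form "{...}", and the spacing of the rewritten field is my own choice; the commands and the three cases are taken from the question's grammar.

```java
// Sketch of the condition rewrite from the question's grammar actions.
// Hypothetical helper, independent of ANTLR.
class ConditionRewriteSketch {

    /** Picks the command the question's grammar uses for each marker kind. */
    static String commandFor(String markerField) {
        return markerField.contains("#") ? "npc.Reset()" : "npc.Count()";
    }

    /** Rewrites one condition field such as "{ }" or "{ G.x == 1 }". */
    static String rewriteCondition(String field, String command) {
        // Assumption: the field starts with '{' and ends with '}'.
        String condition = field.substring(1, field.length() - 1).trim();
        if (condition.endsWith(command)) {
            return field;                         // already processed: keep as is
        }
        if (condition.isEmpty()) {
            return "{ " + command + " }";         // empty: insert the bare command
        }
        return "{ (" + condition + ") and " + command + " }";
    }

    public static void main(String[] args) {
        String marker = "{ # }";                  // 4th field of the first sample line
        String condition = "{ }";                 // 5th field of the first sample line
        System.out.println(rewriteCondition(condition, commandFor(marker)));
    }
}
```

In the ANTLR version, the same string could be computed in an action after the second ANY in npc_or_pc; how it is written back (for example through the TokenRewriteStream the question already sets up) is left open here.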
