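What is the reason this ANTLR grammar doesn't match this input?
I apologize in advance for asking such a similar question, but I'm rather frustrated, and I can probably explain the problem better in a new question.
I'm trying to rewrite parts of a structured file, and thought of using ANTLR. These files are lines of {X} tokens. There is a character I'm looking ahead for, so that I can rewrite parts of the file differently if I find it. But this character ('#') can occur in many parts of the file. However, if it appears in the 4th token ({ # }), it determines whether I need to rewrite part of the next {X} in one way, in another way, or not at all (if there is nothing there).
Typical input:
{ 1 }{ Where to? # }{ Where to? }{ # }{ }{ G.Cabbie_Line = 1 }{ }{ }{ }{ }{ }{ }{ }
{ 2 }{ Just drive. }{ Just drive. }{ 0 }{ }{ npc.WorldMap( G.WorldMap_State ) }{ }{ }{ }{ }{ }{ }{ Not here. }
(I added the first # so you can see that it can be there.)
My grammar, for ANTLR 3.3, is below. It warns "line 1:31 no viable alternative at input '{ # }'" and "line 2:35 no viable alternative at input '{ 0 }'".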
grammar VampireDialog;

options
{
    output=AST;
    ASTLabelType=CommonTree;
    language=Java;
}

tokens
{
    REWRITE;
}

@parser::header {
    import java.util.LinkedList;
    import java.io.File;
}

@members {
    //the lookahead type i'm using ( ()=> ) wraps everything after in a if, including the actions ( {} )
    //that i need to use to prepare the arguments for the replace rules. Declare them global.
    String condition, command, wrappedCommand; boolean isEmpty, alreadyProcessed;

    public static void main(String[] args) throws Exception {
        File vampireDir = new File(System.getProperty("user.home"), "Desktop/Vampire the Masquerade - Bloodlines/Vampire the Masquerade - Bloodlines/Vampire/dlg/dummy");
        List<File> files = new LinkedList<File>();
        getFiles(256, new File[]{vampireDir}, files, new LinkedList<File>());
        for (File f : files) {
            if (f.getName().endsWith(".dlg")) {
                System.out.println(f.getName());
                VampireDialogLexer lex = new VampireDialogLexer(new ANTLRFileStream(f.getAbsolutePath(), "Windows-1252"));
                TokenRewriteStream tokens = new TokenRewriteStream(lex);
                VampireDialogParser parser = new VampireDialogParser(tokens);
                Tree t = (Tree) parser.dialog().getTree();
                System.out.println(t.toStringTree());
            }
        }
    }

    public static void getFiles(int levels, File[] search, List<File> files, List<File> directories) {
        for (File f : search) {
            if (!f.exists()) {
                throw new AssertionError("Search file array has non-existing files");
            }
        }
        getFilesAux(levels, search, files, directories);
    }

    private static void getFilesAux(int levels, File[] startFiles, List<File> files, List<File> directories) {
        List<File[]> subFilesList = new ArrayList<File[]>(50);
        for (File f : startFiles) {
            File[] subFiles = f.listFiles();
            if (subFiles == null) {
                files.add(f);
            } else {
                directories.add(f);
                subFilesList.add(subFiles);
            }
        }
        if (levels > 0) {
            for (File[] subFiles : subFilesList) {
                getFilesAux(levels - 1, subFiles, files, directories);
            }
        }
    }
}

/*------------------------------------------------------------------
 * PARSER RULES
 *------------------------------------------------------------------*/
dialog : (ANY ANY ANY npc_or_pc ANY* NL*)*;

npc_or_pc : (ANY ANY) =>
            pc_marker pc_condition
          | npc_marker npc_condition;

pc_marker : t=ANY {!t.getText().trim().isEmpty() && !t.getText().contains("#")}?;

npc_marker : t=ANY {!t.getText().trim().isEmpty() && t.getText().contains("#")}?;

pc_condition : '{' condition_text '}'
    {
        condition = $condition_text.tree.toStringTree();
        isEmpty = condition.trim().isEmpty();
        command = "npc.Count()";
        wrappedCommand = "("+condition+") and "+ command;
        alreadyProcessed = condition.endsWith(command);
    }
    -> {alreadyProcessed}? '{' condition_text '}'
    -> {isEmpty}? '{' REWRITE[command] '}'
    -> '{' REWRITE[wrappedCommand] '}';

npc_condition : '{' condition_text '}'
    {
        condition = $condition_text.tree.toStringTree();
        isEmpty = condition.trim().isEmpty();
        command = "npc.Reset()";
        wrappedCommand = "("+condition+") and "+ command;
        alreadyProcessed = condition.endsWith(command);
    }
    -> {alreadyProcessed}? '{' condition_text '}'
    -> {isEmpty}? '{' REWRITE[command] '}'
    -> '{' REWRITE[wrappedCommand] '}';

marker_text : TEXT;

condition_text : TEXT;

/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/
//in the parser ~('#') means: "match any token except the token that matches '#'"
//and in lexer rules ~('#') means: "match any character except '#'"
TEXT : ~('{'|NL|'}')*;
ANY : '{' TEXT '}';
NL : ( '\r' | '\n'| '\u000C');
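
For reference, the three-way decision that the actions in pc_condition and npc_condition are meant to encode (leave the cell alone, insert the command, or wrap the existing condition) can be written in plain Java. This is only a sketch of the intent, not a replacement for the grammar: the class and method names are made up for illustration, and trimming the condition before testing and wrapping it is an assumption of the sketch (the actions in the grammar work on the untrimmed text).

```java
final class ConditionRewriteSketch {

    /**
     * Returns the new text for a condition cell.
     * command is "npc.Count()" for a PC line or "npc.Reset()" for an NPC line,
     * as in the grammar's actions.
     */
    static String rewrite(String condition, String command) {
        String trimmed = condition.trim();          // assumption: ignore padding inside the braces
        if (trimmed.endsWith(command)) {
            return trimmed;                         // already processed: keep the cell as it is
        }
        if (trimmed.isEmpty()) {
            return command;                         // empty cell: the command on its own
        }
        return "(" + trimmed + ") and " + command;  // otherwise wrap the existing condition
    }

    public static void main(String[] args) {
        System.out.println(rewrite(" G.Cabbie_Line = 1 ", "npc.Reset()")); // (G.Cabbie_Line = 1) and npc.Reset()
        System.out.println(rewrite(" ", "npc.Count()"));                   // npc.Count()
        System.out.println(rewrite("(x) and npc.Count()", "npc.Count()")); // (x) and npc.Count()
    }
}
```

Keeping the decision in one small function like this makes it easy to check separately from the parsing, whichever way the cells end up being recognized.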
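Comments (1)

There are 3 things going wrong in your grammar:

Problems

#1
In your npc_or_pc rule: you shouldn't be looking ahead for ANY ANY, because that would satisfy both pc_marker and npc_marker. You should look ahead for pc_marker followed by ANY (or pc_condition).

#2
In both your pc_condition and npc_condition rules: you're using the tokens { and }, but the lexer will never create such tokens. As soon as the lexer sees a {, it will always be followed by TEXT '}', so the only tokens the lexer produces will be of type ANY and NL. Those are the only tokens available to the parser, which brings us to problem #3.

#3
In your rules marker_text and condition_text: you're using the token TEXT, which will never be part of the token stream (see #2).

Solutions

#1
Change the look ahead to look for pc_marker instead of ANY ANY.

#2
Remove both the pc_condition and npc_condition rules and replace them by ANY tokens.

#3
Remove both the marker_text and condition_text rules; you don't need them anymore since you already removed pc_condition and npc_condition.

Demo
[The modified grammar, a slightly shorter equivalent, a test class and its console output were given here as code listings, which are missing from this copy.]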
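
To make problems #2 and #3 concrete, here is a minimal sketch in plain Java that approximates what the three lexer rules (TEXT, ANY, NL) do with a line of input. It is a hand-written approximation, not ANTLR's generated lexer, and the class and method names are invented for illustration. It shows that for the sample input every character ends up inside an ANY or NL token, so the parser never sees a separate {, } or TEXT token.

```java
import java.util.ArrayList;
import java.util.List;

final class LexerSketch {

    private static boolean isBraceOrNewline(char c) {
        // '\f' is the form feed character, written as '\u000C' in the grammar
        return c == '{' || c == '}' || c == '\r' || c == '\n' || c == '\f';
    }

    /** Returns the token types, in order, for the given input. */
    static List<String> tokenTypes(String input) {
        List<String> types = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '{') {
                // ANY : '{' TEXT '}' swallows the braces and everything between them
                int j = i + 1;
                while (j < input.length() && !isBraceOrNewline(input.charAt(j))) {
                    j++;
                }
                if (j == input.length() || input.charAt(j) != '}') {
                    throw new IllegalArgumentException("'{' without matching '}' at index " + i);
                }
                types.add("ANY");
                i = j + 1;
            } else if (c == '}') {
                throw new IllegalArgumentException("stray '}' at index " + i);
            } else if (isBraceOrNewline(c)) {
                types.add("NL"); // only the newline characters can reach this branch
                i++;
            } else {
                // TEXT only shows up for characters that sit outside any braces
                while (i < input.length() && !isBraceOrNewline(input.charAt(i))) {
                    i++;
                }
                types.add("TEXT");
            }
        }
        return types;
    }

    public static void main(String[] args) {
        String line = "{ 1 }{ Where to? # }{ Where to? }{ # }{ }"
                + "{ G.Cabbie_Line = 1 }{ }{ }{ }{ }{ }{ }{ }\n";
        // Prints a list that contains only ANY entries followed by one NL
        System.out.println(tokenTypes(line));
    }
}
```

Since the sample lines have no characters outside the braces, a parser rule that asks for '{', '}' or TEXT as separate tokens has nothing to match, which is consistent with the "no viable alternative" messages.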
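
Problem #1 is that a lookahead on token types alone cannot tell the two alternatives apart: every cell is an ANY token, so (ANY ANY)=> succeeds for PC and NPC lines alike. What separates them is the text of the fourth cell. A minimal Java sketch of that distinction, independent of ANTLR, might look like the following. The class, enum and method names are invented for illustration, and stripping the braces before the emptiness check is an assumption of this sketch: an ANY token's text includes its braces, so trimming it alone never yields an empty string.

```java
final class MarkerSketch {

    enum Kind { PC, NPC, NONE }

    /** Text between the braces of an ANY token such as "{ # }", without padding. */
    static String inner(String anyText) {
        return anyText.substring(1, anyText.length() - 1).trim();
    }

    /** Mirrors the two marker predicates: non-empty with '#' is NPC, non-empty without is PC. */
    static Kind classify(String anyText) {
        String text = inner(anyText);
        if (text.isEmpty()) {
            return Kind.NONE;
        }
        return text.contains("#") ? Kind.NPC : Kind.PC;
    }

    public static void main(String[] args) {
        for (String marker : new String[] {"{ # }", "{ 0 }", "{ }"}) {
            System.out.println(marker + " -> " + classify(marker));
        }
        // { # } -> NPC
        // { 0 } -> PC
        // { } -> NONE
    }
}
```

In the grammar, the equivalent move is to let the lookahead run the marker rule with its predicate, as solution #1 suggests, rather than only checking that two ANY tokens follow.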