ANTLR: Lexer rules that strictly accept one-letter and multi-character tokens, rather than just one character (Java)
I've written the grammar below for an ANTLR parser and lexer that build trees for logical formulae, and I have a couple of questions, if someone could help:
class AntlrFormulaParser extends Parser;

options {
    buildAST = true;
}

biconexpr : impexpr (BICONDITIONAL^ impexpr)*;
impexpr   : orexpr (IMPLICATION^ orexpr)*;
orexpr    : andexpr (DISJUNCTION^ andexpr)*;
andexpr   : notexpr (CONJUNCTION^ notexpr)*;
notexpr   : (NEGATION^)? formula;

formula
    : atom
    | LEFT_PAREN! biconexpr RIGHT_PAREN!
    ;

atom
    : CHAR
    | TRUTH
    | FALSITY
    ;

class AntlrFormulaLexer extends Lexer;

// Atoms
CHAR: 'a'..'z';
TRUTH: ('\u22A4' | 'T');
FALSITY: ('\u22A5' | 'F');

// Grouping
LEFT_PAREN: '(';
RIGHT_PAREN: ')';

NEGATION: ('\u00AC' | '~' | '!');
CONJUNCTION: ('\u2227' | '&' | '^');
DISJUNCTION: ('\u2228' | '|' | 'V');
IMPLICATION: ('\u2192' | "->");
BICONDITIONAL: ('\u2194' | "<->");

WHITESPACE : (' ' | '\t' | '\r' | '\n') { $setType(Token.SKIP); };
The tree grammar:
tree grammar AntlrFormulaTreeParser;

options {
    tokenVocab=AntlrFormula;
    ASTLabelType=CommonTree;
}

expr returns [Formula f]
    : ^(BICONDITIONAL f1=expr f2=expr) {
        $f = new Biconditional(f1, f2);
    }
    | ^(IMPLICATION f1=expr f2=expr) {
        $f = new Implication(f1, f2);
    }
    | ^(DISJUNCTION f1=expr f2=expr) {
        $f = new Disjunction(f1, f2);
    }
    | ^(CONJUNCTION f1=expr f2=expr) {
        $f = new Conjunction(f1, f2);
    }
    | ^(NEGATION f1=expr) {
        $f = new Negation(f1);
    }
    | CHAR {
        $f = new Atom($CHAR.getText());
    }
    | TRUTH {
        $f = Atom.TRUTH;
    }
    | FALSITY {
        $f = Atom.FALSITY;
    }
    ;
The problems I'm having with the above grammar are these:
In the generated Java code for AntlrFormulaLexer, the IMPLICATION and BICONDITIONAL tokens seem to check only their respective first character (i.e. '-' and '<') to match the token, instead of the whole string specified in the grammar.
When testing the generated Java code for AntlrFormulaParser, passing a string such as "~ab" returns the tree "(~ a)" (and the string "ab&c" returns just "a"). It should really produce an error/exception, since an atom can only be a single letter according to the above grammar, but no error or exception is raised for these sample strings.
I'd really appreciate it if someone could help me solve these two problems. Thank you :)
Comments (1)
I would change the following definitions as follows:
Note "->" vs '->'.
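To illustrate what matching the whole literal means, independent of the quoting detail noted above, here is a minimal, dependency-free Java sketch. It is not ANTLR-generated code and not the answer's grammar. The class `MiniLexer` and its `tokenize` method are hypothetical names introduced only for this example. The sketch accepts "->" and "<->" only when the complete string is present, and rejects a lone '-' or '<'.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hand-written lexer; a sketch, not what ANTLR generates. */
class MiniLexer {

    /** Splits the input into token names, failing on anything it cannot match in full. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
                i++; // skipped, like the WHITESPACE rule
            } else if (input.startsWith("<->", i)) {
                tokens.add("BICONDITIONAL"); // all three characters must be present
                i += 3;
            } else if (input.startsWith("->", i)) {
                tokens.add("IMPLICATION");   // both characters must be present
                i += 2;
            } else if (c >= 'a' && c <= 'z') {
                tokens.add("CHAR(" + c + ")"); // exactly one letter per atom
                i++;
            } else {
                tokens.add(singleCharToken(c, i));
                i++;
            }
        }
        return tokens;
    }

    /** Maps the remaining one-character alternatives from the question's lexer rules. */
    private static String singleCharToken(char c, int index) {
        switch (c) {
            case '\u22A4': case 'T': return "TRUTH";
            case '\u22A5': case 'F': return "FALSITY";
            case '(': return "LEFT_PAREN";
            case ')': return "RIGHT_PAREN";
            case '\u00AC': case '~': case '!': return "NEGATION";
            case '\u2227': case '&': case '^': return "CONJUNCTION";
            case '\u2228': case '|': case 'V': return "DISJUNCTION";
            case '\u2192': return "IMPLICATION";
            case '\u2194': return "BICONDITIONAL";
            default:
                // A lone '-' or '<' ends up here because the full literal did not match.
                throw new IllegalArgumentException(
                        "unexpected character '" + c + "' at index " + index);
        }
    }
}
```

With this sketch, `MiniLexer.tokenize("a->b")` yields the three tokens CHAR(a), IMPLICATION, CHAR(b), while `MiniLexer.tokenize("a-b")` throws, because '-' on its own is not a token.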
And to solve the error issue:
From here: http://www.antlr.org/wiki/pages/viewpage.action?pageId=4554943
Fixed grammar to compile against ANTLR 3.3 (save as AntlrFormula.g):
Link to the ANTLR 3.3 binary: http://www.antlr.org/download/antlr-3.3-complete.jar
You will need to match the program rule in order to match the complete file.
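The effect of requiring the whole input to match can be shown without ANTLR. The following is a minimal Java sketch, assuming a reduced grammar with only negation, conjunction, parentheses, and single-letter atoms, and no whitespace handling. The class `MiniParser` and the methods `parsePrefix` and `parseAll` are hypothetical names for this example, not part of ANTLR or of the answer's grammar. `parsePrefix` behaves like a start rule that stops once it has a formula, and `parseAll` behaves like a start rule that must also reach the end of the input.

```java
/** Hypothetical recursive-descent recognizer for a subset of the formula grammar. */
class MiniParser {
    private final String in;
    private int pos = 0;

    private MiniParser(String in) {
        this.in = in;
    }

    /** Parses one formula and stops; returns how many characters were consumed. */
    static int parsePrefix(String s) {
        MiniParser p = new MiniParser(s);
        p.andExpr();
        return p.pos; // leftover input is silently ignored
    }

    /** Parses one formula and then insists that nothing is left over. */
    static void parseAll(String s) {
        MiniParser p = new MiniParser(s);
        p.andExpr();
        if (p.pos != s.length()) {
            throw new IllegalArgumentException(
                    "unexpected '" + s.charAt(p.pos) + "' at index " + p.pos);
        }
    }

    // andexpr : notexpr ('&' notexpr)*
    private void andExpr() {
        notExpr();
        while (peek() == '&') {
            pos++;
            notExpr();
        }
    }

    // notexpr : ('~')? formula
    private void notExpr() {
        if (peek() == '~') {
            pos++;
        }
        formula();
    }

    // formula : 'a'..'z' | '(' andexpr ')'
    private void formula() {
        char c = peek();
        if (c >= 'a' && c <= 'z') {
            pos++;
        } else if (c == '(') {
            pos++;
            andExpr();
            if (peek() != ')') {
                throw new IllegalArgumentException("expected ')' at index " + pos);
            }
            pos++;
        } else {
            throw new IllegalArgumentException("expected atom or '(' at index " + pos);
        }
    }

    /** Returns the current character, or '\0' at end of input. */
    private char peek() {
        return pos < in.length() ? in.charAt(pos) : '\0';
    }
}
```

Here `MiniParser.parsePrefix("~ab")` consumes only "~a" and `MiniParser.parsePrefix("ab&c")` consumes only "a", mirroring the trees reported in the question, whereas `MiniParser.parseAll` throws for both strings. In an ANTLR grammar, the usual way to get the second behavior is to end the start rule with the EOF token.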
Testable with this class: