JavaCC problem - generated code doesn't find all parse errors
I've just started with JavaCC, and I'm seeing some strange behaviour. I want to verify input in the form of tokens (letters and numbers) which are concatenated with signs (+, -, /) and which can contain parentheses.
I hope that was understandable :)
In the main method there is a string which should produce an error, because it has one opening but two closing parentheses, but I do not get a parse exception --> Why?
Does anybody have a clue why I don't get the exception?
I was struggling with left recursion and choice conflicts in my initial attempt, but managed to get over them. Maybe that's where I introduced the problem?!
Oh - and maybe my solution is not very good - ignore this fact... or better, give some advice ;-)
File: CodeParser.jj
options {
  STATIC=false;
}

PARSER_BEGIN(CodeParser)
package com.testing;

import java.io.StringReader;
import java.io.Reader;

public class CodeParser {

  public CodeParser(String s)
  {
    this((Reader)(new StringReader(s)));
  }

  public static void main(String args[])
  {
    try
    {
      /** String has one open, but two closing parenthesis --> should produce parse error */
      String s = "A+BC+-(2XXL+A/-B))";
      CodeParser parser = new CodeParser(s);
      parser.expression();
    }
    catch(Exception e)
    {
      e.printStackTrace();
    }
  }
}
PARSER_END(CodeParser)

TOKEN:
{
  <code : ("-")?(["A"-"Z", "0"-"9"])+ >
  | <op : ("+"|"/") >
  | <not : ("-") >
  | <lparenthesis : ("(") >
  | <rparenthesis : (")") >
}

void expression() :
{
}
{
  negated_expression() | parenthesis_expression() | LOOKAHEAD(2) operator_expression() | <code>
}

void negated_expression() :
{
}
{
  <not>parenthesis_expression()
}

void parenthesis_expression() :
{
}
{
  <lparenthesis>expression()<rparenthesis>
}

void operator_expression() :
{
}
{
  <code><op>expression()
}
Edit - 11/16/2009
Now I gave ANTLR a try.
I changed some terms to better match my problem domain. I came up with the following code (using the answers on this site), which seems to do the work now:
grammar Code;
CODE : ('A'..'Z'|'0'..'9')+;
OP : '+'|'/';
start : terms EOF;
terms : term (OP term)*;
term : '-'? CODE
| '-'? '(' terms ')';
And by the way... ANTLRWORKS is a great tool for debugging/visualizing! Helped me a lot.
Additional info
Above code matches stuff like:
(-Z19+-Z07+((FV+((M005+(M272/M276))/((M278/M273/M642)+-M005)))/(FW+(M005+(M273/M278/M642)))))+(-Z19+-Z07+((FV+((M005+(M272/M276))/((M278/M273/M642/M651)+-M005)))/(FW+(M0))))
Comments (3)
What kgregory says is the right answer. You can see this if you build the grammar with the DEBUG_PARSER option and then run it:
See that? The last token consumed is the second to last character - the second to last right parenthesis.
If you want the exception, again, like kgregory said, you could add a new top-level production called "file" or "data" or something and end it with an <EOF> token. That way any dangling parens like this would cause an error. Here's a grammar that does that:
And a sample run:
Voila! An exception.
From the JavaCC FAQ:
4.7 I added a LOOKAHEAD specification and the warning went away; does that mean I fixed the problem?
No. JavaCC will not report choice conflict warnings if you use a LOOKAHEAD specification. The absence of a warning doesn't mean that you've solved the problem correctly, it just means that you added a LOOKAHEAD specification.
I would start by trying to get rid of the conflict without using a lookahead first.
The problem is that you don't get the error when using the parser, correct? Not that the parser generator is claiming that the grammar is incorrect (which seems to be the discussion in the other answer).
If that's the case, then I suspect that you're seeing the problem because the parser properly matches the expression production, then ignores subsequent input. I haven't used JavaCC for a long time, but iirc it didn't throw an error for not reaching end-of-stream.
Most grammars have an explicit top-level production to match the entire file, looking something like this (I'm sure the syntax is wrong, as I said, it's been a long time):
Or, there's probably an EOF token that you can use, if you want to process just a single expression.
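To make the point above concrete, here is a minimal hand-written Java sketch of the same effect. It is not JavaCC output and not the original poster's code: the class `PrefixCheck` and its methods are hypothetical names, and the grammar it follows is the `terms`/`term` shape from the ANTLR edit in the question. `matchesPrefix` stops as soon as an expression has been recognized, while `matchesWhole` additionally insists that nothing is left over, which is the job an end-of-input token does. In JavaCC the corresponding fix would presumably be a top-level production along the lines of `void start() : {} { expression() <EOF> }`, where `start` is again a made-up name.

```java
// Hypothetical hand-written recognizer; class and method names are illustrative only.
final class PrefixCheck {
    private final String input;
    private int pos = 0;

    private PrefixCheck(String input) {
        this.input = input;
    }

    /** Accepts if some prefix of the input is a valid expression (no end-of-input check). */
    static boolean matchesPrefix(String s) {
        try {
            new PrefixCheck(s).terms();
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    /** Accepts only if the whole input is a valid expression (the EOF-style check). */
    static boolean matchesWhole(String s) {
        try {
            PrefixCheck p = new PrefixCheck(s);
            p.terms();
            if (p.pos != s.length()) {
                throw p.error("end of input");
            }
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // terms : term (OP term)*
    private void terms() {
        term();
        while (peek() == '+' || peek() == '/') {
            pos++;
            term();
        }
    }

    // term : '-'? CODE | '-'? '(' terms ')'
    private void term() {
        if (peek() == '-') {
            pos++;
        }
        if (peek() == '(') {
            pos++;
            terms();
            if (peek() != ')') {
                throw error("')'");
            }
            pos++;
        } else {
            int start = pos;
            while (isCodeChar(peek())) {
                pos++;
            }
            if (pos == start) {
                throw error("a code");
            }
        }
    }

    // Returns the current character, or '\0' once the input is exhausted.
    private char peek() {
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    private static boolean isCodeChar(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
    }

    private IllegalArgumentException error(String expected) {
        return new IllegalArgumentException("Expected " + expected + " at position " + pos);
    }

    public static void main(String[] args) {
        String s = "A+BC+-(2XXL+A/-B))";
        System.out.println(matchesPrefix(s)); // true: parsing stops before the extra ')'
        System.out.println(matchesWhole(s));  // false: the extra ')' is left over
    }
}
```

With the string from the question, the prefix check succeeds because everything up to the first closing parenthesis is a complete expression; only the whole-input check notices the dangling `)`.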