ANTLR: rewriting query text to repeat text using an earlier node
I am new to ANTLR and am trying to parse queries using the following grammar:
grammar SearchEngineQuery;

options { language = CSharp2; output = AST; }

tokens {
    AndNode;
}

LPARENTHESIS : '(';
RPARENTHESIS : ')';

AND    : 'and';
OR     : 'or';
ANDNOT : 'andnot';
NOT    : 'not';
NEAR   : 'near';

fragment CHARACTER : ('a'..'z'|'0'..'9'|'-');
fragment QUOTE     : ('"');
fragment WILDCARD  : ('*'|'?');
fragment SPACE     : (' '|'\n'|'\r'|'\t'|'\u000C');

WILD_STRING
    : (CHARACTER)*
      (
        ('?')
        (CHARACTER)*
      )+
    ;

PREFIX_STRING
    : (CHARACTER)+
      (
        ('*')
      )+
    ;

WS     : (SPACE) { $channel=HIDDEN; };
PHRASE : (QUOTE)(WORD)(WILDCARD)?((SPACE)+(WORD)(WILDCARD)?)*(QUOTE);
WORD   : (CHARACTER)+;

startExpression : nearExpression;
nearExpression  : andExpression (NEAR^ andExpression)*;
andExpression
    : (andnotExpression -> andnotExpression)
      (AND? a=andnotExpression -> ^(AndNode $andnotExpression $a))*
    ;
andnotExpression : orExpression (ANDNOT^ orExpression)*;
orExpression     : notExpression (OR^ notExpression)* ;
notExpression    : (NOT^)? (phraseExpression | wildExpression | prefixExpression | atomicExpression);
phraseExpression : (PHRASE^);
wildExpression   : (WILD_STRING^);
prefixExpression : (PREFIX_STRING^);
atomicExpression : WORD | LPARENTHESIS! andExpression RPARENTHESIS!;
This seems to work OK for general queries. However, the case of
a near (b or c)
actually needs to be handled as:
and
a near (b or c and (d or e))
needs to be handled as:
I am unable to determine how to do this. Any help would be most appreciated.
Thanks
Comments (1)
You could probably achieve this with a multi-pass tree-rewriting grammar.
The rules should be fairly short.
Something similar to this for the OR case:
In topDown, add an action that sets a rewrite flag whenever a rule matches, so you can keep applying this grammar for as long as the rewrite flag is set. I use this to optimize/precalculate math expressions and it works like a charm.
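To make the "apply the pass until the flag stays unset" idea concrete, here is a rough sketch of the control flow in Python rather than as an ANTLR tree grammar. Everything in it is an assumption for illustration: trees are plain nested tuples, the names Rewriter, top_down, run and distribute_near are made up, and the rule itself guesses that the wanted rewrite turns a near (b or c) into (a near b) or (a near c), which the question does not actually show.

```python
# Trees are nested tuples (operator, child, ...); leaves are plain strings.
# This only mimics the control flow of a multi-pass rewrite; it is not ANTLR code.


def distribute_near(node):
    """Hypothetical rule: near(a, or(b, c)) -> or(near(a, b), near(a, c)).

    Returns the rewritten node, or None when the rule does not match.
    """
    if isinstance(node, tuple) and node[0] == "near":
        _, left, right = node
        if isinstance(right, tuple) and right[0] == "or":
            _, b, c = right
            return ("or", ("near", left, b), ("near", left, c))
    return None


class Rewriter:
    def __init__(self):
        # Plays the role of the "rewrite" flag set by the action in topDown.
        self.rewrite = False

    def top_down(self, node):
        """One top-down pass: try the rule at this node, then visit its children."""
        replaced = distribute_near(node)
        if replaced is not None:
            self.rewrite = True
            node = replaced
        if isinstance(node, tuple):
            op, *children = node
            return (op, *(self.top_down(child) for child in children))
        return node

    def run(self, tree):
        """Repeat passes until a whole pass finishes without the flag being set."""
        while True:
            self.rewrite = False
            tree = self.top_down(tree)
            if not self.rewrite:
                return tree


if __name__ == "__main__":
    print(Rewriter().run(("near", "a", ("or", "b", "c"))))
    # ('or', ('near', 'a', 'b'), ('near', 'a', 'c'))
```

A tree such as ("near", "x", ("near", "a", ("or", "b", "c"))) shows why the loop matters: the first pass only rewrites the inner near, and the outer near becomes rewritable on the second pass.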