JavaCC token regular expression: AND_SYMBOL_IN
I need to describe a token that matches a word. The word can contain English letters and some other special symbols, but it shouldn't begin with certain English letters (for example, "O").
It looks like I need an AND_SYMBOL_IN operation or something similar, but I haven't found one in the JavaCC documentation.
I need behavior something like this:
TOKEN : { < LETTERS: (
(~["O", "-"] AND_SYMBOL_IN ["a"-"z","A"-"Z","-",".","&","|","0"-"9"])? (["a"-"z","A"-"Z","-",".","&","|","0"-"9"])+
) > }
I can create a special private token (as below), but I believe there is a nicer solution, isn't there?
TOKEN : { < #LETTEREX: (
["a"-"z","A"-"N","P"-"Z",".","&","|","0"-"9","-"]) > }
TOKEN : { < LETTERS: (
(< LETTEREX > ) (< LETTEREX > | ["O"])+
) > }
Comments (1)
JavaCC resolves ambiguities between equally sized matches using the order that the matching tokens are declared in the grammar. So one possibility is to match the token you don't want before the token you do:
For example:
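A minimal sketch of this ordering approach, assuming the character set from the question. The token name `O_WORD` is hypothetical and not part of the original grammar:

```
// Declared first: a word from the same character set that starts with "O".
// O_WORD is a hypothetical name for the token you do not want.
TOKEN : { < O_WORD: "O" (["a"-"z","A"-"Z","-",".","&","|","0"-"9"])* > }

// Declared second: any non-empty run of the allowed characters.
TOKEN : { < LETTERS: (["a"-"z","A"-"Z","-",".","&","|","0"-"9"])+ > }
```

For a word that starts with "O", both expressions match the same number of characters, so the earlier declaration (`O_WORD`) should win. Any other word can only be matched by `LETTERS`. The parser can then reject `O_WORD` or handle it separately wherever `LETTERS` is expected.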
How suitable this is depends on how many special cases you have and how complicated they are.