Parsing Java Properties with ANTLR
I'm trying to pick up ANTLR by writing a grammar for Java Properties files. I've hit an issue and would appreciate some help.
Java Properties files have somewhat strange escape handling. For example,
key1=1=Key1
key\=2==
results in the following key-value pairs in the Java runtime:
KEY      VALUE
===      =====
key1     1=Key1
key=2    =
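The result above can be checked directly against java.util.Properties. The following is a minimal sketch; the class name EscapedKeyDemo and the helper load are made up for illustration, and only the standard Properties.load(Reader) call is relied on.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class EscapedKeyDemo {
    // Hypothetical helper: parses properties text with the standard JDK loader.
    static Properties load(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        // "\\=" in Java source is the two characters \= in the properties text.
        Properties props = load("key1=1=Key1\nkey\\=2==\n");
        // Print every key with its value, bracketed so the boundaries are visible.
        for (String key : props.stringPropertyNames()) {
            System.out.println("[" + key + "] -> [" + props.getProperty(key) + "]");
        }
    }
}
```

The key ends at the first unescaped separator, so the second line yields the key `key=2` and the value `=`.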
So far, this is the best I can do to mimic that behaviour: folding the '=' and the value into a single token.
grammar Prop;

file : (pair | LINE_COMMENT)* ;

pair : ID VALUE ;

ID : (~('='|'\r'|'\n') | '\\=')* ;

VALUE : '=' (~('\r'|'\n'))* ;

CARRIAGE_RETURN
    : ('\r'|'\n')+ {$channel=HIDDEN;}
    ;

LINE_COMMENT
    : '#' ~('\r'|'\n')* ('\r'|'\n'|EOF)
    ;
Are there any suggestions for a better way to implement this?
Thanks a lot.
Comments (1)
It's not as easy as that. You can't handle much at the lexing level because many things depend on context. So at the lexing level you can only match single characters, and you construct keys and values in parser rules. Also, the fact that `=` and `:` are possible key-value separators and can also be the start of a value makes them a pain in the butt to translate into a grammar. The easiest approach would be to include these (possible) separator chars in your value rule and, after matching the separator and value together, strip the separator chars from it. A small demo:
JavaProperties.g
The grammar above can be tested with the class:
Main.java
and the input file:
test.properties
to produce the following output:
Note that my grammar is just an example: it does not account for all valid properties files (sometimes backslashes should be ignored, there are no Unicode escapes, and many characters are missing from the key and value). For a complete specification of properties files, see:
http://download.oracle.com/javase/6/docs/api/java/util/Properties.html#load%28java.io.Reader%29
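The "match the separator together with the value, then strip it" step described in this answer could look roughly like the sketch below. This is not the code from the original demo: the class SeparatorStripDemo and the helpers stripSeparator and isBlank are hypothetical names, and the sketch assumes the raw text handed to it is whatever a value rule matched, including the leading separator. Escapes, Unicode sequences and line continuations are deliberately left out.

```java
class SeparatorStripDemo {
    // Hypothetical helper: takes text matched as "separator + value"
    // (for example "=1=Key1" or " : foo") and returns only the value part.
    static String stripSeparator(String raw) {
        int i = 0;
        // Skip whitespace in front of the separator.
        while (i < raw.length() && isBlank(raw.charAt(i))) {
            i++;
        }
        // Skip at most one '=' or ':'; any further ones belong to the value.
        if (i < raw.length() && (raw.charAt(i) == '=' || raw.charAt(i) == ':')) {
            i++;
            // Skip whitespace that follows the separator.
            while (i < raw.length() && isBlank(raw.charAt(i))) {
                i++;
            }
        }
        return raw.substring(i);
    }

    // Space, tab and form feed are treated as whitespace here (an assumption
    // based on the Properties.load documentation).
    static boolean isBlank(char c) {
        return c == ' ' || c == '\t' || c == '\f';
    }

    public static void main(String[] args) {
        System.out.println(stripSeparator("=1=Key1")); // 1=Key1
        System.out.println(stripSeparator("=="));      // =
        System.out.println(stripSeparator(" : foo"));  // foo
    }
}
```

In a parser action, a helper like this would be applied to the text of the value rule before the pair is stored, so the grammar itself never has to decide whether a given `=` or `:` is a separator or part of the value.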