Using ANTLR to identify global variable declarations in a JavaScript file
I've been using the ANTLR-supplied ECMAScript grammar with the objective of identifying JavaScript global variables. An AST is produced, and I'm now wondering what the best way of filtering out the global variable declarations is.
I'm interested in finding all of the outermost "variableDeclaration" nodes in my AST, but how to actually do this is eluding me. Here's my setup code so far:
String input = "var a, b; var c;";
CharStream cs = new ANTLRStringStream(input);
JavaScriptLexer lexer = new JavaScriptLexer(cs);
CommonTokenStream tokens = new CommonTokenStream();
tokens.setTokenSource(lexer);
JavaScriptParser parser = new JavaScriptParser(tokens);
program_return programReturn = parser.program();
I'm new to ANTLR; can anyone offer any pointers?
I guess you're using this grammar.
Although that grammar suggests a proper AST is created, this is not the case. It uses some inline operators to exclude certain tokens from the parse-tree, but it never creates any roots for the tree, resulting in a completely flat parse tree. From this, you can't get all global vars in a reasonable way.
You'll need to adjust the grammar slightly:
Add the following under the options { ... } section at the top of the grammar file:
Now replace the following rules: functionDeclaration, functionExpression and variableDeclaration with these:
Now a more suitable tree is generated. If you now parse the source:
the following tree is generated:
All you now have to do is iterate over the children of the root of your tree, and when you stumble upon a VARIABLE token, you know it's a "global", since all other variables will be under FUNCTION nodes. Here's how to do that:
which produces the following output:
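As a rough, self-contained illustration of the traversal described above (not the answer's original listing), the sketch below walks only the direct children of a root node and collects the names under VARIABLE nodes. The Node class, the sampleTree() helper and the string type labels are hypothetical stand-ins so the example runs without the ANTLR runtime. With ANTLR 3 the same loop would presumably run over the CommonTree obtained from programReturn.getTree(), using getChildCount(), getChild(int) and getType(), and comparing against the VARIABLE token type that the adjusted grammar is assumed to generate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GlobalVarSketch {

    // Hypothetical stand-in for an AST node (ANTLR 3 would use CommonTree here).
    static final class Node {
        final String type;  // "VARIABLE", "FUNCTION", or any other label
        final String text;  // token text, e.g. an identifier
        final List<Node> children = new ArrayList<>();

        Node(String type, String text, Node... kids) {
            this.type = type;
            this.text = text;
            this.children.addAll(Arrays.asList(kids));
        }
    }

    // Looks only at the root's direct children: a VARIABLE node found there is
    // global, because function-local declarations sit below a FUNCTION node.
    static List<String> globalVariables(Node root) {
        List<String> names = new ArrayList<>();
        for (Node child : root.children) {
            if (child.type.equals("VARIABLE") && !child.children.isEmpty()) {
                // Assumption: the first child of VARIABLE is the identifier.
                names.add(child.children.get(0).text);
            }
        }
        return names;
    }

    // Hand-built tree standing in for the parse of:
    //   var a, b; function f() { var local; } var c;
    static Node sampleTree() {
        return new Node("ROOT", null,
            new Node("TOKEN", "var"),
            new Node("VARIABLE", null, new Node("ID", "a")),
            new Node("VARIABLE", null, new Node("ID", "b")),
            new Node("FUNCTION", null,
                new Node("ID", "f"),
                new Node("VARIABLE", null, new Node("ID", "local"))),
            new Node("TOKEN", "var"),
            new Node("VARIABLE", null, new Node("ID", "c")));
    }

    public static void main(String[] args) {
        System.out.println(globalVariables(sampleTree())); // prints [a, b, c]
    }
}
```

The declaration of local is skipped because the loop never descends into the FUNCTION node. This mirrors the point made above: once functions get their own subtree, a shallow scan of the root is enough.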