In ANTLR 3, how do I generate a lexer (and parser) at runtime instead of ahead of time?

Posted on 2024-11-02 20:14:11

I want to generate an ANTLR lexer at runtime. That is, I want to generate the grammar and, from that grammar, generate the lexer class and its supporting bits, all at runtime. I am happy to feed the result into the Java compiler, which is accessible at runtime.

Comments (4)

鹤仙姿 2024-11-09 20:14:11

Here's a quick and dirty way to:

  1. generate a combined (!) ANTLR grammar .g file, given a String as the grammar source,
  2. create a Parser and Lexer from this .g file,
  3. compile these Parser and Lexer .java files,
  4. create instances of the Parser and Lexer classes and invoke the entry point of the parser.

Main.java

import java.io.*;
import javax.tools.*;
import java.lang.reflect.*;
import org.antlr.runtime.*;
import org.antlr.Tool;

public class Main {

    public static void main(String[] args) throws Exception {

        // A grammar that echoes the parsed characters to the console,
        // skipping any white space chars.
        final String grammar =
                "grammar T;                                                  \n" +
                "                                                            \n" +
                "parse                                                       \n" +
                "  :  (ANY {System.out.println(\"ANY=\" + $ANY.text);})* EOF \n" +
                "  ;                                                         \n" +
                "                                                            \n" +
                "SPACE                                                       \n" +
                "  :  (' ' | '\\t' | '\\r' | '\\n') {skip();}                \n" +
                "  ;                                                         \n" +
                "                                                            \n" +
                "ANY                                                         \n" +
                "  :  .                                                      \n" +
                "  ;                                                           ";
        final String grammarName = "T";
        final String entryPoint = "parse";

        // 1 - Write the `.g` grammar file to disk (the writer is closed even if the write fails).
        try (Writer out = new BufferedWriter(new FileWriter(new File(grammarName + ".g")))) {
            out.write(grammar);
        }

        // 2 - Generate the lexer and parser.
        Tool tool = new Tool(new String[]{grammarName + ".g"});
        tool.process();

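        // 3 - Compile the lexer and parser. The system compiler is only available
        //     when running on a JDK; on a plain JRE this call returns null.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler available; run this on a JDK");
        }
        compiler.run(null, System.out, System.err, "-sourcepath", "", grammarName + "Lexer.java");
        compiler.run(null, System.out, System.err, "-sourcepath", "", grammarName + "Parser.java");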

        // 4 - Parse the command line parameter using the dynamically created lexer and
        //     parser, both instantiated through reflection.
        Lexer lexer = (Lexer) Class.forName(grammarName + "Lexer").getDeclaredConstructor().newInstance();
        lexer.setCharStream(new ANTLRStringStream(args[0]));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        Class<?> parserClass = Class.forName(grammarName + "Parser");
        Constructor<?> parserCTor = parserClass.getConstructor(TokenStream.class);
        Parser parser = (Parser) parserCTor.newInstance(tokens);
        Method entryPointMethod = parserClass.getMethod(entryPoint);
        entryPointMethod.invoke(parser);
    }
}

After compiling Main.java and running it like this (on *nix):

java -cp .:antlr-3.2.jar Main "a b    c"

or like this on Windows:

java -cp .;antlr-3.2.jar Main "a b    c"

it produces the following output:

ANY=a
ANY=b
ANY=c
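
For anyone who wants to try the compile-and-load part (steps 3 and 4) in isolation, here is a minimal sketch that uses only the JDK and no ANTLR at all. The class name RuntimeCompileSketch, the helper compileAndInvoke and the stand-in Hello class are all made up for illustration; in the real setup the source files would be the generated lexer and parser. Unlike Main.java, it assumes you want the .class files in a temp directory and loaded through a URLClassLoader, so "." does not have to be on the classpath.

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper, not part of ANTLR: compile a source string at runtime and call into it.
final class RuntimeCompileSketch {

    /** Compiles one source file into a fresh temp directory, loads it, and calls a public static no-arg method. */
    static Object compileAndInvoke(String className, String source, String methodName) throws Exception {
        Path dir = Files.createTempDirectory("runtime-compile");
        Path file = dir.resolve(className + ".java");
        Files.write(file, source.getBytes(StandardCharsets.UTF_8));

        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler; run on a JDK, not a JRE");
        }
        // run() returns 0 on success; null streams default to System.in/out/err.
        // "-d" sends the .class files to the same temp directory.
        int status = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
        if (status != 0) {
            throw new IllegalStateException("Compilation failed with status " + status);
        }

        // Load the freshly compiled class from the temp directory and invoke the method reflectively.
        try (URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()})) {
            Class<?> cls = loader.loadClass(className);
            Method method = cls.getMethod(methodName);
            return method.invoke(null);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a generated class; the real input would be TLexer.java / TParser.java.
        String source = "public class Hello { public static String greet() { return \"hi\"; } }";
        System.out.println(compileAndInvoke("Hello", source, "greet"));
    }
}
```

The sketch checks the compiler's return status before loading anything, which Main.java above skips. It leaves the temp directory behind, so clean it up yourself if that matters.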
じ违心 2024-11-09 20:14:11

You'll have to use the org.antlr.Tool class to get this working.

You can check the ANTLRWorks source code on GitHub to get an idea of how to use it, specifically the generate() method:

ErrorListener el = ErrorListener.getThreadInstance();
ErrorManager.setErrorListener(el);

String[] params;
if(debug)
    params = new String[] { "-debug", "-o", getOutputPath(), "-lib", window.getFileFolder(), window.getFilePath() };
else
    params = new String[] { "-o", getOutputPath(), "-lib", window.getFileFolder(), window.getFilePath() };

new File(getOutputPath()).mkdirs();

Tool antlr = new Tool(Utils.concat(params, AWPrefs.getANTLR3Options()));
antlr.process();

boolean success = !el.hasErrors();
if(success) {
    dateOfModificationOnDisk = window.getDocument().getDateOfModificationOnDisk();
}
lastError = el.getFirstErrorMessage();
el.clear();
ErrorManager.removeErrorListener();
return success;
演多会厌 2024-11-09 20:14:11

Have you tried calling org.antlr.Tool.main(String[]) with an appropriate String[] argument?

If that's too cumbersome, you could reverse engineer the Tool class (source code) to figure out how it works, and how to do the specific tasks you need to do.

心如荒岛 2024-11-09 20:14:11

ANTLR 4

Since I couldn’t find a specific topic for ANTLR 4, I decided to share a solution similar to Bart Kiers' answer but tailored for ANTLR 4. This solution is useful if you need to dynamically generate an ANTLR lexer and parser at runtime. However, be cautious: this approach leverages Java Reflection and writes temporary files to the file system, which can be considered "dirty" and risky in certain environments.

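import java.io.IOException;
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.tools.ToolProvider;
import org.antlr.v4.Tool;
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.Lexer;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.TokenStream;

public final class DynamicLexerAndParser {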

    /**
     * Temporary directory.
     * Where the generated ANTLR lexer and parser will be stored.
     */
    private final Path temp;

    /**
     * ANTLR grammar text.
     */
    private final String grammar;

    /**
     * Top rule name. Required to build a parse tree.
     */
    private final String top;

    /**
     * Constructor.
     * @param temp Temporary directory where the generated ANTLR lexer and parser will be stored.
     * @param grammar ANTLR grammar text.
     * @param top Top rule name.
     */
    DynamicLexerAndParser(final Path temp, final String grammar, final String top) {
        this.temp = temp;
        this.grammar = grammar;
        this.top = top;
    }

    /**
     * Dynamically generate lexer and parser.
     * @param code Code example for provided grammar.
     */
    public void generate(final String code) throws NoSuchMethodException, IllegalAccessException, InvocationTargetException, InstantiationException, IOException {
            final String name = this.grammarName();
            // Save grammar to a temp dir
            final Path gpath = this.temp.resolve(String.format("%s.g4", name));
            Files.write(gpath, this.grammar.getBytes(StandardCharsets.UTF_8));
            // Generate Parser.java and Lexer.java
            final Tool tool = new Tool(new String[]{gpath.toString()});
            tool.processGrammarsOnCommandLine();
            // Compile Parser.java and Lexer.java
            ToolProvider.getSystemJavaCompiler().run(
                System.in,
                System.out,
                System.err,
                Files.list(this.temp).filter(Files::isRegularFile)
                    .filter(java -> java.getFileName().toString().endsWith(".java"))
                    .map(Path::toString)
                    .toArray(String[]::new)
            );
            // Generated lexer, fed with the code to parse
            final Lexer lexer = this.lexer(this.load(String.format("%sLexer", name)), code);
            // Generated parser, reading tokens from the lexer above
            final Parser parser = this.parser(this.load(String.format("%sParser", name)), lexer);
    }

    /**
     * Create parser instance.
     * @param cparser Loaded parser class.
     * @param lexer Lexer instance.
     * @return Parser instance.
     */
    private Parser parser(final Class<?> cparser, final Lexer lexer) throws NoSuchMethodException, IllegalAccessException, InvocationTargetException, InstantiationException {
            final Constructor<?> constructor = cparser.getDeclaredConstructor(TokenStream.class);
            return (Parser) constructor.newInstance(new CommonTokenStream(lexer));
    }

    /**
     * Create lexer instance.
     * @param clexer Loaded lexer class.
     * @param code Code to parse.
     * @return Lexer instance.
     */
    private Lexer lexer(final Class<?> clexer, final String code) throws NoSuchMethodException, IllegalAccessException, InvocationTargetException, InstantiationException {
            final Constructor<?> constructor = clexer.getDeclaredConstructor(CharStream.class);
            return (Lexer) constructor.newInstance(CharStreams.fromString(code));
    }

    /**
     * Get grammar name.
     * @return Grammar name.
     */
    String grammarName() {
        final Matcher matcher = Pattern.compile("grammar\\s+(\\w+);").matcher(this.grammar);
        if (matcher.find()) {
            return matcher.group(1);
        } else {
            throw new IllegalStateException("Grammar name not found");
        }
    }

    /**
     * Load class from the temp directory.
     * @param name Class name.
     * @return Loaded class.
     */
    private Class<?> load(String name) {
        try {
            return new URLClassLoader(
                new URL[]{this.temp.toFile().toURI().toURL()}
            ).loadClass(name);
        } catch (final MalformedURLException | ClassNotFoundException exception) {
            throw new IllegalStateException(
                "Something went wrong during class loading",
                exception
            );
        }
    }
}
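
Since the approach above writes the grammar, the generated sources and the class files into a directory on disk, it may help to create that directory per run and delete it afterwards. Below is a minimal sketch of that, using only the JDK. The names TempWorkspaceSketch and deleteRecursively, and the tiny grammar text, are made up for illustration and are not part of ANTLR or of the class above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper: a throwaway working directory for generated files.
final class TempWorkspaceSketch {

    /** Deletes a directory tree, visiting children before their parents. */
    static void deleteRecursively(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse path order puts nested entries ahead of the directories containing them.
            paths.sorted(Comparator.reverseOrder()).forEach(path -> {
                try {
                    Files.delete(path);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        Path temp = Files.createTempDirectory("antlr-generated");
        try {
            // Placeholder grammar file; this is where the real grammar text would be written.
            Files.write(temp.resolve("T.g4"), "grammar T; r : 'a' ;".getBytes(StandardCharsets.UTF_8));
            // ... generate, compile and use the lexer and parser here ...
        } finally {
            deleteRecursively(temp);
        }
        System.out.println("still exists: " + Files.exists(temp));
    }
}
```

A directory made like this could be passed as the temp argument of the DynamicLexerAndParser constructor. One caveat: on Windows, class files that are still held open by a URLClassLoader generally cannot be deleted, so you would want to close the loader before cleaning up.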