Using C++ types in an ANTLR-generated C parser

Posted 2024-08-22 16:40:49

I'm trying to use a parser generated by ANTLR v3.2, with C as the output language, in a C++ project. The generated parser can, in theory, be compiled as C++, but I'm having trouble dealing with C++ types inside parser actions. Here's a C++ header file defining a few types I'd like to use in the parser:

/* expr.h */
#include <string>  // needed for std::string in mkInt

enum Kind {
  PLUS,
  MINUS
};

class Expr { // stub
};

class ExprFactory {
public:
  Expr mkExpr(Kind kind, Expr op1, Expr op2);
  Expr mkInt(std::string n);  
};

And here's a simple parser definition:

/* Expr.g */
grammar Expr;

options {
  language = 'C'; 
}

@parser::includes {
  #include "expr.h"
}

@members {
  ExprFactory *exprFactory;
}

start returns [Expr expr]
  : e = expression EOF { $expr = e; }
  ;

expression returns [Expr e]
  : TOK_LPAREN k=builtinOp op1=expression op2=expression TOK_RPAREN
    { e = exprFactory->mkExpr(k,op1,op2); }
  | INTEGER { e = exprFactory->mkInt((char*)$INTEGER.text->chars); }
  ;

builtinOp returns [Kind kind]
  : TOK_PLUS { kind = PLUS; }
  | TOK_MINUS { kind = MINUS; }
  ;

TOK_PLUS : '+';
TOK_MINUS : '-';
TOK_LPAREN : '(';
TOK_RPAREN : ')';
INTEGER : ('0'..'9')+;

The grammar runs through ANTLR just fine. When I try to compile ExprParser.c, I get errors like

  1. conversion from ‘long int’ to non-scalar type ‘Expr’ requested
  2. no match for ‘operator=’ in ‘e = 0l’
  3. invalid conversion from ‘long int’ to ‘Kind’

In each case, the statement is an initialization of an Expr or Kind value to NULL.
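
The three errors can be reproduced without ANTLR. The following is a minimal sketch, assuming NULL reaches the compiler as a long int (as the messages suggest) and reusing the stub Expr and Kind from expr.h. The constant names and pointerInitWorks are made up for illustration:

```cpp
#include <cassert>
#include <type_traits>

enum Kind { PLUS, MINUS };
class Expr {};  // stub, as in expr.h

// Each flag asks whether one of the failing statements could compile.
constexpr bool kLongToExpr     = std::is_convertible<long, Expr>::value;  // Expr e = NULL;
constexpr bool kLongAssignExpr = std::is_assignable<Expr&, long>::value;  // e = NULL;
constexpr bool kLongToKind     = std::is_convertible<long, Kind>::value;  // Kind k = NULL;

// A pointer, by contrast, accepts a null constant.
inline bool pointerInitWorks() {
    Expr* p = 0;
    return p == 0;
}
```

All three flags are false: a plain class has no conversion from an integer, and C++ (unlike C) does not implicitly convert an integer to an enum. Only the pointer initialization is legal.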

I can make the problem go away for the Expr values by changing everything to Expr*. This is workable, though hardly ideal. But passing around pointers for a simple enum like Kind seems ridiculous. One ugly workaround I've found is to add a second return value, which pushes the Kind value into a struct and suppresses the initialization to NULL. I.e., builtinOp becomes

builtinOp returns [Kind kind, bool dummy]
  : TOK_PLUS { $kind = PLUS; }
  | TOK_MINUS { $kind = MINUS; }
  ;

and, with the operands already changed to Expr*, the first alternative of expression becomes

TOK_LPAREN k=builtinOp op1=expression op2=expression TOK_RPAREN
    { e = exprFactory->mkExpr(k.kind,*op1,*op2); }
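
To picture what the workaround changes, here is a hand-written model of a rule that returns a struct. It is not actual ANTLR output: the struct name BuiltinOpReturn and the single-character token argument are assumptions made for this sketch.

```cpp
#include <cassert>

enum Kind { PLUS, MINUS };

// Hypothetical stand-in for the struct a rule with two return values yields.
struct BuiltinOpReturn {
    Kind kind;
    bool dummy;  // exists only to force the struct form
};

// Hand-written model of builtinOp for a single input character.
inline BuiltinOpReturn builtinOp(char token) {
    BuiltinOpReturn retval;  // declared without "= NULL", so nothing to reject
    retval.dummy = false;
    retval.kind = (token == '+') ? PLUS : MINUS;
    return retval;
}
```

The caller then reads the field, which is why the action uses k.kind rather than k.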

There has to be a better way to do this. Am I missing a configuration option for the C language backend? Is there another way to arrange my grammar to avoid this awkwardness? Is there a pure C++ backend I can use?

Comments (1)

深空失忆 2024-08-29 16:40:49

Here are the solutions I have found to this problem. The crux of the issue is that ANTLR wants to initialize all return values and attributes. For non-primitive types, ANTLR just assumes it can initialize with NULL. So, for example, the expression rule above will be translated into something like

static Expr
expression(pExprParser ctx)
{   
    Expr e = NULL; // Declare and init return value
    Kind k; // declare attributes
    Expr op1, op2;
    k = NULL; // init attributes
    op1 = NULL;
    op2 = NULL;
    ...
}

The choices, as I see them, are these:

  1. Give the values primitive types that can legally be initialized to NULL. E.g., use Expr* and Kind* instead of Expr and Kind.

  2. Use the "dummy" trick, as above, to push the value into a structure where it won't be initialized.

  3. Use reference parameters instead of return values. E.g.,

    builtinOp[Kind& kind]
      : TOK_PLUS { kind = PLUS; }
      | TOK_MINUS { kind = MINUS; }
      ;
    
  4. Augment the classes used as value types with operations that make the above declarations and initializations legal. I.e., for an Expr return value, you need a constructor that can take NULL:

    Expr(long int n);
    

    For an Expr attribute, you need a no-arg constructor and an operator= that can take NULL:

    Expr();
    Expr& operator=(long int n);
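
Option 4 can be sketched as follows. This is only an illustration: the value_ member and isNull are hypothetical, 0L stands in for NULL (the error messages show it arriving as a long int), and operator= returns a reference, the usual C++ convention.

```cpp
#include <cassert>

class Expr {
public:
    Expr() : value_(0) {}                // allows "Expr op1, op2;"
    Expr(long int n) : value_(n) {}      // non-explicit on purpose: allows "Expr e = NULL;"
    Expr& operator=(long int n) {        // allows "op1 = NULL;"
        value_ = n;
        return *this;
    }
    bool isNull() const { return value_ == 0; }
private:
    long int value_;  // hypothetical payload; a real Expr would hold a node
};

// Mirrors the declarations and initializations of the generated rule function.
inline Expr declareLikeGeneratedCode() {
    Expr e = 0L;     // 0L stands in for NULL
    Expr op1, op2;
    op1 = 0L;
    op2 = 0L;
    (void)op1;
    (void)op2;
    return e;
}
```

The constructor has to stay non-explicit, because the generated code uses copy-initialization (Expr e = NULL;) rather than direct initialization.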
    

I know it is pretty hacky, but I'm going with #4 for the time being. It just so happens that my Expr class has a fairly natural definition of these operations.
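
For comparison, the reference-parameter route (option 3) looks roughly like this when modeled by hand. The function names and the single-character token are assumptions for the sketch, not generated code; in the grammar, the invoking rule would have to declare the variable itself and pass it as a rule argument.

```cpp
#include <cassert>

enum Kind { PLUS, MINUS };

// Hand-written model of builtinOp[Kind& kind]: the rule writes through the
// reference, so there is no return value to initialize with NULL.
inline void builtinOp(char token, Kind& kind) {
    if (token == '+') {
        kind = PLUS;
    } else if (token == '-') {
        kind = MINUS;
    }
}

// The caller owns the variable and gives it a legal initial value.
inline Kind parseOp(char token) {
    Kind k = PLUS;
    builtinOp(token, k);
    return k;
}
```

The cost is that each caller carries the declaration, which the return-value form would otherwise have hidden.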

P.S. On the ANTLR list, the maintainer of the C backend hints that this problem may be solved in future releases.
