Using C++ types in an ANTLR-generated C parser
I'm trying to use an ANTLR v3.2-generated parser in a C++ project using C as the output language. The generated parser can, in theory, be compiled as C++, but I'm having trouble dealing with C++ types inside parser actions. Here's a C++ header file defining a few types I'd like to use in the parser:
/* expr.h */
#include <string>  // needed for the std::string parameter of mkInt

enum Kind {
    PLUS,
    MINUS
};
class Expr { // stub
};
class ExprFactory {
public:
    Expr mkExpr(Kind kind, Expr op1, Expr op2);
    Expr mkInt(std::string n);
};
And here's a simple parser definition:
/* Expr.g */
grammar Expr;
options {
    language = 'C';
}
@parser::includes {
    #include "expr.h"
}
@members {
    ExprFactory *exprFactory;
}
start returns [Expr expr]
    : e = expression EOF { $expr = e; }
    ;
expression returns [Expr e]
    : TOK_LPAREN k=builtinOp op1=expression op2=expression TOK_RPAREN
      { e = exprFactory->mkExpr(k,op1,op2); }
    | INTEGER { e = exprFactory->mkInt((char*)$INTEGER.text->chars); }
    ;
builtinOp returns [Kind kind]
    : TOK_PLUS { kind = PLUS; }
    | TOK_MINUS { kind = MINUS; }
    ;
TOK_PLUS : '+';
TOK_MINUS : '-';
TOK_LPAREN : '(';
TOK_RPAREN : ')';
INTEGER : ('0'..'9')+;
The grammar runs through ANTLR just fine. When I try to compile ExprParser.c, I get errors like
conversion from ‘long int’ to non-scalar type ‘Expr’ requested
no match for ‘operator=’ in ‘e = 0l’
invalid conversion from ‘long int’ to ‘Kind’
In each case, the statement is an initialization of an Expr or Kind value to NULL.
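The three diagnostics can be reproduced outside ANTLR. The following is a minimal sketch, assuming that NULL reaches the compiler as a long constant (which is what the 'long int' in the messages suggests). NullLike and the three flag names are hypothetical and exist only for this illustration.

```cpp
#include <type_traits>

enum Kind { PLUS, MINUS };
class Expr {};  // stub, as in expr.h

// Assumption: NULL is seen by the compiler as a long constant (0L).
typedef long NullLike;

// "Expr e = NULL;" needs an implicit conversion from long to Expr.
constexpr bool exprInitFromNull = std::is_convertible<NullLike, Expr>::value;

// "e = NULL;" needs an operator= on Expr that accepts a long.
constexpr bool exprAssignFromNull = std::is_assignable<Expr&, NullLike>::value;

// "Kind k = NULL;" needs an implicit integer-to-enum conversion,
// which C++ (unlike C) does not provide.
constexpr bool kindInitFromNull = std::is_convertible<NullLike, Kind>::value;

// None of the three statements is legal for these types as declared.
static_assert(!exprInitFromNull, "a plain class has no constructor taking long");
static_assert(!exprAssignFromNull, "a plain class has no operator=(long)");
static_assert(!kindInitFromNull, "an enum needs an explicit cast from an integer");
```

Each flag corresponds to one of the error messages above, so the snippet compiles only because all three conversions are in fact unavailable.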
I can make the problem go away for the Expr values by changing everything to Expr*. This is workable, though hardly ideal. But passing around pointers for a simple enum like Kind seems ridiculous. One ugly workaround I've found is to create a second return value, which pushes the Kind value into a struct and suppresses the initialization to NULL. I.e., builtinOp becomes
builtinOp returns [Kind kind, bool dummy]
    : TOK_PLUS { $kind = PLUS; }
    | TOK_MINUS { $kind = MINUS; }
    ;
and the first expression alternative becomes
TOK_LPAREN k=builtinOp op1=expression op2=expression TOK_RPAREN
{ e = exprFactory->mkExpr(k.kind,*op1,*op2); }
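To see why the extra return value helps, here is a rough sketch of the shape the generated code is assumed to take once a rule has more than one return value. BuiltinOpReturn and the builtinOp function below are hypothetical stand-ins written by hand, not actual ANTLR output. The only point is that a struct is declared without a NULL initializer, so Kind is never assigned from a long.

```cpp
enum Kind { PLUS, MINUS };

// Hypothetical stand-in for the struct assumed to be generated when a
// rule declares several return values.
struct BuiltinOpReturn {
    Kind kind;
    bool dummy;  // never read; present only to force the struct form
};

// Hypothetical stand-in for the generated rule function. The token is
// passed as a plain char to keep the sketch free of the ANTLR runtime.
BuiltinOpReturn builtinOp(char token) {
    BuiltinOpReturn retval;   // declared without "= NULL"
    retval.dummy = false;
    // Corresponds to the actions { $kind = PLUS; } and { $kind = MINUS; }.
    retval.kind = (token == '+') ? PLUS : MINUS;
    return retval;
}
```

A caller then reads the field explicitly, which matches the k.kind access in the rewritten alternative.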
There has to be a better way to do this. Am I missing a configuration option to the C language backend? Is there another way to arrange my grammar to avoid this awkwardness? Is there a pure C++ backend I can use?
Comments (1)
Here are the solutions I have found to this problem. The crux of the issue is that ANTLR wants to initialize all return values and attributes. For non-primitive types, ANTLR just assumes it can initialize with NULL. So, for example, the expression rule above will be translated into code that declares its Expr return value and its Kind and Expr attributes and initializes each of them with NULL.
The choices, as I see them, are these:
1. Give the values primitive types that can legally be initialized to NULL. E.g., use Expr* and Kind* instead of Expr and Kind.
2. Use the "dummy" trick, as above, to push the value into a structure where it won't be initialized.
3. Use reference parameters instead of return values.
4. Augment the classes used as value types with operations that make the above declarations and initializations legal. I.e., for an Expr return value, you need a constructor that can take NULL. For an Expr attribute, you need a no-arg constructor and an operator= that can take NULL.
I know it is pretty hacky, but I'm going with #4 for the time being. It just so happens that my Expr class has a fairly natural definition of these operations.
P.S. On the ANTLR list, the maintainer of the C backend hints that this problem may be solved in future releases.
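As an illustration of choice #4, the sketch below gives Expr the three operations described: a no-arg constructor, a constructor that accepts NULL, and an operator= that accepts NULL. It assumes NULL arrives as a long constant, so a literal 0L stands in for it here. The isNull and text members, the string constructor, and expressionRuleSketch are hypothetical and only mimic the declare-then-initialize pattern. None of this is taken from real generated code.

```cpp
#include <string>

class Expr {
public:
    Expr() : text_(), null_(true) {}       // attribute declaration: "Expr op1;"
    Expr(long) : text_(), null_(true) {}   // return value: "Expr e = NULL;"
    explicit Expr(const std::string& text) : text_(text), null_(false) {}

    // Attribute initialization: "op1 = NULL;"
    Expr& operator=(long) {
        text_.clear();
        null_ = true;
        return *this;
    }

    bool isNull() const { return null_; }
    const std::string& text() const { return text_; }

private:
    std::string text_;
    bool null_;
};

// Hypothetical imitation of the pattern assumed in a generated rule function.
Expr expressionRuleSketch(const std::string& digits) {
    const long kNull = 0L;  // stand-in for NULL (assumed to be a long constant)
    Expr e = kNull;         // legal through Expr(long)
    Expr op1;               // legal through Expr()
    op1 = kNull;            // legal through operator=(long)
    op1 = Expr(digits);     // what a parser action would assign later
    e = op1;
    return e;
}
```

The Expr(long) constructor has to stay non-explicit, since an initialization written as "Expr e = NULL;" is a copy-initialization and cannot use an explicit constructor.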