Advice on a postfix-to-infix parser

Posted 2024-10-16 03:38:32

I've come across a proprietary stack-based scripting language that looks like a somewhat simplified version of x86 ASM.

I built a stack-based linear parser for this language in C++ that I hope will produce pseudo-C code to make the language a lot easier to read.

I've encountered at least one serious issue that I feel has to do with the linear nature of my parser... for example, let's say we have the following code:

push const int
push const str
call some_method
pop const str
pop const int
return last return val

With my current implementation, I could very easily generate the following:

retval = some_method(str, int)
return retval

But the following will be a major pain:

return some_method(some_str, some_int)

When I encounter an instruction/opcode, I will be aware of -variables- pushed onto the stack, but that is about it...

What it boils down to is that I can go from postfix to infix within a single group of related instructions (pushes followed by a call, for example), but not across several such groups, such as folding the call into the later return.

I am very inexperienced when it comes to language parsers, so go easy on me! What would you suggest?


Comments (1)

仅一夜美梦 2024-10-23 03:38:32

What you want is symbolic execution. Arrange to have a C++ representation of expressions, such as

class Expression{...};
class StringConstant:public Expression{...};
class FunctionCall:public Expression{...};

Then, make your symbolic stack contain Expression*. When you then arrive at

return foo

compile this to

cout<<"return "; foo_expr->print();

where each Expression would have an appropriate (possibly recursive) print method.
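
To make the idea concrete, here is a minimal sketch of how the pieces might fit together for the example in the question. Everything beyond the Expression, StringConstant and FunctionCall names is my own assumption for illustration: the IntConstant class, the SymbolicMachine class and its push/call/pop/ret methods, and the decompile_example helper are hypothetical. I am also assuming that call reads its arguments from the top of the stack without removing them (the caller pops them afterwards, as in the listing), that the last value pushed is the first argument, and that the argument count is known from somewhere, such as a signature table.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Base class for every node of the reconstructed expression tree.
struct Expression {
    virtual ~Expression() = default;
    virtual std::string print() const = 0;  // render as pseudo-C text
};

// Hypothetical node for integer constants (not named in the answer).
struct IntConstant : Expression {
    int value;
    explicit IntConstant(int v) : value(v) {}
    std::string print() const override { return std::to_string(value); }
};

struct StringConstant : Expression {
    std::string value;
    explicit StringConstant(std::string v) : value(std::move(v)) {}
    std::string print() const override { return "\"" + value + "\""; }
};

struct FunctionCall : Expression {
    std::string name;
    std::vector<std::shared_ptr<Expression>> args;
    FunctionCall(std::string n, std::vector<std::shared_ptr<Expression>> a)
        : name(std::move(n)), args(std::move(a)) {}
    // Recursive print: each argument renders itself.
    std::string print() const override {
        std::string out = name + "(";
        for (std::size_t i = 0; i < args.size(); ++i) {
            if (i != 0) out += ", ";
            out += args[i]->print();
        }
        return out + ")";
    }
};

// Hypothetical symbolic machine: it stores expressions, not runtime values.
class SymbolicMachine {
public:
    void push(std::shared_ptr<Expression> e) { stack_.push_back(std::move(e)); }

    // Assumption: `call` peeks at the top `argc` entries (last pushed = first
    // argument) and records the call in a "last return value" slot.
    void call(const std::string& name, std::size_t argc) {
        assert(argc <= stack_.size());
        std::vector<std::shared_ptr<Expression>> args(
            stack_.rbegin(), stack_.rbegin() + static_cast<std::ptrdiff_t>(argc));
        last_return_ = std::make_shared<FunctionCall>(name, std::move(args));
    }

    void pop() {
        assert(!stack_.empty());
        stack_.pop_back();
    }

    // `return last return val`: print the whole tree inline.
    std::string ret() const {
        assert(last_return_ != nullptr);
        return "return " + last_return_->print();
    }

private:
    std::vector<std::shared_ptr<Expression>> stack_;
    std::shared_ptr<Expression> last_return_;
};

// Replays the six instructions from the question with made-up constants.
std::string decompile_example() {
    SymbolicMachine m;
    m.push(std::make_shared<IntConstant>(42));          // push const int
    m.push(std::make_shared<StringConstant>("hello"));  // push const str
    m.call("some_method", 2);                           // call some_method
    m.pop();                                            // pop const str
    m.pop();                                            // pop const int
    return m.ret();                                     // return last return val
}
```

Calling decompile_example() yields the single-line form the question asks for, because the call is kept as a tree until the return instruction decides how to print it. Nothing is emitted at the moment the call is seen, which is what removes the need for a temporary like retval.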

Edit: If there is no return statement, you need to iterate over the value stack, and perform ->print for all values that haven't otherwise been consumed.
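
One possible way to handle that case is sketched below. It is only an illustration under my own assumptions: the Literal class, the StackEntry struct with its consumed flag, and the flush_unconsumed and flush_example functions are hypothetical names, and tracking consumption with a flag is just one option (popping consumed entries off the stack would work as well).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Expression {
    virtual ~Expression() = default;
    virtual std::string print() const = 0;
};

// Hypothetical stand-in for any already-built expression tree:
// it simply carries its pseudo-C text.
struct Literal : Expression {
    std::string text;
    explicit Literal(std::string t) : text(std::move(t)) {}
    std::string print() const override { return text; }
};

// A symbolic stack slot plus a flag saying whether a later instruction
// (a call, a return, an assignment) has already used the value.
struct StackEntry {
    std::shared_ptr<Expression> expr;
    bool consumed;
};

// Emit one pseudo-C statement per value that nothing consumed,
// in the order the values were pushed.
std::vector<std::string> flush_unconsumed(const std::vector<StackEntry>& stack) {
    std::vector<std::string> statements;
    for (const StackEntry& entry : stack) {
        assert(entry.expr != nullptr);
        if (!entry.consumed) {
            statements.push_back(entry.expr->print() + ";");
        }
    }
    return statements;
}

// Made-up stack contents: two values nobody used, one that was consumed.
std::vector<std::string> flush_example() {
    std::vector<StackEntry> stack;
    stack.push_back({std::make_shared<Literal>("log_message(\"start\")"), false});
    stack.push_back({std::make_shared<Literal>("42"), true});
    stack.push_back({std::make_shared<Literal>("some_method(\"hello\", 42)"), false});
    return flush_unconsumed(stack);
}
```

Here the consumed constant is skipped, and the two remaining expressions come out as standalone statements, so a call whose result is never returned or stored still shows up in the generated pseudo-C.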
