对在编码表达式解析树/求值器时使用多态性的疑问
我正在为一种玩具语言开发一个解释器,只是为了了解如何构建一个解释器。我已经为生成解析树的语法实现了一个简单的解析器。现在,评估解析树的常用方法是在解析树的节点上使用多态性,代码如下所示(在 Python 中):
class Node:
def __init__(self,left,right):
self.left=left
self.right=right
def evaluate(self):
print "throw an exception"
class AddNode(Node):
def evaluate(self):
return evaluate(self.left)+evaluate(self.right)
class SubNode(Node):
def evaluate(self):
return evaluate(self.left)-evaluate(self.right)
方法评估在 Node 的每个后代中被重写,并防御为需要。只需在根上调用evaluate() 即可找到表达式的值。
另一种方法是在节点中不具有任何评估函数,而是生成一个简单地为节点分配类型的解析树。评估代码是在一个单独的函数中完成的,该函数递归地检查每个节点并决定如何评估它。这将导致函数具有较大的“if else”结构、开关或跳转表。
几乎在我看过的每一个地方,以及我在大学里学到的东西,似乎都表明第一种方法总是更好。我真正不完全理解的是为什么第一种方法应该优于第二种方法。为什么在这种情况下多态性一定更好?带有大型“if else”表的函数不存在,但基本上相同的代码仍然存在,但移动到不同的位置并分散在许多不同的类中。
这引起了我的注意,因为我认为以后可能需要更改某些运算符的含义,甚至可能允许在运行时重新定义运算符。我还将函数调用视为与运算符相同,因此拥有可以在运行时添加的跳转表之类的东西会很好。
第一种方法可能有一些我现在还没有看到的显着优势。有人可以指出吗?
I am working on an interpreter for a toy language, just to learn about building an interpreter. I have already implemented a simple parser for the grammar that generates a parse tree. Now, the usual way of evaluating the parse tree would be to use polymorphism on the nodes of the parse tree, with code that looks something like below (in Python):
class Node:
def __init__(self,left,right):
self.left=left
self.right=right
def evaluate(self):
print "throw an exception"
class AddNode(Node):
def evaluate(self):
return evaluate(self.left)+evaluate(self.right)
class SubNode(Node):
def evaluate(self):
return evaluate(self.left)-evaluate(self.right)
The method evaluate is overridden in every descendant of Node and defend to behave as needed. The value of an expression can be found by simply calling evaluate() on the root.
Another way is to not have any evaluate function in the nodes, but to generate a parse tree that simply assigns a type to the node. The evaluation code is done in a separate function that recursively examines each node and decides how to evaluate it. This would result in a function with a large "if else" construct, switch, or jump table.
In almost every place I look, and what I have been taught back in college, seems to indicate that the first method is always better. What I don't really understand completely is why the first method should be superior to the second. Why is polymorphism necessarily better in this case? The function with the large "if else" table is not there, but basically the same code is still there but moved to a different place and scattered over a lot of different classes.
This came to my attention because I figured I might need to change the meaning of certain operators later on, perhaps even allow operators to be redefined at runtime. I am also treating function calls the same as operators, so having something like a jump table that can be added to at runtime would be nice.
There is probably some significant advantage with the first method that I'm not seeing right now. Can someone please point that out?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
这是一个有趣的问题。在我看来,即使有运算符重载,多态性仍然是一个好方法。
您使用多态类定义的是该语言支持的原始操作。运算符重载不会向该语言添加任何更原始的功能。重载运算符实际上相当于函数调用,并且应该以类似的方式实现。
在我看来,将运算符的语义保留在运算符的类中将使您的设计更加清晰。这是因为要实现一个新的原语操作,您只需要定义类,然后在解析过程中创建一个实例。另一种方法需要更改解析器和评估器。我只是有一种感觉,求值器可以成为一个 if 语句的大嵌套来处理运算符的各种特殊情况,并且这些特殊情况可以很好地隐藏在运算符类中。
我很难真正相信我在这里所说的,因为我过去使用过这两种技术,而且它们都效果很好。 ;-)
That's an interesting question. It seems to me that polymorphism is still a good way to go, even when you have operator overloading.
What you are defining with your polymorphic classes are the primitive operations that are supported in the language. Operator overloading doesn't add any more primitive capability to the language. An overloaded operator is actually equivalent to a function call and should be implemented in a similar way.
Keeping the semantics of an operator localized in the operator's class will keep your design cleaner, in my opinion. That's because to implement a new primitive operation, you just need to define the class and then create an instance during parsing. The alternative would require changing the parser and the evaluator. I just have a feeling that the evaluator could become a big nest of if statements to handle various special cases for operators, and these special cases could be hidden well in the operator classes.
I'm having a hard time really believing what I'm saying here because I've used both techiques in the past and they've both worked well. ;-)