用 Python 编写解释器。 isinstance 被认为是有害的吗?
我正在将我创建的领域特定语言的解释器从 Scala 移植到 Python。在此过程中,我尝试找到一种 Pythonic 的方式来模拟我广泛使用的 Scala 的案例类功能。最后我求助于使用 isinstance,但感觉我可能错过了一些东西。
诸如 这篇 等攻击 isinstance 使用的文章让我想知道是否有更好的方法解决我的问题的方法不涉及一些基本的重写。
我已经构建了许多 Python 类,每个类代表不同类型的抽象语法树节点,例如 For、While、Break、Return、Statement 等
Scala 允许像这样处理运算符求值:
case EOp("==",EInt(l),EInt(r)) => EBool(l==r)
case EOp("==",EBool(l),EBool(r)) => EBool(l==r)
到目前为止,对于端口对于Python,我广泛使用了elif块和isinstance调用来达到相同的效果,但更加冗长和非Pythonic。有更好的办法吗?
I'm porting over the interpreter for a domain specific language I created from Scala to Python. In the process I tried to find a way that way pythonic to emulate the case class feature of Scala that I used extensively. In the end I resorted to using isinstance, but was left feeling that I was perhaps missing something.
Articles such as this one attacking the use of isinstance made me wonder whether there was a better way to solve my problem that doesn't involve some fundamental rewrite.
I've built up a number of Python classes that each represent a different type of abstract syntax tree node, such as For, While, Break, Return, Statement etc
Scala allows for the handling of operator evaluation like this:
case EOp("==",EInt(l),EInt(r)) => EBool(l==r)
case EOp("==",EBool(l),EBool(r)) => EBool(l==r)
So far for the port to Python I've made extensive use of elif blocks and isinstance calls to achieve the same effect, much more verbose and un-pythonic. Is there a better way?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(7)
总结:这是编写编译器的常见方法,在这里也很好。
在其他语言中处理此问题的一种非常常见的方法是通过“模式匹配”,这正是您所描述的。我希望这是 Scala 中
case
语句的名称。它是编写编程语言实现和工具的一种非常常见的习惯用法:编译器、解释器等。为什么它这么好?因为实现与数据完全分离(这通常很糟糕,但在编译器中通常是可取的)。那么问题是,这种编程语言实现的常见习惯用法在 Python 中是一种反模式。呃哦。正如您可能知道的那样,这更多的是一个政治问题而不是语言问题。如果其他 Python 爱好者看到这些代码,他们会尖叫;如果其他语言实现者看到它,他们会立即理解它。
这是 Python 中的反模式的原因是因为 Python 鼓励鸭子类型接口:您不应该具有基于类型的行为,而应该由对象在运行时可用的方法来定义它们。 S.如果你希望它是惯用的Python,洛特的答案可以很好地工作,但它增加的很少。
我怀疑你的设计并不是真正的鸭子类型 - 毕竟它是一个编译器,并且使用名称和静态结构定义的类非常常见。如果您愿意,您可以将对象视为具有“类型”字段,并且
isinstance
用于基于该类型进行模式匹配。附录:
模式匹配可能是人们喜欢用函数式语言编写编译器等的首要原因。
Summary: This is a common way to write compilers, and its just fine here.
A very common way to handle this in other languages is by "pattern matching", which is exactly what you've described. I expect that's the name for that
case
statement in Scala. Its a very common idiom for writing programming language implementations and tools: compilers, interpreters etc. Why is it so good? Because the implementation is completely separated from the data (which is often bad, but generally desirable in compilers).The problem then is that this common idiom for programming language implementation is an anti-pattern in Python. Uh oh. As you can probably tell, this is more a political issue than a language issue. If other Pythonistas saw the code they would scream; if other language implementers saw it, they would understand it immediately.
The reason this is an anti-pattern in Python is because Python encourages duck-typed interfaces: you shouldn't have behaviour based on type, but rather they should be defined by the methods that an object has available at run-time. S. Lott's answer works fine if you want it to be idiomatic Python, but it adds little.
I suspect that your design isn't really duck-typed - its a compiler after all, and classes defined using a name, with a static structure, are pretty common. If you prefer, you could think of your objects as having a "type" field, and
isinstance
is used to pattern-match based on that type.Addenum:
Pattern-matching is probably the number one reason that people love writing compilers etc in functional languages.
python 中有一条经验法则,如果您发现自己编写了一大块 if/elif 语句,具有类似的条件(例如一堆 isinstance(...) ),那么您可能正在以错误的方式解决问题。
更好的方法包括使用类和多态性、访问者模式、字典查找等。在您的情况下,制作一个具有不同类型重载的 Operators 类可以工作(如上所述),带有(类型,运算符)项的字典也可以。
There's a rule of thumb in python, if you find yourself writing a large block of if/elif statements, with similar conditions (a bunch of isinstance(...) for example) then you're probably solving the problem the wrong way.
Better ways involve using classes and polymorphism, visitor pattern, dict lookup, etc. In your case making an Operators class with overloads for different types could work (as noted above), so could a dict with (type, operator) items.
是的。
不用实例,只需使用多态性。更简单。
这种情况非常简单,并且不需要
isinstance
。像这样需要强制的事情又如何呢?
这仍然是一个多态性问题。
Yes.
Instead of instance, just use Polymorphism. It's simpler.
This kind of this is very simple, and never requires
isinstance
.What about something like this where there is coercion required?
This is still a polymorphism issue.
本文并未攻击
isinstance
。它攻击了针对特定类进行代码测试的想法。是的,有更好的方法。或者几个。例如,您可以将类型的处理放入函数中,然后通过查找每个类型来找到正确的函数。像这样:
如果你不想自己做这个注册表,你可以看看 Zope 组件架构,它通过接口和适配器来处理这个,它真的很酷。但这可能有点矫枉过正了。
更好的是,如果您能以某种方式避免进行任何类型的类型检查,但这可能很棘手。
The article does not attack
isinstance
. It attacks the idea of making your code test for specific classes.And yes, there is a better way. Or several. You can for example make the handling of a type into a function, and then find the correct function by looking up per type. Like this:
If you don't want to do this registry yourself, you can look at the Zope Component Architecture, which handles this through interfaces and adapters, and it really cool. But that's probably overkill.
Even better is if you can somehow avoid doing any type of type checking, but that may be tricky.
在我使用 Python 3 编写的 DSL 中,我使用了复合设计模式,因此节点在使用时都是多态的,正如 S. Lott 所建议的那样。
但是,当我首先读取输入以创建这些节点时,我确实大量使用了 isinstance 检查(针对 Python 3 提供的抽象基类,例如 collections.Iterable 等,这些基类也在 2.6 中提供)我相信),以及检查 hasattr
'__call__'
因为我的输入中允许调用。这是我发现的最简洁的方法(特别是涉及递归),而不是仅仅尝试针对输入进行操作并捕获异常,这是我想到的替代方案。当输入无效时,我自己引发自定义异常,以提供尽可能多的精确故障信息。使用 isinstance 进行此类测试比使用 type() 更通用,因为 isinstance 将捕获子类 - 如果您可以针对抽象基类进行测试,那就更好了。请参阅http://www.python.org/dev/peps/pep-3119/ 有关抽象基类的信息。
In a DSL I wrote using Python 3, I used the Composite design pattern so the nodes were all polymorphic in their use, as S. Lott is recommending.
But, when I was reading in the input to create those nodes in the first place, I did use isinstance checks a lot (against abstract base classes like collections.Iterable, etc., which Python 3 provides, and which are in 2.6 as well I believe), as well as checks for hasattr
'__call__'
since callables were allowed in my input. This was the cleanest way I found to do it (particulary with recursion involved), rather than just trying operations against input and catching exceptions, which is the alternative that comes to mind. I was raising custom exceptions myself when the input was invalid to give as much precise failure information as possible.Using isinstance for such tests is more general than using type(), since isinstance will catch subclasses - and if you can test against the abstract base classes, that is all the better. See http://www.python.org/dev/peps/pep-3119/ for info on the abstract base classes.
在这种特殊情况下,您似乎要实现的是一个运算符重载系统,该系统使用对象的类型作为您打算调用的运算符的选择机制。您的节点类型恰好与您的语言类型相当直接对应,但实际上您正在编写一个解释器。节点的类型只是一段数据。
我不知道人们是否可以将自己的类型添加到您的领域特定语言中。但无论如何我都会推荐表驱动设计。
制作一个包含 (binary_operator, type1, type2, result_type, evalfunc) 的数据表。使用 isinstance 在该表中搜索匹配项,并有一些标准来优先选择某些匹配项而不是其他匹配项。可能可以使用比表更复杂的数据结构来加快搜索速度,但现在您基本上是使用长长的 ifelse 语句列表来进行线性搜索,所以我打赌一个普通的旧表会比你现在做的稍微快一点。
我不认为 isinstance 在这里是错误的选择,主要是因为类型只是解释器用来做出决定的一段数据。双重调度和其他类似技术只会掩盖程序正在执行的操作的真正内容。
Python 中的一个巧妙之处是,由于运算符函数和类型都是第一类对象,因此您可以直接将它们填充到表(或您选择的任何数据结构)中。
In this particular case, what you seem to be implementing is an operator overloading system that uses the types of the objects as the selection mechanism for the operator you intend to call. Your node types happen to fairly directly correspond to your language's types, but in reality you're writing an interpreter. The type of the node is just a piece of data.
I don't know if people can add their own types to your domain specific language. But I would recommend a table driven design regardless.
Make a table of data containing (binary_operator, type1, type2, result_type, evalfunc). Search through that table for matches using isinstance and have some criteria for preferring some matches over others. It may be possible to use a somewhat more sophisticated data structure than a table to make searching faster, but right now you're basically using long lists of ifelse statements to do a linear search anyway, so I'm betting a plain old table will be slightly faster than what you're doing now.
I do not consider isinstance to be the wrong choice here largely because the type is just a piece of data your interpreter is working with to make a decision. Double dispatch and other techniques of that ilk are just going to obscure the real meat of what your program is doing.
One of the neat things in Python is that since operator functions and types are all first class objects, you can just stuff them in the table (or whatever data structure you choose) directly.
如果您需要参数的多态性(除了接收器之外),例如按照您的示例建议使用二元运算符处理类型转换,您可以使用以下技巧:
If you need Polymorphism on arguments (in addition to the receiver), for example to handle type conversions with binary operators as suggested by your example, you can use the following trick: