Why are default arguments evaluated at definition time?

Posted 2024-08-09 16:28:54

I had a very difficult time understanding the root cause of a problem in an algorithm. Then, by simplifying the functions step by step, I found that the evaluation of default arguments in Python doesn't behave as I expected.

The code is as follows:

class Node(object):
    def __init__(self, children = []):
        self.children = children

The problem is that every instance of the Node class shares the same children list if the argument is not given explicitly, as in:

>>> n0 = Node()
>>> n1 = Node()
>>> id(n1.children)
Out[0]: 25000176
>>> id(n0.children)
Out[0]: 25000176

I don't understand the logic of this design decision. Why did the Python designers decide that default arguments are to be evaluated at definition time? This seems very counter-intuitive to me.

Comments (9)

森林散布 2024-08-16 16:28:54

The alternative would be quite heavyweight -- storing "default argument values" in the function object as "thunks" of code to be executed over and over again every time the function is called without a specified value for that argument -- and would make it much harder to get early binding (binding at def time), which is often what you want. For example, in Python as it exists:

def ack(m, n, _memo={}):
  key = m, n
  if key not in _memo:
    if m==0: v = n + 1
    elif n==0: v = ack(m-1, 1)
    else: v = ack(m-1, ack(m, n-1))
    _memo[key] = v
  return _memo[key]

...writing a memoized function like the above is quite an elementary task. Similarly:

for i in range(len(buttons)):
  buttons[i].onclick(lambda i=i: say('button %s', i))

...the simple i=i, relying on the early-binding (definition time) of default arg values, is a trivially simple way to get early binding. So, the current rule is simple, straightforward, and lets you do all you want in a way that's extremely easy to explain and understand: if you want late binding of an expression's value, evaluate that expression in the function body; if you want early binding, evaluate it as the default value of an arg.
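To make the contrast concrete, here is a minimal Python 3 sketch of the two binding times. The names `late` and `early` are illustrative and not from the answer above; plain lambdas stand in for the button callbacks.

```python
# Late binding: each lambda looks up i when it is called,
# by which time the loop has finished and i is 2.
late = [lambda: i for i in range(3)]

# Early binding: i=i evaluates i when each lambda is defined
# and stores that value as the lambda's default.
early = [lambda i=i: i for i in range(3)]

late_results = [f() for f in late]
early_results = [f() for f in early]
print(late_results)   # [2, 2, 2]
print(early_results)  # [0, 1, 2]
```

Without the default, every callback would report the last button.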

The alternative, forcing late binding for both situations, would not offer this flexibility, and would force you to jump through hoops (such as wrapping your function into a closure factory) every time you needed early binding, as in the above examples -- yet more heavyweight boilerplate forced on the programmer by this hypothetical design decision (beyond the "invisible" ones of generating and repeatedly evaluating thunks all over the place).

In other words, "There should be one, and preferably only one, obvious way to do it [1]": when you want late binding, there's already a perfectly obvious way to achieve it (since all of the function's code is only executed at call time, obviously everything evaluated there is late-bound); having default-arg evaluation produce early binding gives you an obvious way to achieve early binding as well (a plus!-) rather than giving TWO obvious ways to get late binding and no obvious way to get early binding (a minus!-).

[1]: "Although that way may not be obvious at first unless you're Dutch."

一腔孤↑勇 2024-08-16 16:28:54

The issue is this.

It's too expensive to evaluate a function as an initializer every time the function is called.

  • 0 is a simple literal. Evaluate it once, use it forever.

  • int is a function (like list) that would have to be evaluated each time it's required as an initializer.

The construct [] is a literal, like 0, which means "this exact object".

The problem is that some people hope that it means list, as in "evaluate this function for me, please, to get the object that is the initializer".

It would be a crushing burden to add the necessary if statement to do this evaluation all the time. It's better to take all arguments as literals and not do any additional function evaluation as part of trying to do a function evaluation.

Also, more fundamentally, it's technically impossible to implement argument defaults as function evaluations.

Consider, for a moment, the recursive horror of this kind of circularity. Let's say that instead of default values being literals, we allow them to be functions which are evaluated each time a parameter's default value is required.

[This would parallel the way collections.defaultdict works.]

def aFunc( a=another_func ):
    return a*2

def another_func( b=aFunc ):
    return b*3

What is the value of another_func()? To get the default for b, it must evaluate aFunc, which requires an eval of another_func. Oops.

做个少女永远怀春 2024-08-16 16:28:54

Of course, in your situation it is difficult to understand. But you must see that evaluating default args every time would lay a heavy runtime burden on the system.

You should also know that this problem can occur with container types -- but you can circumvent it by making the thing explicit:

def __init__(self, children = None):
    if children is None:
        children = []
    self.children = children

心欲静而疯不止 2024-08-16 16:28:54

I thought this was counterintuitive too, until I learned how Python implements default arguments.

A function's an object. At load time, Python creates the function object, evaluates the defaults in the def statement, puts them into a tuple, and adds that tuple as an attribute of the function named func_defaults. Then, when a function is called, if the call doesn't provide a value, Python grabs the default value out of func_defaults.

For instance:

>>> class C():
        pass

>>> def f(x=C()):
        pass

>>> f.func_defaults
(<__main__.C instance at 0x0298D4B8>,)

So all calls to f that don't provide an argument will use the same instance of C, because that's the default value.
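The session above is from Python 2. In Python 3 the same tuple is exposed as `__defaults__` (the `func_defaults` name no longer exists). A small sketch reusing the names `C` and `f` from the example; `stored` and `same` are illustrative names:

```python
class C:
    pass

def f(x=C()):
    return x

# The default was created once, when the def statement ran,
# and is stored on the function object.
stored = f.__defaults__[0]
print(f.__defaults__)

# Every call that omits x gets that same stored instance.
same = f() is stored and f() is f()
print(same)  # True
```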

As far as why Python does it this way: well, that tuple could contain functions that would get called every time a default argument value was needed. Apart from the immediately obvious problem of performance, you start getting into a universe of special cases, like storing literal values instead of functions for non-mutable types to avoid unnecessary function calls. And of course there are performance implications galore.

The actual behavior is really simple. And there's a trivial workaround, in the case where you want a default value to be produced by a function call at runtime:

def f(x = None):
    if x is None:
        x = g()

烟沫凡尘 2024-08-16 16:28:54

The workaround for this, discussed here (and very solid), is:

class Node(object):
    def __init__(self, children = None):
        self.children = [] if children is None else children
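A quick check, sketched here rather than taken from the linked discussion, that this version gives each instance its own list (the "leaf" value is just a placeholder):

```python
class Node(object):
    def __init__(self, children=None):
        # A new list is built on each call that omits children.
        self.children = [] if children is None else children

n0 = Node()
n1 = Node()
n0.children.append("leaf")

print(n0.children)                 # ['leaf']
print(n1.children)                 # []
print(n0.children is n1.children)  # False
```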

As for why, look for an answer from von Löwis, but it's likely because the function definition creates a code object, owing to Python's architecture, and there might not be a facility for working with reference types like this in default arguments.

云巢 2024-08-16 16:28:54

This comes from Python's emphasis on syntax and execution simplicity. A def statement occurs at a certain point during execution. When the Python interpreter reaches that point, it evaluates the code in that line, and then creates a code object from the body of the function, which will be run later, when you call the function.

It's a simple split between function declaration and function body. The declaration is executed when it is reached in the code. The body is executed at call time. Note that the declaration is executed every time it is reached, so you can create multiple functions by looping.

funcs = []
for x in range(5):
    def foo(x=x, lst=[]):
        lst.append(x)
        return lst
    funcs.append(foo)
for func in funcs:
    print("1: ", func())
    print("2: ", func())

Five separate functions have been created, with a separate list created each time the function declaration was executed. On each pass through funcs, the same function is executed twice, using the same list both times. This gives the results:

1:  [0]
2:  [0, 0]
1:  [1]
2:  [1, 1]
1:  [2]
2:  [2, 2]
1:  [3]
2:  [3, 3]
1:  [4]
2:  [4, 4]

Others have given you the workaround of using param=None and assigning a list in the body if the value is None, which is fully idiomatic Python. It's a little ugly, but the simplicity is powerful, and the workaround is not too painful.

Edited to add: For more discussion on this, see effbot's article here: http://effbot.org/zone/default-values.htm, and the language reference, here: http://docs.python.org/reference/compound_stmts.html#function

掀纱窥君容 2024-08-16 16:28:54

I'll provide a dissenting opinion, by addressing the main arguments in the other posts.

"Evaluating default arguments when the function is executed would be bad for performance."

I find this hard to believe. If default argument assignments like foo='some_string' really add an unacceptable amount of overhead, I'm sure it would be possible to identify assignments to immutable literals and precompute them.

"If you want a default assignment with a mutable object like foo = [], just use foo = None, followed by foo = foo or [] in the function body."

While this may be unproblematic in individual instances, as a design pattern it's not very elegant. It adds boilerplate code and obscures default argument values. Patterns like foo = foo or ... don't work if foo can be an object like a numpy array with undefined truth value. And in situations where None is a meaningful argument value that may be passed intentionally, it can't be used as a sentinel and this workaround becomes really ugly.
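For the case where None is itself a valid argument, one common approach (not part of this answer) is a private sentinel object compared by identity. In this sketch `_MISSING` and `describe` are hypothetical names, and it still carries the boilerplate criticised above:

```python
_MISSING = object()  # unique marker; no caller-supplied value can be identical to it

def describe(value=_MISSING):
    # Identity check, so None, 0, [] and other falsy values pass through untouched.
    if value is _MISSING:
        value = []  # fresh list on every call that omits the argument
    return value

print(describe())                # []
print(describe(None))            # None
print(describe() is describe())  # False
```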

"The current behaviour is useful for mutable default objects that should be shared across function calls."

I would be happy to see evidence to the contrary, but in my experience this use case is much less frequent than mutable objects that should be created anew every time the function is called. To me it also seems like a more advanced use case, whereas accidental default assignments with empty containers are a common gotcha for new Python programmers. Therefore, the principle of least astonishment suggests default argument values should be evaluated when the function is executed.

In addition, it seems to me that there exists an easy workaround for mutable objects that should be shared across function calls: initialise them outside the function.

So I would argue that this was a bad design decision. My guess is that it was chosen because its implementation is actually simpler and because it has a valid (albeit limited) use case. Unfortunately, I don't think this will ever change, since the core Python developers want to avoid a repeat of the amount of backwards incompatibility that Python 3 introduced.

倾听心声的旋律 2024-08-16 16:28:54

Because if they had, then someone would post a question asking why it wasn't the other way around :-p

Suppose now that they had. How would you implement the current behaviour if needed? It's easy to create new objects inside a function, but you cannot "uncreate" them (you can delete them, but it's not the same).

宣告ˉ结束 2024-08-16 16:28:54

Python function definitions are just code, like all the other code; they're not "magical" in the way that some languages are. For example, in Java you could refer "now" to something defined "later":

public static void foo() { bar(); }
public static void main(String[] args) { foo(); }
public static void bar() {}

but in Python

def foo(): bar()
foo()   # boom! "bar" has no binding yet
def bar(): pass
foo()   # ok

So, the default argument is evaluated at the moment that that line of code is evaluated!
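One way to observe this directly, as a small Python 3 sketch: use a default expression with a visible side effect. The helper `make_default` and the `calls` list are hypothetical names introduced only for the demonstration.

```python
calls = []

def make_default():
    # Record each time the default expression is evaluated.
    calls.append("evaluated")
    return []

before_def = len(calls)  # nothing has been evaluated yet

def f(x=make_default()):
    return x

# Executing the def statement above already evaluated make_default() once.
after_def = len(calls)

f()
f()
after_calls = len(calls)  # calling f does not evaluate it again

print(before_def, after_def, after_calls)  # 0 1 1
```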
