"Least Astonishment" and the Mutable Default Argument

Posted on 2024-09-28 15:33:53

Anyone tinkering with Python long enough has been bitten (or torn to pieces) by the following issue:

def foo(a=[]):
    a.append(5)
    return a

Python novices would expect this function called with no parameter to always return a list with only one element: [5]. The result is instead very different, and very astonishing (for a novice):

>>> foo()
[5]
>>> foo()
[5, 5]
>>> foo()
[5, 5, 5]
>>> foo()
[5, 5, 5, 5]
>>> foo()

A manager of mine once had his first encounter with this feature, and called it "a dramatic design flaw" of the language. I replied that the behavior had an underlying explanation, and it is indeed very puzzling and unexpected if you don't understand the internals. However, I was not able to answer (to myself) the following question: what is the reason for binding the default argument at function definition, and not at function execution? I doubt the experienced behavior has a practical use (who really used static variables in C, without breeding bugs?)

Edit:

Baczek made an interesting example. Together with most of your comments and Utaal's in particular, I elaborated further:

def a():
    print("a executed")
    return []

           
def b(x=a()):
    x.append(5)
    print(x)

a executed
>>> b()
[5]
>>> b()
[5, 5]

To me, it seems that the design decision was relative to where to put the scope of parameters: inside the function, or "together" with it?

Doing the binding inside the function would mean that x is effectively bound to the specified default when the function is called, not defined, something that would present a deep flaw: the def line would be "hybrid" in the sense that part of the binding (of the function object) would happen at definition, and part (assignment of default parameters) at function invocation time.

The actual behavior is more consistent: everything of that line gets evaluated when that line is executed, meaning at function definition.
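A small sketch of that last point, assuming a made-up factory named make_appender (it is not part of the question): since the def line is evaluated each time it is executed, running the same def twice should give two function objects, each with its own default list.

```python
def make_appender():
    # The inner def statement runs every time make_appender() is called,
    # so the default list [] is created anew for each returned function.
    def append_five(items=[]):
        items.append(5)
        return items
    return append_five


first = make_appender()
second = make_appender()

first()
first()
print(first())   # [5, 5, 5]  (third call on the same function object)
print(second())  # [5]        (a different function object, a different default)
```

So the default is shared between calls of one function object, not between all functions produced from the same source line.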

挽清梦 2024-10-05 15:33:53

Actually, this is not a design flaw, and it is not because of internals or performance. It comes simply from the fact that functions in Python are first-class objects, and not only a piece of code.

As soon as you think of it this way, then it completely makes sense: a function is an object being evaluated on its definition; default parameters are kind of "member data" and therefore their state may change from one call to the other - exactly as in any other object.

In any case, the effbot (Fredrik Lundh) has a very nice explanation of the reasons for this behavior in Default Parameter Values in Python.
I found it very clear, and I really suggest reading it for a better knowledge of how function objects work.

想挽留 2024-10-05 15:33:53

Suppose you have the following code

fruits = ("apples", "bananas", "loganberries")

def eat(food=fruits):
    ...

When I see the declaration of eat, the least astonishing thing is to think that if the first parameter is not given, it will be equal to the tuple ("apples", "bananas", "loganberries").

However, suppose later on in the code, I do something like

def some_random_function():
    global fruits
    fruits = ("blueberries", "mangos")

then if default parameters were bound at function execution rather than function declaration, I would be astonished (in a very bad way) to discover that fruits had been changed. This would be more astonishing IMO than discovering that your foo function above was mutating the list.
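For what it's worth, here is a runnable version of the two snippets above put together. The body of eat is my own filler (the original only has "..."), so treat it as a sketch of the current behaviour rather than anything more.

```python
fruits = ("apples", "bananas", "loganberries")


def eat(food=fruits):
    # Placeholder body (an assumption): just hand back what we were given.
    return food


def some_random_function():
    global fruits
    fruits = ("blueberries", "mangos")  # rebinds the global name only


some_random_function()

print(fruits)  # ('blueberries', 'mangos')
print(eat())   # ('apples', 'bananas', 'loganberries')
```

With definition-time binding, the default keeps pointing at the original tuple even though the global name fruits now refers to something else.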

The real problem lies with mutable variables, and all languages have this problem to some extent. Here's a question: suppose in Java I have the following code:

StringBuffer s = new StringBuffer("Hello World!");
Map<StringBuffer,Integer> counts = new HashMap<StringBuffer,Integer>();
counts.put(s, 5);
s.append("!!!!");
System.out.println( counts.get(s) );  // does this work?

Now, does my map use the value of the StringBuffer key when it was placed into the map, or does it store the key by reference? Either way, someone is astonished; either the person who tried to get the object out of the Map using a value identical to the one they put it in with, or the person who can't seem to retrieve their object even though the key they're using is literally the same object that was used to put it into the map (this is actually why Python doesn't allow its mutable built-in data types to be used as dictionary keys).

Your example is a good one of a case where Python newcomers will be surprised and bitten. But I'd argue that if we "fixed" this, then that would only create a different situation where they'd be bitten instead, and that one would be even less intuitive. Moreover, this is always the case when dealing with mutable variables; you always run into cases where someone could intuitively expect one or the opposite behavior depending on what code they're writing.

I personally like Python's current approach: default function arguments are evaluated when the function is defined and that object is always the default. I suppose they could special-case using an empty list, but that kind of special casing would cause even more astonishment, not to mention be backwards incompatible.

行雁书 2024-10-05 15:33:53

The relevant part of the documentation:

Default parameter values are evaluated from left to right when the function definition is executed. This means that the expression is evaluated once, when the function is defined, and that the same “pre-computed” value is used for each call. This is especially important to understand when a default parameter is a mutable object, such as a list or a dictionary: if the function modifies the object (e.g. by appending an item to a list), the default value is in effect modified. This is generally not what was intended. A way around this is to use None as the default, and explicitly test for it in the body of the function, e.g.:

def whats_on_the_telly(penguin=None):
    if penguin is None:
        penguin = []
    penguin.append("property of the zoo")
    return penguin

沧笙踏歌 2024-10-05 15:33:53

I know nothing about the Python interpreter's inner workings (and I'm not an expert in compilers and interpreters either), so don't blame me if I propose anything nonsensical or impossible.

Given that Python objects can be mutable, I think this should be taken into account when designing the default arguments stuff.
When you instantiate a list:

a = []

you expect to get a new list referenced by a.

Why should the a=[] in

def x(a=[]):

instantiate a new list on function definition and not on invocation?
It's just like you're asking "if the user doesn't provide the argument then instantiate a new list and use it as if it was produced by the caller".
I think this is ambiguous instead:

def x(a=datetime.datetime.now()):

user, do you want a to default to the datetime corresponding to when you're defining or executing x?
In this case, as in the previous one, I'll keep the same behaviour as if the default argument "assignment" was the first instruction of the function (datetime.now() called on function invocation).
On the other hand, if the user wanted the definition-time mapping he could write:

b = datetime.datetime.now()
def x(a=b):

I know, I know: that's a closure. Alternatively Python might provide a keyword to force definition-time binding:

def x(static a=b):
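
The datetime question above can be checked against what Python does today. This is only a sketch: the function names stamp_at_definition and stamp_at_call are invented for the illustration, and the second one uses the usual None sentinel to get the call-time behaviour by hand.

```python
import datetime
import time


def stamp_at_definition(when=datetime.datetime.now()):
    # now() ran once, when this def statement was executed.
    return when


def stamp_at_call(when=None):
    # now() runs on every call that omits the argument.
    if when is None:
        when = datetime.datetime.now()
    return when


first = stamp_at_definition()
time.sleep(0.05)
second = stamp_at_definition()
print(first is second)  # True: the very same datetime object both times

a = stamp_at_call()
time.sleep(0.05)
b = stamp_at_call()
print(a < b)  # True: a new timestamp on each call
```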

随遇而安 2024-10-05 15:33:53

Well, the reason is quite simply that bindings are done when code is executed, and the function definition is executed, well... when the function is defined.

Compare this:

class BananaBunch:
    bananas = []

    def addBanana(self, banana):
        self.bananas.append(banana)

This code suffers from the exact same unexpected happenstance. bananas is a class attribute, and hence, when you add things to it, it's added to all instances of that class. The reason is exactly the same.
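
To make the comparison concrete, here is a short run of the class above next to a variant that creates the list per instance. FreshBananaBunch is a name I made up for the sketch; it is not from the answer.

```python
class BananaBunch:
    bananas = []  # evaluated once, when the class body is executed

    def addBanana(self, banana):
        self.bananas.append(banana)


class FreshBananaBunch:
    def __init__(self):
        self.bananas = []  # evaluated on every instantiation

    def addBanana(self, banana):
        self.bananas.append(banana)


first, second = BananaBunch(), BananaBunch()
first.addBanana("cavendish")
print(second.bananas)  # ['cavendish']  (both instances share the class list)

fresh_first, fresh_second = FreshBananaBunch(), FreshBananaBunch()
fresh_first.addBanana("cavendish")
print(fresh_second.bananas)  # []
```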

It's just "How It Works", and making it work differently in the function case would probably be complicated, and in the class case likely impossible, or at least slow down object instantiation a lot, as you would have to keep the class code around and execute it when objects are created.

Yes, it is unexpected. But once the penny drops, it fits in perfectly with how Python works in general. In fact, it's a good teaching aid, and once you understand why this happens, you'll grok python much better.

That said it should feature prominently in any good Python tutorial. Because as you mention, everyone runs into this problem sooner or later.

葬シ愛 2024-10-05 15:33:53

Why don't you introspect?

I'm really surprised no one has performed the insightful introspection offered by Python (2 and 3 apply) on callables.

Given a simple little function func defined as:

>>> def func(a = []):
...    a.append(5)

When Python encounters it, the first thing it will do is compile it in order to create a code object for this function. Then, when the def statement itself is executed, Python evaluates* and stores the default arguments (an empty list [] here) in the function object itself. As the top answer mentioned: the list a can now be considered a member of the function func.

So, let's do some introspection, a before and after to examine how the list gets expanded inside the function object. I'm using Python 3.x for this, for Python 2 the same applies (use __defaults__ or func_defaults in Python 2; yes, two names for the same thing).

Function Before Execution:

>>> def func(a = []):
...     a.append(5)
...     

After Python executes this definition it will take any default parameters specified (a = [] here) and cram them in the __defaults__ attribute for the function object (relevant section: Callables):

>>> func.__defaults__
([],)

O.k, so an empty list as the single entry in __defaults__, just as expected.

Function After Execution:

Let's now execute this function:

>>> func()

Now, let's see those __defaults__ again:

>>> func.__defaults__
([5],)

Astonished? The value inside the object changes! Consecutive calls to the function will now simply append to that embedded list object:

>>> func(); func(); func()
>>> func.__defaults__
([5, 5, 5, 5],)

So, there you have it, the reason why this 'flaw' happens, is because default arguments are part of the function object. There's nothing weird going on here, it's all just a bit surprising.

The common solution to combat this is to use None as the default and then initialize in the function body:

def func(a = None):
    # or: a = [] if a is None else a
    if a is None:
        a = []

Since the function body is executed anew each time, you always get a fresh new empty list if no argument was passed for a.


To further verify that the list in __defaults__ is the same as that used in the function func you can just change your function to return the id of the list a used inside the function body. Then, compare it to the list in __defaults__ (position [0] in __defaults__) and you'll see how these are indeed referring to the same list instance:

>>> def func(a = []): 
...     a.append(5)
...     return id(a)
>>>
>>> id(func.__defaults__[0]) == func()
True

All with the power of introspection!


* To verify that Python evaluates the default arguments when the function definition is executed (and not when the function is called), try executing the following:

def bar(a=input('Did you just see me without calling the function?')): 
    pass  # use raw_input in Py2

As you'll notice, input() is called before the function has been built and bound to the name bar.

情绪少女 2024-10-05 15:33:53

I used to think that creating the objects at runtime would be the better approach. I'm less certain now, since you do lose some useful features, though it may be worth it regardless simply to prevent newbie confusion. The disadvantages of doing so are:

1. Performance

def foo(arg=something_expensive_to_compute()):
    ...

If call-time evaluation is used, then the expensive function is called every time your function is used without an argument. You'd either pay an expensive price on each call, or need to manually cache the value externally, polluting your namespace and adding verbosity.

2. Forcing bound parameters

A useful trick is to bind parameters of a lambda to the current binding of a variable when the lambda is created. For example:

funcs = [ lambda i=i: i for i in range(10)]

This returns a list of functions that return 0,1,2,3... respectively. If the behaviour is changed, they will instead bind i to the call-time value of i, so you would get a list of functions that all returned 9.

The only way to implement this otherwise would be to create a further closure with the i bound, ie:

def make_func(i): return lambda: i
funcs = [make_func(i) for i in range(10)]
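
A quick side-by-side of the two lambda forms, run under Python 3. The names late and bound are just labels for this sketch.

```python
# Without the default, each lambda looks up i when it is called,
# by which time the comprehension has finished and i is 9.
late = [lambda: i for i in range(10)]

# With i=i, the current value of i is evaluated and stored as the
# default when each lambda is created.
bound = [lambda i=i: i for i in range(10)]

print([f() for f in late])   # [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
print([f() for f in bound])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The first list is what every default-argument lambda would look like if defaults were evaluated at call time.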

3. Introspection

Consider the code:

def foo(a='test', b=100, c=[]):
   print a,b,c

We can get information about the arguments and defaults using the inspect module:

>>> inspect.getargspec(foo)
(['a', 'b', 'c'], None, None, ('test', 100, []))

This information is very useful for things like document generation, metaprogramming, decorators etc.
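
The getargspec call above is the Python 2 era spelling; as far as I know it was removed in Python 3.11. A rough Python 3 equivalent, assuming inspect.signature is available (it has been since 3.3), could look like this. The body of foo is changed to return its arguments only so the sketch runs on Python 3.

```python
import inspect


def foo(a='test', b=100, c=[]):
    return a, b, c


signature = inspect.signature(foo)

# Map each parameter name to the default object stored on the function.
defaults = {name: param.default for name, param in signature.parameters.items()}

print(defaults)          # {'a': 'test', 'b': 100, 'c': []}
print(foo.__defaults__)  # ('test', 100, [])
```

Either way, the default objects can be inspected without calling foo, which is the point being made here.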

Now, suppose the behaviour of defaults could be changed so that this is the equivalent of:

_undefined = object()  # sentinel value

def foo(a=_undefined, b=_undefined, c=_undefined):
    if a is _undefined: a='test'
    if b is _undefined: b=100
    if c is _undefined: c=[]

However, we've lost the ability to introspect, and see what the default arguments are. Because the objects haven't been constructed, we can't ever get hold of them without actually calling the function. The best we could do is to store off the source code and return that as a string.

淡莣 2024-10-05 15:33:53

5 points in defense of Python

  1. Simplicity: The behavior is simple in the following sense:
    Most people fall into this trap only once, not several times.

  2. Consistency: Python always passes objects, not names.
    The default parameter is, obviously, part of the function
    heading (not the function body). It therefore ought to be evaluated
    at module load time (and only at module load time, unless nested), not
    at function call time.

  3. Usefulness: As Fredrik Lundh points out in his explanation
    of "Default Parameter Values in Python", the
    current behavior can be quite useful for advanced programming.
    (Use sparingly.)

  4. Sufficient documentation: In the most basic Python documentation,
    the tutorial, the issue is loudly announced as
    an "Important warning" in the first subsection of Section
    "More on Defining Functions".
    The warning even uses boldface,
    which is rarely applied outside of headings.
    RTFM: Read the fine manual.

  5. Meta-learning: Falling into the trap is actually a very
    helpful moment (at least if you are a reflective learner),
    because you will subsequently better understand the point
    "Consistency" above and that will
    teach you a great deal about Python.

你曾走过我的故事 2024-10-05 15:33:53

This behavior is easily explained by:

  1. a function (class etc.) declaration is executed only once, creating all default value objects
  2. everything is passed by reference

So:

def x(a=0, b=[], c=[], d=0):
    a = a + 1
    b = b + [1]
    c.append(1)
    print a, b, c

  1. a doesn't change - every assignment creates a new int object - the new object is printed
  2. b doesn't change - a new list is built from the default value and printed
  3. c changes - the operation is performed on the same object - and it is printed
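
A Python 3 sketch of the same function, with the unused d parameter dropped and a return in place of the print statement (both are my changes, made only so the result can be shown):

```python
def x(a=0, b=[], c=[]):
    a = a + 1    # rebinds the local name a to a new int
    b = b + [1]  # builds a new list; the default list is untouched
    c.append(1)  # mutates the default list in place
    return a, b, c


print(x())  # (1, [1], [1])
print(x())  # (1, [1], [1, 1])
print(x.__defaults__)  # (0, [], [1, 1])
```

Only the third default has changed after two calls, which matches the three points above.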

韬韬不绝 2024-10-05 15:33:53

1) The so-called problem of the "Mutable Default Argument" is in general a special example demonstrating that:
"All functions with this problem also suffer from a similar side effect problem on the actual parameter."
That is against the rules of functional programming and usually undesirable, and both problems should be fixed together.

Example:

def foo(a=[]):                 # the same problematic function
    a.append(5)
    return a

>>> somevar = [1, 2]           # an example without a default parameter
>>> foo(somevar)
[1, 2, 5]
>>> somevar
[1, 2, 5]                      # usually expected [1, 2]

Solution: a copy
An absolutely safe solution is to copy or deepcopy the input object first and then to do whatever with the copy.

def foo(a=[]):
    a = a[:]     # a copy
    a.append(5)
    return a     # or everything safe by one line: "return a + [5]"

Many builtin mutable types have a copy method, like some_dict.copy() or some_set.copy(), or can be copied easily, like somelist[:] or list(some_list). Every object can also be copied by copy.copy(any_object), or more thoroughly by copy.deepcopy() (the latter is useful if the mutable object is composed of mutable objects). Some objects are fundamentally based on side effects, like the "file" object, and cannot be meaningfully reproduced by a copy.
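
A small sketch of why the choice between copy.copy and copy.deepcopy matters when the mutable object contains other mutable objects. The variable names are placeholders for the illustration.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = copy.copy(nested)   # new outer list, same inner lists
deep = copy.deepcopy(nested)  # new outer list and new inner lists

shallow[0].append(99)  # also visible through nested
deep[1].append(-1)     # not visible through nested

print(nested)  # [[1, 2, 99], [3, 4]]
print(deep)    # [[1, 2], [3, 4, -1]]
```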

Example problem for a similar SO question

class Test(object):            # the original problematic class
  def __init__(self, var1=[]):
    self._var1 = var1

somevar = [1, 2]               # an example without a default parameter
t1 = Test(somevar)
t2 = Test(somevar)
t1._var1.append([1])
print somevar                  # [1, 2, [1]] but usually expected [1, 2]
print t2._var1                 # [1, 2, [1]] but usually expected [1, 2]

Nor should the input object be saved in any public attribute of an instance returned by this function. (This assumes that, by convention, private attributes of an instance should not be modified from outside the class or its subclasses; i.e. _var1 is a private attribute.)

Conclusion:
Input parameter objects shouldn't be modified in place (mutated), nor should they be bound into an object returned by the function. (This applies if we prefer programming without side effects, which is strongly recommended; see the Wikipedia article on "side effect", whose first two paragraphs are relevant in this context.)

2)
Only if the side effect on the actual parameter is required but is unwanted on the default parameter is the useful solution def ...(var1=None): if var1 is None: var1 = []

3) In some cases the mutable behavior of default parameters is useful.

请远离我 2024-10-05 15:33:53

What you're asking is why this:

def func(a=[], b = 2):
    pass

isn't internally equivalent to this:

def func(a=None, b = None):
    a_default = lambda: []
    b_default = lambda: 2
    def actual_func(a=None, b=None):
        if a is None: a = a_default()
        if b is None: b = b_default()
    return actual_func
func = func()

except for the case of explicitly calling func(None, None), which we'll ignore.

In other words, instead of evaluating default parameters, why not store each of them, and evaluate them when the function is called?

One answer is probably right there--it would effectively turn every function with default parameters into a closure. Even if it's all hidden away in the interpreter and not a full-blown closure, the data's got to be stored somewhere. It'd be slower and use more memory.

绅士风度i 2024-10-05 15:33:53

This actually has nothing to do with default values, other than that it often comes up as an unexpected behaviour when you write functions with mutable default values.

>>> def foo(a):
    a.append(5)
    print a

>>> a  = [5]
>>> foo(a)
[5, 5]
>>> foo(a)
[5, 5, 5]
>>> foo(a)
[5, 5, 5, 5]
>>> foo(a)
[5, 5, 5, 5, 5]

No default values in sight in this code, but you get exactly the same problem.

The problem is that foo is modifying a mutable variable passed in from the caller, when the caller doesn't expect this. Code like this would be fine if the function was called something like append_5; then the caller would be calling the function in order to modify the value they pass in, and the behaviour would be expected. But such a function would be very unlikely to take a default argument, and probably wouldn't return the list (since the caller already has a reference to that list; the one it just passed in).

Your original foo, with a default argument, shouldn't be modifying a whether it was explicitly passed in or got the default value. Your code should leave mutable arguments alone unless it is clear from the context/name/documentation that the arguments are supposed to be modified. Using mutable values passed in as arguments as local temporaries is an extremely bad idea, whether we're in Python or not and whether there are default arguments involved or not.

If you need to destructively manipulate a local temporary in the course of computing something, and you need to start your manipulation from an argument value, you need to make a copy.

我做我的改变 2024-10-05 15:33:53

Python: The Mutable Default Argument

Default arguments are evaluated when the def statement runs and the function object is created; for a module-level function that happens once, when the module is first executed. Every call that relies on the default then receives that same object in memory, so if the object is of a mutable type and gets mutated, it stays mutated on subsequent calls.

They are mutated and stay mutated because they are the same object each time the function is called.

Equivalent code:

Since the list is bound to the function when the function object is compiled and instantiated, this:

def foo(mutable_default_argument=[]): # make a list the default argument
    """function that uses a list"""

is almost exactly equivalent to this:

_a_list = [] # create a list in the globals

def foo(mutable_default_argument=_a_list): # make it the default argument
    """function that uses a list"""

del _a_list # remove globals name binding

Demonstration

Here's a demonstration - you can verify that they are the same object each time they are referenced by

  • seeing that the list is created before the function has finished compiling to a function object,
  • observing that the id is the same each time the list is referenced,
  • observing that the list stays changed when the function that uses it is called a second time,
  • observing the order in which the output is printed from the source (which I conveniently numbered for you):

example.py

print('1. Global scope being evaluated')

def create_list():
    '''noisily create a list for usage as a kwarg'''
    l = []
    print('3. list being created and returned, id: ' + str(id(l)))
    return l

print('2. example_function about to be compiled to an object')

def example_function(default_kwarg1=create_list()):
    print('appending "a" in default default_kwarg1')
    default_kwarg1.append("a")
    print('list with id: ' + str(id(default_kwarg1)) + 
          ' - is now: ' + repr(default_kwarg1))

print('4. example_function compiled: ' + repr(example_function))


if __name__ == '__main__':
    print('5. calling example_function twice!:')
    example_function()
    example_function()

and running it with python example.py:

1. Global scope being evaluated
2. example_function about to be compiled to an object
3. list being created and returned, id: 140502758808032
4. example_function compiled: <function example_function at 0x7fc9590905f0>
5. calling example_function twice!:
appending "a" in default default_kwarg1
list with id: 140502758808032 - is now: ['a']
appending "a" in default default_kwarg1
list with id: 140502758808032 - is now: ['a', 'a']

Does this violate the principle of "Least Astonishment"?

This order of execution is frequently confusing to new users of Python. If you understand the Python execution model, then it becomes quite expected.

The usual instruction to new Python users:

But this is why the usual instruction to new users is to create their default arguments like this instead:

def example_function_2(default_kwarg=None):
    if default_kwarg is None:
        default_kwarg = []

This uses the None singleton as a sentinel object to tell the function whether or not we've gotten an argument other than the default. If we get no argument, then we actually want to use a new empty list, [], as the default.

As the tutorial section on control flow says:

If you don’t want the default to be shared between subsequent calls,
you can write the function like this instead:

def f(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L
抚你发端 2024-10-05 15:33:53

This is already a busy topic, but from what I read here, the following helped me realize how it works internally:

def bar(a=[]):
    print(id(a))
    a = a + [1]
    print(id(a))
    return a

>>> bar()
4484370232
4484524224
[1]
>>> bar()
4484370232
4484524152
[1]
>>> bar()
4484370232 # Never changes; this is a 'class property' of the function
4484523720 # Always a new object
[1]
>>> id(bar.__defaults__[0])  # bar.func_defaults[0] in Python 2
4484370232
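
As a small sketch of why the second id differs on every call: a = a + [1] builds a new list and rebinds the local name, whereas an in-place operation such as a += [1] would mutate the stored default. The function names rebinds and mutates are made up for this comparison and are not part of the answer above.

```python
def rebinds(a=[]):
    # a + [1] creates a new list; the default list is never modified
    a = a + [1]
    return a


def mutates(a=[]):
    # += on a list extends it in place, so the shared default grows
    a += [1]
    return a


# list(...) takes a snapshot, because mutates() returns the default itself
rebind_results = [rebinds(), rebinds(), rebinds()]
mutate_results = [list(mutates()), list(mutates()), list(mutates())]

print(rebind_results)   # [[1], [1], [1]]
print(mutate_results)   # [[1], [1, 1], [1, 1, 1]]
print(rebinds.__defaults__, mutates.__defaults__)  # ([],) ([1, 1, 1],)
```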
失眠症患者 2024-10-05 15:33:53

The shortest answer would probably be "definition is execution", therefore the whole argument makes no strict sense. As a more contrived example, you may cite this:

def a(): return []

def b(x=a()):
    print(x)

Hopefully it's enough to show that not executing the default argument expressions at the execution time of the def statement isn't easy or doesn't make sense, or both.

I agree it's a gotcha when you try to use default constructors, though.
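
One way to see "definition is execution" in action, as a rough sketch: when a def statement runs more than once, its default expression is evaluated each time. The names make_collector and collect are invented for the illustration.

```python
def make_collector():
    # Each call to make_collector executes the inner def statement again,
    # so the default expression [] is evaluated again as well.
    def collect(item, bucket=[]):
        bucket.append(item)
        return bucket
    return collect


first = make_collector()
second = make_collector()

first("x")
first_result = first("y")
second_result = second("z")

print(first_result)   # ['x', 'y']
print(second_result)  # ['z']
```

Each function object carries its own default list, which is consistent with the default belonging to the def statement's execution rather than to the source text.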

残疾 2024-10-05 15:33:53

It's a performance optimization. As a result of this functionality, which of these two function calls do you think is faster?

def print_tuple(some_tuple=(1,2,3)):
    print(some_tuple)

print_tuple()        #1
print_tuple((1,2,3)) #2

I'll give you a hint. Here's the disassembly (see http://docs.python.org/library/dis.html):

#1

0 LOAD_GLOBAL              0 (print_tuple)
3 CALL_FUNCTION            0
6 POP_TOP
7 LOAD_CONST               0 (None)
10 RETURN_VALUE

#2

 0 LOAD_GLOBAL              0 (print_tuple)
 3 LOAD_CONST               4 ((1, 2, 3))
 6 CALL_FUNCTION            1
 9 POP_TOP
10 LOAD_CONST               0 (None)
13 RETURN_VALUE

"I doubt the experienced behavior has a practical use (who really used static variables in C, without breeding bugs?)"

As you can see, there is a performance benefit when using immutable default arguments. This can make a difference if it's a frequently called function or the default argument takes a long time to construct. Also, bear in mind that Python isn't C. In C you have constants that are pretty much free. In Python you don't have this benefit.
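
To make the "takes a long time to construct" case concrete, here is a minimal sketch that counts how often a default is built. build_table is a hypothetical stand-in for an expensive constructor; it is not taken from the answer.

```python
build_calls = 0


def build_table():
    """Stand-in for an expensive constructor (hypothetical)."""
    global build_calls
    build_calls += 1
    return tuple(n * n for n in range(1000))


def lookup(index, table=build_table()):
    # table was built once, when this def statement executed
    return table[index]


results = [lookup(i) for i in (2, 3, 4)]
print(results)      # [4, 9, 16]
print(build_calls)  # 1
```

With call-time evaluation the constructor would run on every call that omits table, so the count would match the number of calls instead.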

め可乐爱微笑 2024-10-05 15:33:53

This behavior is not surprising if you take the following into consideration:

  1. The behavior of read-only class attributes upon assignment attempts, and that
  2. Functions are objects (explained well in the accepted answer).

The role of (2) has been covered extensively in this thread. (1) is likely the astonishment causing factor, as this behavior is not "intuitive" when coming from other languages.

(1) is described in the Python tutorial on classes. In an attempt to assign a value to a read-only class attribute:

...all variables found outside of the innermost scope are
read-only (an attempt to write to such a variable will simply create a
new local variable in the innermost scope, leaving the identically
named outer variable unchanged).

Look back to the original example and consider the above points:

def foo(a=[]):
    a.append(5)
    return a

Here foo is an object and a is an attribute of foo (available at foo.__defaults__[0], or foo.func_defaults[0] in Python 2). Since a is a list, a is mutable and is thus a read-write attribute of foo. It is initialized to the empty list as specified by the signature when the function is instantiated, and is available for reading and writing as long as the function object exists.

Calling foo without overriding a default uses that default's value from foo.__defaults__. In this case, foo.__defaults__[0] is used for a within the function object's code scope. Changes to a change foo.__defaults__[0], which is part of the foo object and persists between executions of the code in foo.

Now, compare this to the example from the documentation on emulating the default argument behavior of other languages, such that the function signature defaults are used every time the function is executed:

def foo(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L

Taking (1) and (2) into account, one can see why this accomplishes the desired behavior:

  • When the foo function object is instantiated, foo.__defaults__[0] is set to None, an immutable object.
  • When the function is executed with defaults (with no parameter specified for L in the function call), foo.__defaults__[0] (None) is available in the local scope as L.
  • Upon L = [], the assignment does not reach foo.__defaults__[0]: assigning to a name only rebinds that name, so the stored default is effectively read-only from inside the function.
  • Per (1), a new local variable also named L is created in the local scope and used for the remainder of the function call. foo.__defaults__[0] thus remains unchanged for future invocations of foo.
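
The steps above can be checked with a short sketch. It inspects __defaults__, the attribute where Python 3 stores a function's default values (func_defaults in Python 2), before and after calling the documentation's example.

```python
def foo(a, L=None):
    if L is None:
        L = []          # rebinds the local name L only
    L.append(a)
    return L


defaults_before = foo.__defaults__
first = foo(1)
second = foo(2)
defaults_after = foo.__defaults__

print(first, second)     # [1] [2]
print(defaults_before)   # (None,)
print(defaults_after)    # (None,)
```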
抱猫软卧 2024-10-05 15:33:53

It may be true that:

  1. Someone is using every language/library feature, and
  2. Switching the behavior here would be ill-advised, but

it is entirely consistent to hold both of the points above and still make another point:

  3. It is a confusing feature, and it is unfortunate in Python.

The other answers, or at least some of them, either make points 1 and 2 but not 3, or make point 3 and downplay points 1 and 2. But all three are true.

It may be true that switching horses in midstream here would be asking for significant breakage, and that there could be more problems created by changing Python to intuitively handle Stefano's opening snippet. And it may be true that someone who knew Python internals well could explain a minefield of consequences. However,

The existing behavior is not Pythonic, and Python is successful because very little about the language violates the principle of least astonishment anywhere near this badly. It is a real problem, whether or not it would be wise to uproot it. It is a design flaw. If you understand the language much better by trying to trace out the behavior, I can say that C++ does all of this and more; you learn a lot by navigating, for instance, subtle pointer errors. But this is not Pythonic: people who care about Python enough to persevere in the face of this behavior are people who are drawn to the language because Python has far fewer surprises than other languages. Dabblers and the curious become Pythonistas when they are astonished at how little time it takes to get something working--not because of a design fl--I mean, hidden logic puzzle--that cuts against the intuitions of programmers who are drawn to Python because it Just Works.

往日情怀 2024-10-05 15:33:53

A simple workaround using None

>>> def bar(b, data=None):
...     data = data or []
...     data.append(b)
...     return data
... 
>>> bar(3)
[3]
>>> bar(3)
[3]
>>> bar(3)
[3]
>>> bar(3, [34])
[34, 3]
>>> bar(3, [34])
[34, 3]
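
One caveat worth sketching: data or [] replaces any falsy argument, so an empty list passed by the caller is swapped out as well. The comparison below uses two made-up function names, bar_or and bar_is_none, to contrast it with an explicit None check.

```python
def bar_or(b, data=None):
    data = data or []        # any falsy argument, including [], is replaced
    data.append(b)
    return data


def bar_is_none(b, data=None):
    if data is None:         # only a missing argument triggers the default
        data = []
    data.append(b)
    return data


shared_or = []
bar_or(3, shared_or)
print(shared_or)        # [] -- the caller's empty list was silently swapped out

shared_is_none = []
bar_is_none(3, shared_is_none)
print(shared_is_none)   # [3]
```

Whether that difference matters depends on whether callers expect their own list to be updated in place.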
断舍离 2024-10-05 15:33:53

I am going to demonstrate an alternative structure to pass a default list value to a function (it works equally well with dictionaries).

As others have extensively commented, the list parameter is bound to the function when it is defined as opposed to when it is executed. Because lists and dictionaries are mutable, any alteration to this parameter will affect other calls to this function. As a result, subsequent calls to the function will receive this shared list, which may have been altered by any other call to the function. Worse yet, two variables may be using the function's shared default at the same time, each oblivious to the changes made through the other.

Wrong Method (probably...):

def foo(list_arg=[5]):
    return list_arg

a = foo()
a.append(6)
>>> a
[5, 6]

b = foo()
b.append(7)
# The value of 6 appended to variable 'a' is now part of the list held by 'b'.
>>> b
[5, 6, 7]  

# Although 'a' is expecting to receive 6 (the last element it appended to the list),
# it actually receives the last element appended to the shared list.
# It thus receives the value 7 previously appended by 'b'.
>>> a.pop()             
7

You can verify that they are one and the same object by using id:

>>> id(a)
5347866528

>>> id(b)
5347866528

Per Brett Slatkin's "Effective Python: 59 Specific Ways to Write Better Python", Item 20: Use None and Docstrings to specify dynamic default arguments (p. 48)

The convention for achieving the desired result in Python is to
provide a default value of None and to document the actual behaviour
in the docstring.

This implementation ensures that each call to the function either receives the default list or else the list passed to the function.

Preferred Method:

def foo(list_arg=None):
   """
   :param list_arg:  A list of input values. 
                     If none provided, use a list with a default value of 5.
   """
   if list_arg is None:
       list_arg = [5]
   return list_arg

a = foo()
a.append(6)
>>> a
[5, 6]

b = foo()
b.append(7)
>>> b
[5, 7]

c = foo([10])
c.append(11)
>>> c
[10, 11]

There may be legitimate use cases for the 'Wrong Method' whereby the programmer intended the default list parameter to be shared, but this is more likely the exception than the rule.

盗心人 2024-10-05 15:33:53

The solutions here are:

  1. Use None as your default value (or a nonce object), and switch on that to create your values at runtime; or
  2. Use a lambda as your default parameter, and call it within a try block to get the default value (this is the sort of thing that lambda abstraction is for).

The second option is nice because users of the function can pass in a callable, which may already exist (such as a type).
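
A rough sketch of both options, under some assumptions: _MISSING, append_with_sentinel and append_with_factory are names invented here, and the factory version simply calls the callable on every invocation instead of wrapping the call in a try block as the answer suggests.

```python
_MISSING = object()  # hypothetical sentinel; unlike None, callers cannot pass it by accident


def append_with_sentinel(item, target=_MISSING):
    if target is _MISSING:
        target = []          # a fresh list for every call that omits target
    target.append(item)
    return target


def append_with_factory(item, make_target=list):
    # make_target is any zero-argument callable; it is called on every invocation
    target = make_target()
    target.append(item)
    return target


sentinel_results = [append_with_sentinel(1), append_with_sentinel(2)]
factory_results = [append_with_factory(1), append_with_factory(2, lambda: [0])]

print(sentinel_results)  # [[1], [2]]
print(factory_results)   # [[1], [0, 2]]
```

The sentinel form is useful when None is itself a meaningful argument; the factory form lets callers pass an existing type such as list or dict.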

甩你一脸翔 2024-10-05 15:33:53

Yes, this is a design flaw in Python

I've read all the other answers and I'm not convinced. This design does violate the principle of least astonishment.

The defaults could have been designed to be evaluated when the function is called, rather than when the function is defined. This is how JavaScript does it:

function foo(a=[]) {
  a.push(5);
  return a;
}
console.log(foo()); // [5]
console.log(foo()); // [5]
console.log(foo()); // [5]

As further evidence that this is a design flaw, Python core developers are currently discussing introducing new syntax to fix this problem. See this article: Late-bound argument defaults for Python.

For even more evidence that this is a design flaw: if you Google "Python gotchas", this design is mentioned as a gotcha, usually the first in the list, in the first 9 Google results. In contrast, if you Google "JavaScript gotchas", the behaviour of default arguments in JavaScript is not mentioned as a gotcha even once.

Gotchas, by definition, violate the principle of least astonishment. They astonish. Given that there are superior designs for the behaviour of default argument values, the inescapable conclusion is that Python's behaviour here represents a design flaw.

I say this as someone who loves Python. We can be fans of Python, and still admit that everyone who is unpleasantly surprised by this aspect of Python is unpleasantly surprised because it is a genuine "gotcha".

云归处 2024-10-05 15:33:53

You can get round this by replacing the object (and therefore the tie with the scope):

def foo(a=[]):
    a = list(a)
    a.append(5)
    return a

Ugly, but it works.

浅笑依然 2024-10-05 15:33:53

When we do this:

def foo(a=[]):
    ...

... we assign the argument a to an unnamed list, if the caller does not pass the value of a.

To make things simpler for this discussion, let's temporarily give the unnamed list a name. How about pavlo?

def foo(a=pavlo):
   ...

At any time, if the caller doesn't tell us what a is, we reuse pavlo.

If pavlo is mutable (modifiable), and foo ends up modifying it, we notice the effect the next time foo is called without specifying a.

So this is what you see (Remember, pavlo is initialized to []):

 >>> foo()
 [5]

Now, pavlo is [5].

Calling foo() again modifies pavlo again:

>>> foo()
[5, 5]

Specifying a when calling foo() ensures pavlo is not touched.

>>> ivan = [1, 2, 3, 4]
>>> foo(a=ivan)
[1, 2, 3, 4, 5]
>>> ivan
[1, 2, 3, 4, 5]

So, pavlo is still [5, 5].

>>> foo()
[5, 5, 5]
夏天碎花小短裙 2024-10-05 15:33:53

I sometimes exploit this behavior as an alternative to the following pattern:

singleton = None

def use_singleton():
    global singleton

    if singleton is None:
        singleton = _make_singleton()

    return singleton.use_me()

If singleton is only used by use_singleton, I like the following pattern as a replacement:

# _make_singleton() is called only once when the def is executed
def use_singleton(singleton=_make_singleton()):
    return singleton.use_me()

I've used this for instantiating client classes that access external resources, and also for creating dicts or lists for memoization.

Since I don't think this pattern is well known, I do put a short comment in to guard against future misunderstandings.
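
For the memoization use mentioned above, a minimal sketch could look like this. fib and its _cache parameter are illustrative names, not code from the answer.

```python
def fib(n, _cache={0: 0, 1: 1}):
    # _cache is created once, when the def statement runs, and is
    # deliberately shared by every call that does not pass it explicitly
    if n not in _cache:
        _cache[n] = fib(n - 1) + fib(n - 2)
    return _cache[n]


fib_30 = fib(30)
cache_size = len(fib.__defaults__[0])

print(fib_30)      # 832040
print(cache_size)  # 31 entries: 0 through 30
```

The leading underscore is only a convention signalling that callers are not expected to pass the cache themselves.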

枯寂 2024-10-05 15:33:53

Every other answer explains why this is actually a nice and desired behavior, or why you shouldn't be needing this anyway. Mine is for those stubborn ones who want to exercise their right to bend the language to their will, not the other way around.

We will "fix" this behavior with a decorator that will copy the default value instead of reusing the same instance for each positional argument left at its default value.

from copy import deepcopy  # copy would fail on deep arguments like nested dicts

def sanify(function):
    def wrapper(*a, **kw):
        # read the default values stored on the function object
        # (inspect.getargspec was removed in Python 3.11)
        defaults = function.__defaults__ or ()
        # construct a new argument list
        new_args = []
        for i, arg in enumerate(defaults):
            # allow passing positional arguments
            if i in range(len(a)):
                new_args.append(a[i])
            else:
                # copy the value
                new_args.append(deepcopy(arg))
        return function(*new_args, **kw)
    return wrapper

Now let's redefine our function using this decorator:

@sanify
def foo(a=[]):
    a.append(5)
    return a

foo() # '[5]'
foo() # '[5]' -- as desired

This is particularly neat for functions that take multiple arguments. Compare:

# the 'correct' approach
def bar(a=None, b=None, c=None):
    if a is None:
        a = []
    if b is None:
        b = []
    if c is None:
        c = []
    # finally do the actual work

with

# the nasty decorator hack
@sanify
def bar(a=[], b=[], c=[]):
    # wow, works right out of the box!
    pass

It's important to note that the above solution breaks if you try to use keyword args, like so:

foo(a=[4])

The decorator could be adjusted to allow for that, but we leave this as an exercise for the reader ;)
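
For readers who would rather not do the exercise, here is one possible sketch for Python 3. It is not the author's solution: fresh_defaults is a hypothetical name, and it relies on inspect.signature and Signature.bind to work out which parameters the caller left at their defaults.

```python
import functools
import inspect
from copy import deepcopy


def fresh_defaults(function):
    """Hypothetical variant of sanify: deep-copy unused defaults on every call."""
    signature = inspect.signature(function)

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        # bind() maps positional and keyword arguments to parameter names
        bound = signature.bind(*args, **kwargs)
        for name, parameter in signature.parameters.items():
            has_default = parameter.default is not inspect.Parameter.empty
            if name not in bound.arguments and has_default:
                # the caller omitted this argument: hand over a private copy
                bound.arguments[name] = deepcopy(parameter.default)
        return function(*bound.args, **bound.kwargs)
    return wrapper


@fresh_defaults
def foo(a=[]):
    a.append(5)
    return a


print(foo())        # [5]
print(foo())        # [5]
print(foo(a=[4]))   # [4, 5]
```

The deep copy on every call has a cost, so this trades some speed for the call-time semantics.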

情愿 2024-10-05 15:33:53

This "bug" gave me a lot of overtime work hours! But I'm beginning to see a potential use of it (but I would have liked it to be at the execution time, still)

I'm gonna give you what I see as a useful example.

def example(errors=[]):
    # statements
    # Something went wrong
    mistake = True
    if mistake:
        tryToFixIt(errors)
        # Didn't work.. let's try again
        tryToFixItAnotherway(errors)
        # This time it worked
    return errors

def tryToFixIt(err):
    err.append('Attempt to fix it')

def tryToFixItAnotherway(err):
    err.append('Attempt to fix it by another way')

def main():
    for item in range(2):
        errors = example()
    print('\n'.join(errors))

main()

prints the following

Attempt to fix it
Attempt to fix it by another way
Attempt to fix it
Attempt to fix it by another way
月下伊人醉 2024-10-05 15:33:53

This is not a design flaw. Anyone who trips over this is doing something wrong.

There are 3 cases I see where you might run into this problem:

  1. You intend to modify the argument as a side effect of the function. In this case it never makes sense to have a default argument. The only exception is when you're abusing the argument list to have function attributes, e.g. cache={}, and you wouldn't be expected to call the function with an actual argument at all.
  2. You intend to leave the argument unmodified, but you accidentally did modify it. That's a bug, fix it.
  3. You intend to modify the argument for use inside the function, but didn't expect the modification to be viewable outside of the function. In that case you need to make a copy of the argument, whether it was the default or not! Python is not a call-by-value language, so it doesn't make the copy for you; you need to be explicit about it.

The example in the question could fall into category 1 or 3. It's odd that it both modifies the passed list and returns it; you should pick one or the other.
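
Case 3 might look like the following sketch: the function works on its own copy, so neither a default nor a caller's list is affected. top_three and marks are made-up names for the illustration.

```python
def top_three(scores=()):
    ranked = list(scores)       # explicit copy; the caller's object is untouched
    ranked.sort(reverse=True)   # in-place sort affects only the copy
    return ranked[:3]


marks = [70, 95, 82, 60]
best = top_three(marks)

print(best)    # [95, 82, 70]
print(marks)   # [70, 95, 82, 60]
```

Using an immutable tuple as the default sidesteps the shared-default question entirely here, since the function never mutates it.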

杯别 2024-10-05 15:33:53

Just change the function to be:

def notastonishinganymore(a = []): 
    '''The name is just a joke :)'''
    a = a[:]
    a.append(5)
    return a
风尘浪孓 2024-10-05 15:33:53

TLDR: Define-time defaults are consistent and strictly more expressive.


Defining a function affects two scopes: the defining scope containing the function, and the execution scope contained by the function. While it is pretty clear how blocks map to scopes, the question is where def <name>(<args=defaults>): belongs:

...                           # defining scope
def name(parameter=default):  # ???
    ...                       # execution scope

The def name part must evaluate in the defining scope - we want name to be available there, after all. Evaluating the function only inside itself would make it inaccessible.

Since parameter is a constant name, we can "evaluate" it at the same time as def name. This also has the advantage that it produces the function with a known signature, name(parameter=...):, instead of a bare name(...):.

Now, when to evaluate default?

Consistency already says "at definition": everything else of def <name>(<args=defaults>): is best evaluated at definition as well. Delaying parts of it would be the astonishing choice.

The two choices are not equivalent, either: If default is evaluated at definition time, it can still affect execution time. If default is evaluated at execution time, it cannot affect definition time. Choosing "at definition" allows expressing both cases, while choosing "at execution" can express only one:

def name(parameter=defined):  # set default at definition time
    ...

def name(parameter=None):     # delay default until execution time
    parameter = default if parameter is None else parameter
    ...
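
A familiar place where the definition-time choice is used on purpose is capturing a loop variable. The sketch below contrasts a lambda with a default against one without; early and late are names chosen for this example.

```python
# Definition-time default: each lambda stores the value i had when it was created.
early = [lambda i=i: i for i in range(3)]

# No default: each lambda looks i up when it is called, after the loop has finished.
late = [lambda: i for i in range(3)]

early_values = [f() for f in early]
late_values = [f() for f in late]

print(early_values)  # [0, 1, 2]
print(late_values)   # [2, 2, 2]
```

Both behaviours are expressible because defaults are evaluated at definition, which is the expressiveness argument made above.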