How can I avoid issues caused by Python's early-bound default parameters (e.g. mutable default arguments "remembering" old data)?

Posted on 2024-07-10 11:41:56

Sometimes it seems natural to have a default parameter which is an empty list. However, Python produces unexpected behavior in these situations.

For example, consider this function:

def my_func(working_list=[]):
    working_list.append("a")
    print(working_list)

The first time it is called, the default will work, but calls after that will update the existing list (with one "a" each call) and print the updated version.

How can I fix the function so that, if it is called repeatedly without an explicit argument, a new empty list is used each time?

Comments (10)

背叛残局 2024-07-17 11:41:56
def my_func(working_list=None):
    if working_list is None: 
        working_list = []

    # alternative:
    # working_list = [] if working_list is None else working_list

    working_list.append("a")
    print(working_list)

The docs say you should use None as the default and explicitly test for it in the body of the function.

甜味拾荒者 2024-07-17 11:41:56

Other answers have already provided the direct solutions that were asked for. However, since this is a very common pitfall for new Python programmers, it's worth adding an explanation of why Python behaves this way, which is nicely summarized in The Hitchhiker's Guide to Python under Mutable Default Arguments:

Python's default arguments are evaluated once when the function is defined, not each time the function is called (like it is in, say, Ruby). This means that if you use a mutable default argument and mutate it, you will have mutated that object for all future calls to the function as well.
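To make the "evaluated once" point concrete, here is a minimal sketch. The names make_default, collect, and evaluations are hypothetical and exist only for illustration; the default expression records every time it runs so the count can be inspected afterwards.

```python
evaluations = []  # records each time the default expression is evaluated


def make_default():
    """Hypothetical helper: logs the evaluation and returns a new list."""
    evaluations.append("evaluated")
    return []


def collect(item, bucket=make_default()):  # make_default() runs here, at definition time
    bucket.append(item)
    return bucket


first = collect("a")
second = collect("b")

print(len(evaluations))  # 1: the default was computed once, not once per call
print(first is second)   # True: both calls received the same list object
print(second)            # ['a', 'b']
```

The single entry in evaluations shows that the default was computed when the def statement ran, and both calls then shared that one list.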

倥絔 2024-07-17 11:41:56

Not that it matters in this case, but you can use object identity to test for None:

if working_list is None: working_list = []

You could also take advantage of how the boolean operator or is defined in Python:

working_list = working_list or []

Though this will behave unexpectedly if the caller passes an empty list (which counts as false) as working_list and expects your function to modify the list they passed in.
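A small sketch of that difference, using two hypothetical functions (add_item_or and add_item_is_none) that differ only in how they detect a missing argument:

```python
def add_item_or(working_list=None):
    working_list = working_list or []  # replaces ANY falsy value, including []
    working_list.append("a")
    return working_list


def add_item_is_none(working_list=None):
    if working_list is None:           # replaces only a missing argument
        working_list = []
    working_list.append("a")
    return working_list


callers_list = []
add_item_or(callers_list)
print(callers_list)    # []: the caller's empty list was silently swapped out

callers_list2 = []
add_item_is_none(callers_list2)
print(callers_list2)   # ['a']: the caller's list was modified as expected
```

With the or version, the caller's empty list is never touched, which is surprising if the caller expected in-place modification.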

无声情话 2024-07-17 11:41:56

If the intent of the function is to modify the parameter passed as working_list, see HenryR's answer (=None, check for None inside).

But if you don't intend to mutate the argument and only want to use it as the starting point for a list, you can simply copy it:

def myFunc(starting_list = []):
    starting_list = list(starting_list)  # work on a copy; the default is never mutated
    starting_list.append("a")
    print(starting_list)

(or in this simple case just print(starting_list + ["a"]), but I guess that was just a toy example)

In general, mutating your arguments is bad style in Python. The only functions that are fully expected to mutate an object are methods of the object. It's even rarer to mutate an optional argument — is a side effect that happens only in some calls really the best interface?

  • If you do it from the C habit of "output arguments", that's completely unnecessary - you can always return multiple values as a tuple.

  • If you do this to efficiently build a long list of results without building intermediate lists, consider writing it as a generator and using result_list.extend(myFunc()) when you are calling it. This way your calling conventions remain very clean.
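Both suggestions can be sketched briefly. The function names squares_up_to and min_and_max are made up for this illustration and are not part of the original answer.

```python
def squares_up_to(limit):
    """Yield results one at a time instead of appending to a caller's list."""
    for n in range(limit):
        yield n * n


def min_and_max(values):
    """Return two results as a tuple rather than through output arguments."""
    values = list(values)
    return min(values), max(values)


result_list = [100]
result_list.extend(squares_up_to(4))        # the caller decides where results go
lowest, highest = min_and_max(result_list)  # tuple unpacking at the call site

print(result_list)       # [100, 0, 1, 4, 9]
print(lowest, highest)   # 0 100
```

Neither function needs an optional list parameter, so the mutable-default question never arises.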

One pattern where mutating an optional arg is frequently done is a hidden "memo" arg in recursive functions:

def depth_first_walk_graph(graph, node, _visited=None):
    if _visited is None:
        _visited = set()  # create memo once in top-level call

    if node in _visited:
        return
    _visited.add(node)
    for neighbour in graph[node]:
        depth_first_walk_graph(graph, neighbour, _visited)
世界和平 2024-07-17 11:41:56

I might be off-topic, but remember that if you just want to pass a variable number of arguments, the Pythonic way is to accept them as a tuple (*args) or a dictionary (**kargs). These are optional and are better than the syntax myFunc([1, 2, 3]).

If you want to pass a tuple:

def myFunc(arg1, *args):
    print(args)  # extra positional arguments arrive as a tuple
    w = []
    w += args    # extend a fresh list with them
    print(w)

>>> myFunc(1, 2, 3, 4, 5, 6, 7)
(2, 3, 4, 5, 6, 7)
[2, 3, 4, 5, 6, 7]

If you want to pass a dictionary:

def myFunc(arg1, **kargs):
    print(kargs)  # extra keyword arguments arrive as a dict

>>> myFunc(1, option1=2, option2=3)
{'option1': 2, 'option2': 3}
为人所爱 2024-07-17 11:41:56

Recap

Python evaluates default values for arguments/parameters ahead of time; they are "early-bound". This can cause problems in a few different ways. For example:

>>> import datetime, time
>>> def what_time_is_it(dt=datetime.datetime.now()): # chosen ahead of time!
...     return f'It is now {dt.strftime("%H:%M:%S")}.'
... 
>>> 
>>> first = what_time_is_it()
>>> time.sleep(10) # Even if time elapses...
>>> what_time_is_it() == first # the reported time is the same!
True

The most common way the problem manifests, however, is when the argument to the function is mutable (for example, a list), and gets mutated within the function's code. When this happens, changes will be "remembered", and thus "seen" on subsequent calls:

>>> def append_one_and_return(a_list=[]):
...     a_list.append(1)
...     return a_list
... 
>>> 
>>> append_one_and_return()
[1]
>>> append_one_and_return()
[1, 1]
>>> append_one_and_return()
[1, 1, 1]

Because a_list was created ahead of time, every call to the function that uses the default value will use the same list object, which gets modified on each call, appending another 1 value.
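One way to observe the shared object directly is to look at the function's __defaults__ attribute, where CPython stores positional default values. A minimal sketch, with before, returned, and after as illustrative names:

```python
def append_one_and_return(a_list=[]):
    a_list.append(1)
    return a_list


before = list(append_one_and_return.__defaults__[0])  # snapshot of the default: []

returned = append_one_and_return()
append_one_and_return()

after = append_one_and_return.__defaults__[0]

print(before)             # []
print(after)              # [1, 1]
print(returned is after)  # True: the returned list IS the stored default
```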

This is a conscious design decision that can be exploited in some circumstances - although there are often better ways to solve those other problems. (Consider using functools.cache or functools.lru_cache for memoization, and functools.partial to bind function arguments.)
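As a rough illustration of those alternatives, the sketch below memoizes a recursive function with functools.lru_cache instead of a hidden mutable default, and pre-binds an argument with functools.partial. The fib function and the call_count counter are illustrative only.

```python
from functools import lru_cache, partial

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=None)
def fib(n):
    """Memoized Fibonacci; the cache lives in the decorator, not in a default."""
    global call_count
    call_count += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))      # 832040
print(call_count)   # 31: one real computation for each n from 0 to 30

# partial binds an argument up front without relying on default values.
int_from_binary = partial(int, base=2)
print(int_from_binary("101"))  # 5
```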

This also implies that methods of an instance cannot use an attribute of the instance as a default: at the time that the default value is determined, self is not in scope, and the instance does not exist anyway:

>>> class Example:
...     def function(self, arg=self):
...         pass
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in Example
NameError: name 'self' is not defined

(The class Example also doesn't exist yet, and the name Example is also not in scope; therefore, class attributes will also not work here, even if we don't care about the mutability issue.)
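A common workaround, sketched here under the assumption that None is never a meaningful argument, is to resolve the instance attribute inside the method body. The name attribute and the greet method are hypothetical.

```python
class Example:
    def __init__(self, name):
        self.name = name

    def greet(self, who=None):
        if who is None:
            who = self.name  # looked up at call time, when self exists
        return f"Hello, {who}!"


ada = Example("Ada")
print(ada.greet())        # Hello, Ada!
print(ada.greet("Bob"))   # Hello, Bob!
```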

Solutions

Using None as a sentinel value

The standard, generally-considered-idiomatic approach is to use None as the default value, and explicitly check for this value and replace it in the function's logic. Thus:

>>> def append_one_and_return_fixed(a_list=None):
...     if a_list is None:
...         a_list = []
...     a_list.append(1)
...     return a_list
... 
>>> append_one_and_return_fixed([2]) # it works consistently with an argument
[2, 1]
>>> append_one_and_return_fixed([2])
[2, 1]
>>> append_one_and_return_fixed() # and also without an argument
[1]
>>> append_one_and_return_fixed()
[1]

This works because the code a_list = [] runs (if needed) when the function is called, not ahead of time - thus, it creates a new empty list every time. Therefore, this approach can also solve the datetime.now() issue. It does mean that the function can't use a None value for other purposes; however, this should not cause a problem in ordinary code.
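Applied to the earlier time example, the same pattern might look like the sketch below. It also shows one possible approach for the rare case where None must remain a valid argument: a private sentinel object. The names _MISSING and describe are assumptions made for this illustration, not an established API.

```python
import datetime


def what_time_is_it(dt=None):
    if dt is None:
        dt = datetime.datetime.now()  # evaluated on every call that omits dt
    return f'It is now {dt.strftime("%H:%M:%S")}.'


_MISSING = object()  # hypothetical private sentinel, distinct from every other object


def describe(value=_MISSING):
    if value is _MISSING:
        return "no argument given"
    return f"got {value!r}"


print(what_time_is_it(datetime.datetime(2024, 1, 1, 12, 30, 45)))  # It is now 12:30:45.
print(describe())       # no argument given
print(describe(None))   # got None
```

Because the sentinel is compared by identity, a caller cannot accidentally pass an equal value.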

Simply avoiding mutable defaults

If the function's logic does not require modifying the argument, it is better not to modify it at all, in keeping with the principle of command-query separation.

By this argument, append_one_and_return is poorly designed to begin with: since the purpose is to display some modified version of the input, it should not also actually modify the caller's variable, but instead just create a new object for display purposes. This allows for using an immutable object, such as a tuple, for the default value. Thus:

def with_appended_one(a_sequence=()):
    return [*a_sequence, 1]

This way will avoid modifying the input even when that input is explicitly provided:

>>> x = [1]
>>> with_appended_one(x)
[1, 1]
>>> x # not modified!
[1]

It works fine without an argument, even repeatedly:

>>> with_appended_one()
[1]
>>> with_appended_one()
[1]

And it has gained some flexibility:

>>> with_appended_one('example') # a string is a sequence of its characters.
['e', 'x', 'a', 'm', 'p', 'l', 'e', 1]

PEP 671

PEP 671 proposes to introduce new syntax to Python that would allow for explicit late binding of a parameter's default value. The proposed syntax is:

def append_and_show_future(a_list=>[]): # note => instead of =; proposed syntax, not valid Python today
    a_list.append(1)
    print(a_list)

However, while this draft PEP proposed to introduce the feature in Python 3.12, that did not happen, and no such syntax is yet available. There has been some more recent discussion of the idea, but it seems unlikely to be supported by Python in the near future.

摇划花蜜的午后 2024-07-17 11:41:56

Quote from https://docs.python.org/3/reference/compound_stmts.html#function-definitions

Default parameter values are evaluated from left to right when the function definition is executed. This means that the expression is evaluated once, when the function is defined, and that the same “pre-computed” value is used for each call. This is especially important to understand when a default parameter is a mutable object, such as a list or a dictionary: if the function modifies the object (e.g. by appending an item to a list), the default value is in effect modified. This is generally not what was intended. A way around this is to use None as the default, and explicitly test for it in the body of the function, e.g.:

def whats_on_the_telly(penguin=None):
    if penguin is None:
        penguin = []
    penguin.append("property of the zoo")
    return penguin
梦归所梦 2024-07-17 11:41:56

Good and correct answers have already been provided. I just want to offer another syntax for the same thing, which I find more beautiful when, for instance, you want to create a class with default empty lists:

class Node(object):
    def __init__(self, _id, val, parents=None, children=None):
        self.id = _id
        self.val = val
        self.parents = parents if parents is not None else []
        self.children = children if children is not None else []

This snippet makes use of the conditional expression (x if condition else y) syntax. I like it especially because it's a neat little one-liner without colons, etc. involved, and it nearly reads like a normal English sentence. :)
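For classes that are mostly data, one alternative worth knowing (not part of the original answer) is the standard library's dataclasses module, where field(default_factory=list) creates a fresh list per instance. The sketch below reuses the Node field names from above; the type annotations are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    val: str
    # default_factory is called once per instance, so lists are never shared
    parents: list = field(default_factory=list)
    children: list = field(default_factory=list)


a = Node(1, "a")
b = Node(2, "b")
a.children.append(b)

print(len(a.children))           # 1
print(b.children)                # []
print(a.parents is b.parents)    # False
```

Writing parents: list = [] in a dataclass raises a ValueError at class definition time, so this particular mistake is caught early.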

In your case you could write

def myFunc(working_list=None):
    working_list = [] if working_list is None else working_list
    working_list.append("a")
    print(working_list)
趁微风不噪 2024-07-17 11:41:56

Perhaps the simplest thing of all is to just create a copy of the list or tuple within the function. This avoids the need for checking. For example,

    def my_funct(params, lst = []):
        liste = lst.copy()
         . . 
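
One caveat: tuples have no .copy() method, so a variant that accepts either a list or a tuple could build the copy with list(...) instead. The sketch below assumes the function simply appends params and returns the result, since the original body is elided; it also uses an immutable () default so nothing mutable is shared.

```python
def my_funct(params, lst=()):
    liste = list(lst)      # fresh list from any iterable: list, tuple, ...
    liste.append(params)   # assumed body, for illustration only
    return liste


original = [0]
print(my_funct(1, original))  # [0, 1]
print(original)               # [0]: the caller's list is untouched
print(my_funct(1, (5,)))      # [5, 1]
print(my_funct(1))            # [1]
print(my_funct(1))            # [1]: no accumulation across calls
```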
紫轩蝶泪 2024-07-17 11:41:56

I took the UCSC Extension class Python for Programmers.

Which is true of: def Fn(data = []):

a) is a good idea so that your data lists start empty with every call.

b) is a good idea so that all calls to the function that do not provide any arguments on the call will get the empty list as data.

c) is a reasonable idea as long as your data is a list of strings.

d) is a bad idea because the default [] will accumulate data and the default [] will change with subsequent calls.

Answer:

d) is a bad idea because the default [] will accumulate data and the default [] will change with subsequent calls.
