Check for mutability in Python?

Posted on 2024-10-06 12:46:30

Consider this code:

a = {...}  # a is a dict with arbitrary contents
b = a.copy()
  1. What role does mutability play in the keys and values of the dicts?
  2. How do I ensure changes to keys or values of one dict are not reflected in the other?
  3. How does this relate to the hashable constraint of the dict keys?
  4. Are there any differences in behaviour between Python 2.x and Python 3.x?

How do I check if a type is mutable in Python?

Comments (7)

梦冥 2024-10-13 12:46:30
  1. Keys must be hashable - that's all that's forced upon you. In particular, you can have a user-defined class whose instances are hashable but also mutable - but this is generally a bad idea.

  2. By not sharing values between the two dicts. It's generally OK to share the keys, because they should be immutable (and will be, for built-in types). A deep copy of the dictionary, made with copy.deepcopy from the copy standard library module, is definitely safe. Note that b = dict(a) builds the same shallow copy as a.copy(), so the values are still shared. You could also use immutable values.

  3. All built-in immutable types are hashable (a tuple only if all of its elements are). No built-in mutable type is hashable. The constraint on dict keys simply requires that the built-in hash function works on the key, which in turn requires that its class implements the __hash__ magic method.

    However, the code may break subtly or unexpectedly if an object's hash could ever change during its lifetime. For a pathological example:

    >>> import random
    >>> class x:
    ...     def __hash__(self): return random.randint(0, 100)
    ... 
    >>> a, b = x(), x()
    >>> c = {a:1, b:2}
    >>> c[a]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyError: <__main__.x object at ...>
    

    This is why trying to make a mutable, hashable type is ill-advised: the hash is expected not to change, but is also expected to reflect the state of the object.

  4. No.
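
To make point 2 concrete, here is a minimal sketch of the difference between a shallow copy and a deep copy. The dict contents are made up for illustration; only a.copy() and copy.deepcopy come from the discussion above.

```python
import copy

a = {"nums": [1, 2, 3]}      # illustrative contents: one mutable value

shallow = a.copy()           # new dict, but it shares the same list object
deep = copy.deepcopy(a)      # new dict with a recursively copied list

a["nums"].append(4)          # mutate the list in place through a

shallow_sees_change = shallow["nums"] == [1, 2, 3, 4]
deep_sees_change = deep["nums"] == [1, 2, 3, 4]
print(shallow_sees_change, deep_sees_change)
```

The shallow copy observes the append because both dicts hold a reference to one list; the deep copy does not.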

A type is mutable if it is not immutable. A type is immutable if it is a built-in immutable type: str, int, long (Python 2 only), bool, float, tuple, frozenset, and probably a couple others I'm forgetting. User-defined types are mutable by default.

An object is mutable if it is not immutable. An object is immutable if it consists, recursively, of only immutable-typed sub-objects. Thus, a tuple of lists is mutable; you cannot replace the elements of the tuple, but you can modify them through the list interface, changing the overall data.
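
The tuple-of-lists case can be sketched as follows. The variable names are illustrative; the behaviour shown is standard Python.

```python
t = ([1, 2], [3])            # an immutable tuple holding mutable lists

try:
    t[0] = [9]               # replacing an element of the tuple is refused
    replaced = True
except TypeError:
    replaced = False

t[0].append(99)              # but the list inside can still be changed

try:
    hash(t)                  # hashing a tuple hashes its elements
    hashable = True
except TypeError:
    hashable = False

print(replaced, t, hashable)
```

Because the lists are unhashable, the tuple as a whole is unhashable too, so it cannot be used as a dict key.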

东走西顾 2024-10-13 12:46:30

There isn't actually any such thing as mutability or immutability at the language level in Python. Some objects provide no way to change them (e.g. strings and tuples), and so are effectively immutable, but it's purely conceptual; there's no property at the language level indicating this, neither to your code nor to Python itself.

Immutability is not actually relevant to dicts; it's perfectly fine to use mutable objects as keys. What matters is comparison and hashing: the object must always remain equal to itself, with an unchanging hash. For example:

class example(object):
    def __init__(self, a):
        self.value = a
    def __eq__(self, rhs):
        return self.value == rhs.value
    def __hash__(self):
        return hash(self.value)

a = example(1)
d = {a: "first"}
a.data = 2  # adds an attribute that plays no part in __eq__ or __hash__
print(d[example(1)])  # prints "first"

Here, example is not immutable; we're modifying it with a.data = 2. Yet, we're using it as a key of a dict without any trouble. Why? The property we're changing has no effect on equality: the hash is unchanged, and example(1) is always equal to example(1), ignoring any other properties.

The most common use of this is caching and memoization: having a property cached or not doesn't logically change the object, and usually has no effect on equality.
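
As a rough sketch of the caching case, consider a hypothetical Circle class (not from the original answer) whose cached area is deliberately left out of __eq__ and __hash__:

```python
import math


class Circle:
    """Hypothetical example type: equality and hash depend only on radius."""

    def __init__(self, radius):
        self.radius = radius
        self._area = None            # cache slot, ignored by __eq__/__hash__

    def __eq__(self, other):
        return isinstance(other, Circle) and self.radius == other.radius

    def __hash__(self):
        return hash(self.radius)

    def area(self):
        if self._area is None:       # compute once, then reuse
            self._area = math.pi * self.radius ** 2
        return self._area


c = Circle(2)
labels = {c: "small"}
c.area()                             # mutates c by filling the cache
found = labels[Circle(2)]            # lookup still works
print(found)
```

Filling the cache changes the object's state, but not anything the dict relies on, so the key remains findable.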

(I'm going to stop here--please don't ask five questions at once.)

把回忆走一遍 2024-10-13 12:46:30

The collections.abc module (plain collections in Python 2) provides MutableSequence, MutableSet and MutableMapping, which can be used to check the mutability of built-in container types.

issubclass(TYPE, (MutableSequence, MutableSet, MutableMapping))

If you want to use this on user-defined types, the type must either inherit from one of them or be registered as a virtual subclass.

class x(MutableSequence):
    ...

or

class x:
    ...

MutableSequence.register(x)
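
A minimal sketch of this check on Python 3, assuming the ABCs are imported from collections.abc. The helper name is_mutable_container_type and the Stack class are made up for illustration.

```python
from collections.abc import MutableMapping, MutableSequence, MutableSet

MUTABLE_ABCS = (MutableSequence, MutableSet, MutableMapping)


def is_mutable_container_type(tp):
    """True if tp is a real or virtual subclass of a mutable container ABC."""
    return issubclass(tp, MUTABLE_ABCS)


class Stack:                         # hypothetical user-defined type
    def __init__(self):
        self.items = []


builtin_results = {
    tp.__name__: is_mutable_container_type(tp)
    for tp in (list, dict, set, bytearray, tuple, str, frozenset)
}
before = is_mutable_container_type(Stack)   # not registered yet
MutableSequence.register(Stack)             # register as a virtual subclass
after = is_mutable_container_type(Stack)
print(builtin_results, before, after)
```

Note that this only recognises container types; a plain user-defined class is reported as not mutable until it is registered, even though its instances can be modified.
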
唔猫 2024-10-13 12:46:30

There's really no guarantee that a type which is hashable is also immutable, but at the very least, correctly implementing __hash__ requires that the type is immutable with respect to its own hash and to equality. This is not enforced in any particular way.

However, we are all adults. It would be unwise to implement __hash__ unless you really meant it. Roughly speaking, this just boils down to saying that if a type actually can be used as a dictionary key, then it is intended to be used in that way.

If you're looking for something that is like a dict, but also immutable, then namedtuple might be your best bet from what's in the standard library. Admittedly it's not a very good approximation, but it's a start.
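
For illustration, a small sketch of namedtuple used as an immutable record. The Config type and its fields are hypothetical.

```python
from collections import namedtuple

Config = namedtuple("Config", ["host", "port"])   # hypothetical record type
cfg = Config(host="localhost", port=8080)

try:
    cfg.port = 9090                  # fields cannot be reassigned
    changed = True
except AttributeError:
    changed = False

updated = cfg._replace(port=9090)    # builds a new tuple instead
as_dict = cfg._asdict()              # dict-like view of the fields
print(changed, cfg, updated, as_dict)
```

Unlike a dict, the set of field names is fixed when the type is created, which is one reason it is only an approximation.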

嗼ふ静 2024-10-13 12:46:30
  1. dict keys must be hashable, which implies they have an immutable hash value. dict values may or may not be mutable; however, if they are mutable this impacts your second question.

  2. "Changes to the keys" will not be reflected between the two dicts. Changes to immutable values, such as strings, will also not be reflected. Changes to mutable objects, such as instances of user-defined classes, will be reflected because the dict stores a reference to the object, not a copy.

    class T(object):
      def __init__(self, v):
        self.v = v
    
    
    t1 = T(5)
    
    
    d1 = {'a': t1}
    d2 = d1.copy()
    
    
    d2['a'].v = 7
    d1['a'].v   # = 7
    
    
    d2['a'] = T(2)
    d2['a'].v   # = 2
    d1['a'].v   # = 7
    
    
    import copy
    d3 = copy.deepcopy(d2) # perform a "deep copy"
    d3['a'].v = 12
    d3['a'].v   # = 12
    d2['a'].v   # = 2
    
  3. I think this is explained by the first two answers.

  4. Not that I know of in this respect.

Some additional thoughts:

There are two main things to know for understanding the behavior of keys: keys must be hashable (which means they implement object.__hash__(self)) and they must also be "comparable" (which means they implement object.__eq__(self, other), or something like object.__cmp__(self, other) in Python 2). One important take-away from the docs: by default, user-defined objects compare by identity and their hash is derived from id().

Consider this example:

class K(object):
  def __init__(self, x, y):
     self.x = x
     self.y = y
  def __hash__(self):
     return self.x + self.y

k1 = K(1, 2)
d1 = {k1: 3}
d1[k1] # outputs 3
k1.x = 5
d1[k1] # KeyError!  The key's hash has changed!
k2 = K(2, 1)
d1[k2] # KeyError!  The key's hash is right, but the keys aren't equal.
k1.x = 1
d1[k1] # outputs 3

class NewK(object):
  def __init__(self, x, y):
     self.x = x
     self.y = y
  def __hash__(self):
     return self.x + self.y
  def __eq__(self, other):  # Python 3 ignores __cmp__, so define __eq__
     return self.x == other.x

nk1 = NewK(3, 4)
nd1 = {nk1: 5}
nd1[nk1] # outputs 5
nk2 = NewK(3, 7)
nk1 == nk2 # True!
nd1[nk2] # KeyError! The keys' hashes differ.
hash(nk1) == hash(nk2) # False
nk2.y = 4
nd1[nk2] # outputs 5

# Where this can cause issues:
list(nd1)[0].x = 5  # dict views are not indexable in Python 3
nd1[nk1] # KeyError! nk1 can no longer be found in the dict!
id(list(nd1)[0]) == id(nk1)  # Yikes. True?!
list(nd1)[0].x = 3
nd1[nk1]  # outputs 5
id(list(nd1)[0]) == id(nk1)  # True!

Values are much easier to understand: the dict stores references to objects. Read the sections on hashable. Things like strings are immutable; if you "change" one, the dict you changed it in now references a new object, and the other dict still references the old one. Objects which are mutable can be "changed in-place", hence the value seen through both dicts will change.

d1 = {1: 'a'}
d2 = d1.copy()
id(d1[1]) == id(d2[1]) # True
d2[1] = 'z'
id(d1[1]) == id(d2[1]) # False

# the examples in section 2 above have more examples of this.

Anyway, here are the main points of all this:

  • For keys, it may not be mutability, but rather hashability and comparability, that you care about.
  • You care about mutability for values, because by definition, a mutable object's value can be changed without changing the reference to it.

I do not think there is a general way to test either of those points. The tests for suitability would depend on your use-case. For instance, it may be sufficient to check that an object does or does not implement __hash__ and comparison (__eq__, or __cmp__ in Python 2) functions. Likewise, you might be able to "check" an object's __setattr__ method in some way to determine if it is mutable.
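
One way such a check could look is sketched below. It compares the collections.abc.Hashable test, which only looks for a __hash__ method, with actually calling hash(). The helper name can_be_dict_key and the sample values are assumptions for illustration.

```python
from collections.abc import Hashable


def can_be_dict_key(obj):
    """Return True if hash(obj) succeeds right now."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


samples = {
    "int": 1,
    "str": "a",
    "list": [1],
    "tuple_of_ints": (1, 2),
    "tuple_of_lists": ([1], [2]),
}

by_abc = {name: isinstance(obj, Hashable) for name, obj in samples.items()}
by_call = {name: can_be_dict_key(obj) for name, obj in samples.items()}
print(by_abc)
print(by_call)
```

The two approaches disagree on the tuple of lists: the tuple type defines __hash__, so the ABC check passes, but hashing the object fails because its elements are unhashable.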

命比纸薄 2024-10-13 12:46:30

You can get a rough idea of whether a datatype is mutable or immutable by printing the id (in CPython, the memory address) of an object before and after updating it. If the datatype is immutable, an update creates a new object, so the id will normally change. For example:

stn = 'Hello'
print(id(stn)) 

This prints the id of the object that stn refers to. If you then concatenate another string onto that variable and print the id again, you will get a different value from the first one:

stn += ' world'
print(id(stn))

The second id differs from the first because the concatenation built a new string. When you do the same with a mutable datatype, the id stays the same. For example:

lists = [1, 2, 3]
print(id(lists))

This prints the id of the list. If you then append a number to that list, the id remains the same:

lists.append(4)
print(id(lists))

Note that the id values are not the same on every computer, or even between runs, so you cannot compare the numbers themselves across machines; only whether the id changes matters.
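
A slightly more robust variant of the same idea is to compare object identity with "is" while keeping a reference to the original object, rather than comparing printed id numbers. This is only a sketch; the helper names below are made up.

```python
def mutated_in_place(obj, operation):
    """Apply operation(obj) and report whether the result is the same object."""
    result = operation(obj)
    return result is obj


def extend_list(lst):
    lst += [4]            # list += modifies the existing list
    return lst


def extend_tuple(tup):
    tup += (4,)           # tuple += builds a new tuple and rebinds the name
    return tup


def extend_str(s):
    s += " world"         # str += also builds a new object here
    return s


list_same = mutated_in_place([1, 2, 3], extend_list)
tuple_same = mutated_in_place((1, 2, 3), extend_tuple)
str_same = mutated_in_place("Hello", extend_str)
print(list_same, tuple_same, str_same)
```

Keeping the original object alive during the comparison avoids being misled by an address that gets reused after the old object is freed.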

傲性难收 2024-10-13 12:46:30

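Dicts are collections of key:value pairs (insertion-ordered since Python 3.7). The keys must be hashable, which for built-in types means immutable. To determine if an object is hashable, you can use the hash() function (the exact numbers printed vary between Python versions and runs):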

>>> hash(1)
1
>>> hash('a')
12416037344
>>> hash([1,2,3])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> hash((1,2,3))
2528502973977326415
>>> hash({1: 1})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'

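The values, on the other hand, can be any object. If you need to check if an object is immutable, then I would use hash() as a first approximation: it is reliable for built-in types, but instances of user-defined classes are hashable by default even though they are mutable.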
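
The limits of using hash() as an immutability test can be sketched with a hypothetical user-defined class (Account is made up for illustration):

```python
class Account:                       # hypothetical user-defined class
    def __init__(self, balance):
        self.balance = balance


acct = Account(10)
h_before = hash(acct)                # works: the default hash is identity-based
acct.balance = 99                    # and yet the object is freely mutable
h_after = hash(acct)

try:
    hash({1, 2})                     # built-in mutable set: unhashable
    set_hashable = True
except TypeError:
    set_hashable = False

frozen_hash_ok = isinstance(hash(frozenset({1, 2})), int)
print(h_before == h_after, set_hashable, frozen_hash_ok)
```

So a successful hash() call says the object can be a dict key, not that it can never change.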
