What are data classes and how are they different from common classes?

Posted on 2025-02-04 21:37:56

PEP 557 introduces data classes into the Python standard library. It says that by applying the @dataclass decorator shown below, it will generate "among other things, an __init__()".

from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand

It also says dataclasses are "mutable namedtuples with default", but I don't understand what this means, nor how data classes are different from common classes.

What are data classes and when is it best to use them?


Comments (4)

甜嗑 2025-02-11 21:37:56

Data classes are just regular classes that are geared towards storing state, rather than containing a lot of logic. Every time you create a class that mostly consists of attributes, you make a data class.

What the dataclasses module does is to make it easier to create data classes. It takes care of a lot of boilerplate for you.

This is especially useful when your data class must be hashable, because that requires a __hash__ method as well as an __eq__ method. If you also add a custom __repr__ method for ease of debugging, the class can become quite verbose:

class InventoryItem:
    '''Class for keeping track of an item in inventory.'''
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def __init__(
            self, 
            name: str, 
            unit_price: float,
            quantity_on_hand: int = 0
        ) -> None:
        self.name = name
        self.unit_price = unit_price
        self.quantity_on_hand = quantity_on_hand

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
    
    def __repr__(self) -> str:
        return (
            'InventoryItem('
            f'name={self.name!r}, unit_price={self.unit_price!r}, '
            f'quantity_on_hand={self.quantity_on_hand!r})'
        )

    def __hash__(self) -> int:
        return hash((self.name, self.unit_price, self.quantity_on_hand))

    def __eq__(self, other) -> bool:
        if not isinstance(other, InventoryItem):
            return NotImplemented
        return (
            (self.name, self.unit_price, self.quantity_on_hand) == 
            (other.name, other.unit_price, other.quantity_on_hand))

With dataclasses you can reduce it to:

from dataclasses import dataclass

@dataclass(unsafe_hash=True)
class InventoryItem:
    '''Class for keeping track of an item in inventory.'''
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand

(Example based on the PEP example).
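As a quick check of what the decorator generates, here is a minimal sketch that repeats the class above and exercises the generated __init__, __repr__, __eq__ and __hash__. The sample values ('widget', 2.5, 4) are made up for illustration and are not from the PEP.

```python
from dataclasses import dataclass


@dataclass(unsafe_hash=True)
class InventoryItem:
    '''Class for keeping track of an item in inventory.'''
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand


# Illustrative values only.
a = InventoryItem('widget', 2.5, 4)
b = InventoryItem('widget', 2.5, 4)

print(a)                   # generated __repr__
print(a == b)              # generated __eq__ compares field by field
print(hash(a) == hash(b))  # generated __hash__ agrees for equal objects
print(a.total_cost())      # the hand-written method is untouched
```

Because equal items hash alike, putting both a and b into a set leaves a single element.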

The same class decorator can also generate comparison methods (__lt__, __gt__, etc.) and handle immutability.
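To illustrate those two options together, here is a small sketch using the order and frozen keywords. The Version class and its fields are hypothetical, chosen only for the example.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(order=True, frozen=True)
class Version:
    # Hypothetical example class; fields are compared in definition order.
    major: int
    minor: int = 0


old, new = Version(1, 4), Version(2)
print(old < new)             # compares (1, 4) with (2, 0)
print(sorted([new, old]))    # sorting relies on the generated __lt__

try:
    old.minor = 5            # frozen=True blocks attribute assignment
except FrozenInstanceError as exc:
    print(type(exc).__name__)
```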

namedtuple classes are also data classes, but are immutable by default (as well as being sequences). dataclasses are much more flexible in this regard, and can easily be structured such that they can fill the same role as a namedtuple class.
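A side-by-side sketch of that difference, assuming two throwaway point types (PointNT and PointDC are names invented here):

```python
from collections import namedtuple
from dataclasses import dataclass

PointNT = namedtuple('PointNT', ['x', 'y'])


@dataclass
class PointDC:
    x: int
    y: int


nt = PointNT(1, 2)
dc = PointDC(1, 2)

x, y = nt             # a namedtuple is a sequence, so it unpacks
print(nt == (1, 2))   # and it compares equal to a plain tuple

dc.x = 10             # a dataclass instance is mutable by default
print(dc)
print(dc == (10, 2))  # a dataclass is not a tuple

try:
    nt.x = 10         # namedtuple fields cannot be reassigned
except AttributeError:
    print('namedtuple is immutable')
```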

The PEP was inspired by the attrs project, which can do even more (including slots, validators, converters, metadata, etc.).

If you want to see some examples, I recently used dataclasses for several of my Advent of Code solutions, see the solutions for day 7, day 8, day 11 and day 20.

If you want to use the dataclasses module in Python versions < 3.7, you can install the backported module (requires 3.6) or use the attrs project mentioned above.

偷得浮生 2025-02-11 21:37:56

Overview

The question has been addressed. However, this answer adds some practical examples to aid in the basic understanding of dataclasses.

What exactly are Python data classes and when is it best to use them?

  1. code generators: generate boilerplate code; you can choose to implement special methods in a regular class or have a dataclass implement them automatically.
  2. data containers: structures that hold data (e.g. tuples and dicts), often with dotted, attribute access such as classes, namedtuple and others.

"mutable namedtuples with default[s]"

Here is what the latter phrase means:

  • mutable: by default, dataclass attributes can be reassigned. You can optionally make them immutable (see Examples below).
  • namedtuple: you have dotted, attribute access like a namedtuple or a regular class.
  • default: you can assign default values to attributes.

Compared to common classes, you primarily save on typing boilerplate code.
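The three points above can be seen in a few lines. This is only a sketch; the Pixel class is a hypothetical example.

```python
from dataclasses import dataclass


@dataclass
class Pixel:
    # Hypothetical class used only to illustrate the three points above.
    x: int
    y: int
    alpha: float = 1.0   # "default": used when the argument is omitted


p = Pixel(3, 4)
print(p.alpha)           # dotted attribute access, falling back to the default
p.x = 30                 # "mutable": attributes can be reassigned
print(p)
```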


Features

This is an overview of dataclass features (TL;DR? See the Summary Table in the next section).

What you get

Here are features you get by default from dataclasses.

Attributes + Representation + Comparison

import dataclasses


@dataclasses.dataclass
#@dataclasses.dataclass()                                       # alternative
class Color:
    r : int = 0
    g : int = 0
    b : int = 0

These defaults are provided by automatically setting the following keywords to True:

@dataclasses.dataclass(init=True, repr=True, eq=True)
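To see what those defaults buy you, the sketch below compares Color with a variant that switches eq off. IdentityColor is a name made up for this comparison.

```python
import dataclasses


@dataclasses.dataclass
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


@dataclasses.dataclass(eq=False)
class IdentityColor:
    # Same fields, but no generated __eq__: equality falls back to identity.
    r: int = 0
    g: int = 0
    b: int = 0


print(Color(1, 2, 3))                      # generated __repr__
print(Color() == Color(0, 0, 0))           # field-wise equality
print(IdentityColor() == IdentityColor())  # two distinct objects
```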

What you can turn on

Additional features are available if the appropriate keywords are set to True.

Order

@dataclasses.dataclass(order=True)
class Color:
    r : int = 0
    g : int = 0
    b : int = 0

The ordering methods are now implemented (overloading operators: < > <= >=), similarly to functools.total_ordering with stronger equality tests.
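A short sketch of the ordering in use. Instances are compared as if they were tuples of their fields, in definition order; the sample colors are arbitrary.

```python
import dataclasses


@dataclasses.dataclass(order=True)
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


colors = [Color(0, 50, 0), Color(), Color(0, 0, 255)]
print(sorted(colors))
print(Color(1, 0, 0) > Color(0, 255, 255))  # r is compared first
```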

Hashable, Mutable

@dataclasses.dataclass(unsafe_hash=True)                        # override base `__hash__`
class Color:
    ...

Although the object is potentially mutable (possibly undesired), a hash is implemented.
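Here is a sketch of why the keyword carries "unsafe" in its name: the hash is computed from the field values, so mutating an object after it has been put in a set makes it hard to find again.

```python
import dataclasses


@dataclasses.dataclass(unsafe_hash=True)
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


c = Color(1, 2, 3)
seen = {c}
print(c in seen)

old_hash = hash(c)
c.r = 99                    # mutation changes the field-based hash
print(hash(c) == old_hash)
print(c in seen)            # lookup by the new hash no longer finds it
```

If instances are meant to be used as dict keys or set members, frozen=True is usually the safer choice.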

Hashable, Immutable

@dataclasses.dataclass(frozen=True)                             # with `eq=True` (default), also hashable
class Color:
    ...

A hash is now implemented and changing the object or assigning to attributes is disallowed.

Overall, the object is hashable if either unsafe_hash=True or frozen=True.
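A minimal sketch of the frozen variant in use. dataclasses.replace is shown as one way to get a modified copy, since in-place assignment is blocked.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


red = Color(255, 0, 0)
palette = {red: 'red'}              # hashable, so usable as a dict key
print(palette[Color(255, 0, 0)])

try:
    red.g = 10
except dataclasses.FrozenInstanceError as exc:
    print(type(exc).__name__)

# replace() builds a modified copy instead of mutating in place.
orange = dataclasses.replace(red, g=165)
print(orange)
```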

See also the original hashing logic table with more details.

Optimization

@dataclasses.dataclass(slots=True)              # py310+
class SlottedColor:
    #__slots__ = ["r", "b", "g"]                # alternative
    r : int
    g : int
    b : int

The size reported by sys.getsizeof is now smaller (note that this measures the class objects, not their instances):

>>> import sys
>>> sys.getsizeof(Color)
1056
>>> sys.getsizeof(SlottedColor)
888

slots=True was added in Python 3.10. (Thanks @ajskateboarder).

In some circumstances, slots=True/__slots__ also improves the speed of creating instances and accessing attributes. Also, a hand-written __slots__ does not allow default assignments for the names it lists; otherwise, a ValueError is raised. If __slots__ already exists, slots=True will cause a TypeError.

See more on slots in this blog post.

See more on arguments added in Python 3.10+: match_args, kw_only, slots, weakref_slot.

What you don't get

To get the following features, special methods must be manually implemented:

Unpacking

@dataclasses.dataclass
class Color:
    r : int = 0
    g : int = 0
    b : int = 0

    def __iter__(self):
        yield from dataclasses.astuple(self)
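With __iter__ defined that way, unpacking works as it would for a tuple. A short usage sketch:

```python
import dataclasses


@dataclasses.dataclass
class Color:
    r: int = 0
    g: int = 0
    b: int = 0

    def __iter__(self):
        yield from dataclasses.astuple(self)


r, g, b = Color(128, 0, 255)
print(r, g, b)
print(list(Color()))
```

Keep in mind that astuple copies field values recursively, so for large nested objects a hand-written generator over the fields may be cheaper.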

Summary Table

+----------------------+----------------------+----------------------------------------------------+-----------------------------------------+
|       Feature        |       Keyword        |                      Example                       |           Implement in a Class          |
+----------------------+----------------------+----------------------------------------------------+-----------------------------------------+
| Attributes           |  init                |  Color().r -> 0                                    |  __init__                               |
| Representation       |  repr                |  Color() -> Color(r=0, g=0, b=0)                   |  __repr__                               |
| Comparison*          |  eq                  |  Color() == Color(0, 0, 0) -> True                 |  __eq__                                 |
|                      |                      |                                                    |                                         |
| Order                |  order               |  sorted([Color(0, 50, 0), Color()]) -> ...         |  __lt__, __le__, __gt__, __ge__         |
| Hashable             |  unsafe_hash/frozen  |  {Color(), Color()} -> {Color(r=0, g=0, b=0)}      |  __hash__                               |
| Immutable            |  frozen + eq         |  Color().r = 10 -> FrozenInstanceError             |  __setattr__, __delattr__               |
| Optimization         |  slots               |  sys.getsizeof(SlottedColor) -> 888                |  __slots__                              |
|                      |                      |                                                    |                                         |
| Unpacking+           |  -                   |  r, g, b = Color()                                 |  __iter__                               |
+----------------------+----------------------+----------------------------------------------------+-----------------------------------------+

* __ne__ is not needed and thus not implemented.

+ These methods are not automatically generated and require manual implementation in a dataclass.


Additional features

Post-initialization

@dataclasses.dataclass
class RGBA:
    r : int = 0
    g : int = 0
    b : int = 0
    a : float = 1.0

    def __post_init__(self):
        self.a = int(self.a * 255)          # convert the 0.0-1.0 alpha to 0-255


RGBA(127, 0, 255, 0.5)
# RGBA(r=127, g=0, b=255, a=127)

Inheritance

@dataclasses.dataclass
class RGBA(Color):
    a : int = 0
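A sketch of what the subclass ends up with. The inherited fields come first, followed by the new one, and since the base fields all have defaults, the added field needs a default too.

```python
import dataclasses


@dataclasses.dataclass
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


@dataclasses.dataclass
class RGBA(Color):
    a: int = 0


print([f.name for f in dataclasses.fields(RGBA)])
print(RGBA(1, 2, 3, 4))
print(isinstance(RGBA(), Color))
```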

Conversions

Convert a dataclass to a tuple or a dict, recursively:

>>> dataclasses.astuple(Color(128, 0, 255))
(128, 0, 255)
>>> dataclasses.asdict(Color(128, 0, 255))
{'r': 128, 'g': 0, 'b': 255}
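The "recursively" part shows up once a dataclass contains another dataclass. In this sketch, Pen is a hypothetical class introduced only to nest a Color.

```python
import dataclasses


@dataclasses.dataclass
class Color:
    r: int = 0
    g: int = 0
    b: int = 0


@dataclasses.dataclass
class Pen:
    # Hypothetical class that nests a Color to show the recursion.
    width: float
    color: Color


pen = Pen(1.5, Color(128, 0, 255))
print(dataclasses.asdict(pen))   # the nested Color becomes a nested dict
print(dataclasses.astuple(pen))  # and a nested tuple
```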

Limitations


References

  • R. Hettinger's talk on Dataclasses: The code generator to end all code generators
  • T. Hunner's talk on Easier Classes: Python Classes Without All the Cruft
  • Python's documentation on hashing details
  • Real Python's guide on The Ultimate Guide to Data Classes in Python 3.7
  • A. Shaw's blog post on A brief tour of Python 3.7 data classes
  • E. Smith's GitHub repository on dataclasses

北座城市 2025-02-11 21:37:56

From the PEP specification:

A class decorator is provided which inspects a class definition for
variables with type annotations as defined in PEP 526, "Syntax for
Variable Annotations". In this document, such variables are called
fields. Using these fields, the decorator adds generated method
definitions to the class to support instance initialization, a repr,
comparison methods, and optionally other methods as described in the
Specification section. Such a class is called a Data Class, but
there's really nothing special about the class: the decorator adds
generated methods to the class and returns the same class it was
given.

The @dataclass decorator adds methods to the class that you would otherwise have to define yourself, such as __init__, __repr__ and __eq__, plus __lt__, __gt__ and the other ordering methods when order=True is set.
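One way to see which methods were actually added is to look in the class's own namespace with vars(). This is a small sketch; Plain and Ordered are throwaway names.

```python
from dataclasses import dataclass


@dataclass
class Plain:
    x: int


@dataclass(order=True)
class Ordered:
    x: int


# vars() lists only what is defined on the class itself, not what is inherited.
for name in ('__init__', '__repr__', '__eq__', '__lt__', '__gt__'):
    print(name, name in vars(Plain), name in vars(Ordered))
```

The ordering methods appear only on the class decorated with order=True.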

稳稳的幸福 2025-02-11 21:37:56

Consider this simple class Foo:

from dataclasses import dataclass

@dataclass
class Foo:
    def bar(self):
        pass

Here is the dir() built-in comparison. On the left-hand side is the Foo without the @dataclass decorator, and on the right is with the @dataclass decorator.

[Image: dir() output of Foo without the @dataclass decorator (left) and with it (right)]
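Since the screenshot is not reproduced here, a comparison along the same lines can be run directly. This is a sketch rather than the exact procedure used for the image; PlainFoo and DataFoo are names chosen so both versions can live in one script. The exact set of extra names depends on the Python version, so no output is shown.

```python
from dataclasses import dataclass


class PlainFoo:
    def bar(self):
        pass


@dataclass
class DataFoo:
    def bar(self):
        pass


# Names that exist only on the decorated class.
added = sorted(set(dir(DataFoo)) - set(dir(PlainFoo)))
print(added)

# Methods that dir() lists for both, but that the decorator redefined.
redefined = [n for n in ('__init__', '__repr__', '__eq__')
             if n in vars(DataFoo) and n not in vars(PlainFoo)]
print(redefined)
```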

Here is another diff, after using the inspect module for comparison.

[Image: diff of the inspect module output for the two versions of Foo]
