attrs and dataclasses

mrchi, filed under Python

2019-03-16

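Overview

In Python programs, some classes are business models whose job is to hold data, for example a product. We call these "data classes".

attrs is a project designed and implemented by Python core developer Hynek Schlawack to take the tedium out of defining and using data classes.

dataclasses is a standard-library module, new in Python 3.7, that is similar to attrs.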

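Pain points of data classes

When defining a data class, you usually end up implementing most of the following methods:

__init__, which initializes a large number of attributes;
__repr__, which picks a few attributes to represent the instance;
__eq__, __lt__, and other comparison methods (the remaining ones can be filled in with the functools.total_ordering decorator);
__hash__, used to deduplicate objects;
a to_dict or to_json method that conveniently packs up the instance's attributes.

Almost every data class needs these methods, and all they do is wire certain attributes into certain methods, which is tedious. We would like a convenient way to declare the attributes and get these methods for free.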
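
To make the repetition concrete, here is a minimal hand-written sketch that uses only the standard library. The class and its three fields are a cut-down, hypothetical version of a product model, not code from a real project; the `_key` helper is likewise just an illustration.

```python
from functools import total_ordering


@total_ordering
class Product:
    """Hand-written data class; every method repeats the same field list."""

    def __init__(self, id, author_id, title=""):
        self.id = id
        self.author_id = author_id
        self.title = title

    def _key(self):
        # Hypothetical helper: only these fields identify a product.
        return (self.id, self.author_id)

    def __repr__(self):
        return f"Product(id={self.id!r}, author_id={self.author_id!r})"

    def __eq__(self, other):
        if not isinstance(other, Product):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        # total_ordering derives __le__, __gt__ and __ge__ from this.
        if not isinstance(other, Product):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        return hash(self._key())

    def to_dict(self):
        return {"id": self.id, "author_id": self.author_id, "title": self.title}


a = Product(1, 10, title="first")
b = Product(1, 10, title="second")
c = Product(2, 10)

print(a == b)          # True: title is ignored by the comparison
print(a < c)           # True
print(len({a, b, c}))  # 2: a and b collapse into one set entry
```

Every new field has to be threaded through several of these methods by hand, which is exactly the chore the two libraries below remove.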

attrs

Installation

pipenv install attrs

Basic usage

import attr

@attr.s(hash=True)
class Product(object):
    id = attr.ib()
    author_id = attr.ib()
    brand_id = attr.ib()
    spu_id = attr.ib()
    title = attr.ib(repr=False, cmp=False, hash=False)
    item_id = attr.ib(repr=False, cmp=False, hash=False)
    n_comments = attr.ib(repr=False, cmp=False, hash=False)
    creation_time = attr.ib(repr=False, cmp=False, hash=False)
    update_time = attr.ib(repr=False, cmp=False, hash=False)
    source = attr.ib(default='', repr=False, cmp=False, hash=False)
    parent_id = attr.ib(default=0, repr=False, cmp=False, hash=False)
    ancestor_id = attr.ib(default=0, repr=False, cmp=False, hash=False)

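The attr.s decorator is the class-level setting and determines how all attributes behave. By default, attributes take part in repr, cmp, and init, but not in hash.

attr.ib is the per-attribute setting and takes precedence over attr.s. By default, an attribute takes part in repr, cmp, and init and has no default value. Its hash option defaults to None, which mirrors the attribute's comparison setting, so it only matters once hashing is enabled on the class.

In the example above, every attribute is accepted by __init__, only the first four take part in repr, cmp, and hash, and only the last three have default values.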
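
The effect of these switches is easier to see on a smaller class. The sketch below assumes attrs is installed; `Item` is a hypothetical two-field stand-in for Product, with one identifying attribute and one attribute that is excluded from repr, cmp, and hash.

```python
import attr


@attr.s(hash=True)
class Item(object):
    # Takes part in init, repr, cmp and (because hash=True on the class) hash.
    id = attr.ib()
    # Only takes part in init.
    title = attr.ib(default="", repr=False, cmp=False, hash=False)


a = Item(1, title="first")
b = Item(1, title="second")
c = Item(2)

print(a)            # title is left out of the repr
print(a == b)       # True: title is ignored when comparing
print(a < c)        # True: ordering methods are generated as well
print(len({a, b}))  # 1: equal hashes, so the set keeps one entry
```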

Field validation

attrs can also validate field values and types.

Adding a validator with a decorator

import attr

@attr.s
class C(object):
    x = attr.ib()

    @x.validator
    def check(self, attribute, value):
        if value > 42:
            raise ValueError("x must be smaller or equal to 42")
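
A short usage sketch for the class above, assuming attrs is installed. The class is repeated so the snippet runs on its own; the `ok` and `error` names are only for illustration.

```python
import attr


@attr.s
class C(object):
    x = attr.ib()

    @x.validator
    def check(self, attribute, value):
        if value > 42:
            raise ValueError("x must be smaller or equal to 42")


ok = C(42)  # 42 passes the check

try:
    C(43)   # 43 is rejected while the instance is being created
except ValueError as exc:
    error = str(exc)
else:
    error = None

print(error)  # x must be smaller or equal to 42
```

With attr.s the validator runs when the instance is created; attr.validate(instance) can be called to re-run the checks on an existing instance.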

Adding validators through the validator argument, which also accepts a list of validators

def x_smaller_than_y(instance, attribute, value):
    if value >= instance.y:
        raise ValueError("'x' has to be smaller than 'y'!")

@attr.s
class C(object):
    x = attr.ib(validator=[attr.validators.instance_of(int), x_smaller_than_y])
    y = attr.ib()

The code above first checks that x is an int, and then checks that x < y.
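
The order of the two checks can be observed from the exception type. This is a sketch assuming attrs is installed; `try_build` is a hypothetical helper added only to report which exception, if any, the constructor raised.

```python
import attr


def x_smaller_than_y(instance, attribute, value):
    if value >= instance.y:
        raise ValueError("'x' has to be smaller than 'y'!")


@attr.s
class C(object):
    x = attr.ib(validator=[attr.validators.instance_of(int), x_smaller_than_y])
    y = attr.ib()


def try_build(x, y):
    """Return the name of the exception raised by C(x, y), or None."""
    try:
        C(x, y)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


print(try_build(1, 2))    # None: both validators pass
print(try_build("1", 2))  # TypeError: instance_of(int) fails first
print(try_build(3, 2))    # ValueError: x is an int but not smaller than y
```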

Attribute conversion

attrs can automatically convert the values passed in:

import attr

@attr.s
class C(object):
    x = attr.ib(converter=int)

The value passed to C for x is automatically converted to int.
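
A quick check of the conversion, assuming attrs is installed. The converter is simply called on the incoming value, so anything that int() accepts works and anything it rejects raises the usual exception.

```python
import attr


@attr.s
class C(object):
    x = attr.ib(converter=int)


from_str = C("1")
from_float = C(2.9)

print(from_str.x)    # 1
print(from_float.x)  # 2: int() truncates the float
```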

Metadata

Attributes can carry metadata.

>>> @attr.s
... class C(object):
...    x = attr.ib(metadata={'my_metadata': 1})
>>> attr.fields(C).x.metadata
mappingproxy({'my_metadata': 1})
>>> attr.fields(C).x.metadata['my_metadata']
1

dataclasses

Python 3.7 added the new dataclasses module, based on PEP 557.

On Python 3.6, a backport can be installed from PyPI:

pipenv install dataclasses

An example:

from dataclasses import dataclass, field
from datetime import datetime

# dataclass() has no `hash` parameter; unsafe_hash=True is what generates __hash__.
@dataclass(unsafe_hash=True, order=True)
class Product(object):
    # Identifying fields: take part in init, repr, comparison and hash.
    id: int
    author_id: int
    brand_id: int
    spu_id: int
    # Payload fields: only take part in init.
    title: str = field(hash=False, repr=False, compare=False)
    item_id: int = field(hash=False, repr=False, compare=False)
    n_comments: int = field(hash=False, repr=False, compare=False)
    creation_time: datetime = field(default=None, repr=False, compare=False, hash=False)
    update_time: datetime = field(default=None, repr=False, compare=False, hash=False)
    source: str = field(default='', repr=False, compare=False, hash=False)
    parent_id: int = field(default=0, repr=False, compare=False, hash=False)
    ancestor_id: int = field(default=0, repr=False, compare=False, hash=False)

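As with attrs, the dataclass decorator is the class-level setting. By default, fields take part in init, repr, and eq, but not in order or unsafe_hash. The difference between eq and order is that eq only generates __eq__, while order also generates the ordering methods __lt__, __le__, __gt__, and __ge__.

By default, a field takes part in init, repr, and compare; compare covers both eq and order. A field's hash option defaults to None, which follows compare.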
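
The same behaviour on a smaller class, using only the standard library (Python 3.7+). `Item` is a hypothetical two-field stand-in for Product; dataclasses.asdict is shown as one way to cover the to_dict use case mentioned earlier.

```python
from dataclasses import asdict, dataclass, field


@dataclass(unsafe_hash=True, order=True)
class Item:
    # Takes part in init, repr, comparison and hash.
    id: int
    # Only takes part in init.
    title: str = field(default="", repr=False, compare=False, hash=False)


a = Item(1, title="first")
b = Item(1, title="second")
c = Item(2)

print(a == b)       # True: title is ignored when comparing
print(a < c)        # True: order=True generated __lt__
print(len({a, b}))  # 1: equal hashes, so the set keeps one entry
print(asdict(a))    # {'id': 1, 'title': 'first'}
```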

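dataclasses uses Python's type annotations to declare fields. The annotations are not checked at runtime, so they do not validate field types by themselves; that takes a static type checker or an explicit check, for example in __post_init__.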
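
The sketch below illustrates the point with the standard library only. `Loose` and `Point` are hypothetical classes, and the isinstance loop in __post_init__ is one simple way to add a runtime check, not something dataclasses provides; it assumes every annotation is a plain class such as int or str, not a typing generic or a string annotation.

```python
from dataclasses import dataclass, fields


@dataclass
class Loose:
    x: int


# Accepted: the annotation is not enforced when the instance is created.
loose = Loose("not an int")


@dataclass
class Point:
    x: int
    y: int

    def __post_init__(self):
        # Hypothetical check: compare each value with its annotated class.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, "
                    f"got {type(value).__name__}"
                )


point = Point(1, 2)

try:
    Point("1", 2)
except TypeError as exc:
    error = str(exc)
else:
    error = None

print(error)  # x must be int, got str
```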
