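5 Pydantic, Type Hints, and Models

FastAPI relies heavily on Pydantic, which uses models (Python object classes) to define data structures. These models are used throughout FastAPI applications and are a real advantage when writing larger applications.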

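5.1 Type Hints

In many computer languages, a variable points directly to a value in memory. This requires the programmer to declare its type so that the size and bits of the value can be determined. In Python, variables are just names associated with objects, and it is the objects that have types.

A variable is usually associated with the same object throughout a program. If we attach a type hint to that variable, we can avoid some programming mistakes. So Python added type hints to the language, with supporting names in the standard typing module. The Python interpreter ignores the type hint syntax and runs the program as though the hints were not there. Then what is the point?

You might treat a variable as a string on one line, then forget and later assign it an object of a different type. Compilers for other languages would complain, but Python will not. The standard Python interpreter catches normal syntax errors and runtime exceptions, but not mixed types for a variable. Helper tools like mypy pay attention to type hints and warn about any mismatches.

The hints are also available to Python developers, who can write tools that do more than check for type errors. The following sections describe how the Pydantic package was developed to address needs that were not obvious. Later, you will see how its integration with FastAPI makes many web development problems much easier to handle.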
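
As a rough illustration of a tool that does more than static checking, the sketch below reads a class's hints at runtime with typing.get_type_hints and compares them with the actual attribute values. The Thing class and the check_hints helper are hypothetical names invented for this example, and the check handles only simple types such as str and int.

```python
from typing import get_type_hints


class Thing:
    # Hints only; the interpreter records them but never enforces them.
    name: str
    legs: int

    def __init__(self, name, legs):
        self.name = name
        self.legs = legs


def check_hints(obj) -> list[str]:
    """Return a message for each attribute whose value does not match its hint."""
    problems = []
    for attr, expected in get_type_hints(type(obj)).items():
        value = getattr(obj, attr)
        if not isinstance(value, expected):
            problems.append(
                f"{attr}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


good = Thing("yeti", 2)
bad = Thing("yeti", "two")      # runs without complaint

print(check_hints(good))        # []
print(check_hints(bad))         # ['legs: expected int, got str']
```

Pydantic does this kind of work far more thoroughly; the point here is only that the hints are ordinary data that a program can read.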

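A variable type hint may include only the type: name: type

Or it may also initialize the variable with a value: name: type = value

The type can be a standard Python simple type such as int or str, or a collection type such as tuple, list, or dict: name: type = value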

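Before Python 3.9, subscripting the collection types (for example, to name an element type) required importing capitalized versions of these standard type names from the typing module:

from typing import List
things: List[str] = ["yeti"]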

Here are some examples with initializations:

physics_magic_number: float = 1.0/137.03599913
hp_lovecraft_noun: str = "ichor"
exploding_sheep: tuple = "sis", "boom", "bah!"
responses: dict = {"marco": "polo", "answer": 42}

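You can also specify subtypes: name: dict[keytype, valtype] = {key1: val1, key2: val2}

The typing module provides useful extras for subtypes. The most common are:

Any: any type
Union: any one of the specified types, such as Union[str, int]

In Python 3.10 and later, you can write type1 | type2 instead of Union[type1, type2].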

Examples of type hints for a Python dict include the following:

from typing import Any
responses: dict[str, Any] = {"marco": "polo", "answer": 42}

Or, a little more specifically:

from typing import Union
responses: dict[str, Union[str, int]] = {"marco": "polo", "answer": 42}

Or (Python 3.10 and later):

responses: dict[str, str | int] = {"marco": "polo", "answer": 42}
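
A small sketch of how a union hint reads in practice. The to_label function is a hypothetical helper for this example. It uses typing.Union so that it also runs on Python 3.9, plus Optional[str], which is shorthand for Union[str, None].

```python
from typing import Optional, Union


def to_label(value: Union[str, int], prefix: Optional[str] = None) -> str:
    """Turn a string or integer into a display label."""
    # The hints document the accepted types; isinstance handles them at runtime.
    text = value if isinstance(value, str) else str(value)
    return text if prefix is None else f"{prefix}:{text}"


responses: dict[str, Union[str, int]] = {"marco": "polo", "answer": 42}
labels = [to_label(v, prefix=k) for k, v in responses.items()]
print(labels)  # ['marco:polo', 'answer:42']
```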

Note that a variable line with a type hint is legal Python, but a bare variable line is not:

$ python
...
>>> thing0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'thing0' is not defined
>>> thing0: str

Also, the Python interpreter does not catch incorrect type use:

$ python
...
>>> thing1: str = "yeti"
>>> thing1 = 47

But mypy will catch it. If you do not have it already, run pip install mypy. Save the preceding two lines to a file called stuff.py, and then try this:

$ mypy stuff.py
stuff.py:2: error: Incompatible types in assignment
(expression has type "int", variable has type "str")
Found 1 error in 1 file (checked 1 source file)

A function return type hint uses an arrow instead of a colon: function(args) -> type:

Here is an example of a function with a return type hint:

def get_thing() -> str:
    return "yeti"
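
Parameter and return hints are stored on the function object, which is where tools can find them. A minimal sketch: get_thing_named is a hypothetical variant of the function above that also takes hinted parameters.

```python
def get_thing_named(name: str, shout: bool = False) -> str:
    """Return the name, optionally upper-cased."""
    return name.upper() if shout else name


# The hints are kept in __annotations__; the return hint uses the key "return".
print(get_thing_named.__annotations__)
# {'name': <class 'str'>, 'shout': <class 'bool'>, 'return': <class 'str'>}

# Nothing is enforced at call time: this returns an int despite the -> str hint.
print(get_thing_named(47))  # 47
```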

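5.2 Data Grouping

Often we need to keep a related group of variables together rather than pass around many individual variables. How do we combine multiple variables into a group and keep the type hints? In the rest of this book, we will use examples of cryptids (imaginary creatures) and the explorers (also imaginary) who seek them. Our initial cryptid definition will include only the following string variables:

name: key
country: two-character ISO country code (3166-1 alpha 2) or * = all
area: optional; United States state or other country subdivision
description: free-form
aka: also known as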

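And explorers will have the following:

name: key
country: two-character ISO country code (3166-1 alpha 2) or * = all
area: optional; United States state or other country subdivision
description: free-form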

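Python's data grouping structures (beyond the basic int, string, and so on) are listed here:

Tuple: an immutable sequence of objects
List: a mutable sequence of objects
Set: a mutable collection of distinct objects
Dictionary: mutable key-value object pairs (the key must be of an immutable type)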

# Using a tuple
>>> tuple_thing = ("yeti", "cn", "himalayas",
...     "hirsute himalayan", "abominable snowman")
>>> print("name is", tuple_thing[0])
name is yeti

# Using a list
>>> list_thing = ["yeti", "cn", "himalayas",
...     "hirsute himalayan", "abominable snowman"]
>>> print("name is", list_thing[0])
name is yeti

# Using a tuple and named offsets
>>> name = 0
>>> country = 1
>>> area = 2
>>> description = 3
>>> aka = 4
>>> tuple_thing = ("yeti", "cn", "himalayas",
...     "hirsute himalayan", "abominable snowman")
>>> print("name is", tuple_thing[name])
name is yeti

# Using a dictionary
>>> dict_thing = {"name": "yeti",
...     "country": "cn",
...     "area": "himalayas",
...     "description": "hirsute himalayan",
...     "aka": "abominable snowman"}
>>> print("name is", dict_thing["name"])
name is yeti

# Using a named tuple
>>> from collections import namedtuple
>>> creaturenamedtuple = namedtuple("creaturenamedtuple",
...     "name, country, area, description, aka")
>>> namedtuple_thing = creaturenamedtuple("yeti",
...     "cn",
...     "himalaya",
...     "hirsute himalayan",
...     "abominable snowman")
>>> print("name is", namedtuple_thing[0])
name is yeti
>>> print("name is", namedtuple_thing.name)
name is yeti

# Using a standard class
>>> class creatureclass():
...     def __init__(self,
...       name: str,
...       country: str,
...       area: str,
...       description: str,
...       aka: str):
...         self.name = name
...         self.country = country
...         self.area = area
...         self.description = description
...         self.aka = aka
...
>>> class_thing = creatureclass(
...     "yeti",
...     "cn",
...     "himalayas",
...     "hirsute himalayan",
...     "abominable snowman")
>>> print("name is", class_thing.name)
name is yeti

# Using a dataclass
>>> from dataclasses import dataclass
>>>
>>> @dataclass
... class creaturedataclass():
...     name: str
...     country: str
...     area: str
...     description: str
...     aka: str
...
>>> dataclass_thing = creaturedataclass(
...     "yeti",
...     "cn",
...     "himalayas",
...     "hirsute himalayan",
...     "abominable snowman")
>>> print("name is", dataclass_thing.name)
name is yeti

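That is pretty good for keeping related variables together. But we want more:

A union of possible alternative types
Missing/optional values
Default values
Data validation
Serialization to and from formats like JSON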
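
To make the gap concrete, here is a sketch of how far a standard dataclass gets on this wish list. CreatureRecord is a hypothetical name used only for this example. Optional values and defaults work, and serialization is possible with a manual step, but nothing validates the data.

```python
from dataclasses import asdict, dataclass
from typing import Optional
import json


@dataclass
class CreatureRecord:
    name: str
    country: str
    description: str
    area: Optional[str] = None   # optional value
    aka: str = ""                # default value


yeti = CreatureRecord("yeti", "cn", "hirsute himalayan")
print(yeti.area, repr(yeti.aka))          # None ''

# Serialization needs a manual step through asdict().
print(json.dumps(asdict(yeti)))

# No validation: a list is accepted where the hint says str.
oops = CreatureRecord(name=["not", "a", "string"], country="*", description="?")
print(type(oops.name).__name__)           # list
```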

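5.3 Alternatives

It is tempting to use Python's built-in data structures, especially dictionaries. But you will inevitably find that dictionaries are a bit too "loose." Freedom comes at a price. You need to check everything:

Is the key optional?
If the key is missing, is there a default value?
Does the key exist?
If so, is the key's value of the right type?
If so, is the value in the right range, or does it match a pattern?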
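
These checks can be written by hand, which shows how quickly they pile up. This is a sketch only: check_creature, the REQUIRED and DEFAULTS tables, and the two-character country rule are assumptions made for illustration.

```python
REQUIRED = {"name": str, "country": str, "description": str}
DEFAULTS = {"area": "*", "aka": ""}


def check_creature(data: dict) -> dict:
    """Validate a creature dict by hand and return a copy with defaults filled in."""
    result = dict(data)
    for key, expected in REQUIRED.items():
        if key not in result:                       # does the key exist?
            raise KeyError(f"missing required key: {key}")
        if not isinstance(result[key], expected):   # is the value the right type?
            raise TypeError(f"{key} must be {expected.__name__}")
    for key, default in DEFAULTS.items():           # optional keys get a default
        result.setdefault(key, default)
    country = result["country"]
    if country != "*" and len(country) != 2:        # does the value match a pattern?
        raise ValueError("country must be a two-character code or *")
    return result


print(check_creature({"name": "yeti", "country": "cn", "description": "hirsute"}))
```

Every new field or rule means another branch in this function, which is the bookkeeping the libraries discussed next take over.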

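At least three solutions address at least some of these requirements: dataclasses, attrs (a superset of dataclasses), and Pydantic (integrated into FastAPI).

Pydantic stands out when it comes to validation, and its integration with FastAPI catches many potential data errors. Pydantic relies on inheritance (from the BaseModel class), whereas the other two use Python decorators to define their objects. Another big plus for Pydantic is that it uses standard Python type hint syntax, whereas older libraries predated type hints and rolled their own.

So I am going with Pydantic in this book, but you may find uses for either of the other two libraries if you are not using FastAPI.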

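Pydantic provides ways to specify any combination of these checks:

Required versus optional
Default value if unspecified but required
The data type or types expected
Value range restrictions
Other function-based checks, if needed
Serialization and deserialization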
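
The sketch below touches each item in this list once. It assumes Pydantic 2.x (field_validator, model_dump_json, and model_validate_json are the 2.x names), and the Sighting model with its fields is a hypothetical example, not part of the creature code in this chapter.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError, field_validator


class Sighting(BaseModel):
    creature: str                                       # required
    witness: Optional[str] = None                       # optional
    country: str = "*"                                  # default when unspecified
    height_m: float = Field(default=2.0, gt=0, le=10)   # value range

    @field_validator("country")
    @classmethod
    def country_code(cls, value: str) -> str:
        # Function-based check: "*" or a two-character code.
        if value != "*" and len(value) != 2:
            raise ValueError("country must be * or a two-character code")
        return value


sighting = Sighting(creature="yeti", height_m=2.5)
as_json = sighting.model_dump_json()                    # serialization
print(as_json)
print(Sighting.model_validate_json(as_json) == sighting)  # round trip

try:
    Sighting(creature="yeti", country="nepal", height_m=-1)
except ValidationError as err:
    print(len(err.errors()), "validation errors")
```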

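5.4 A Simple Example

This initial example uses three files:

model.py defines a Pydantic model.
data.py is a fake data source that defines instances of the model.
web.py defines a FastAPI web endpoint that returns the fake data.

Define the creature model in model.py: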

from pydantic import BaseModel


class creature(BaseModel):
    name: str
    country: str
    area: str
    description: str
    aka: str

thing = creature(
    name="yeti",
    country="cn",
    area="himalayas",
    description="hirsute himalayan",
    aka="abominable snowman")

print("name is", thing.name)

The creature class inherits from Pydantic's BaseModel. The : str part after name, country, area, description, and aka is a type hint indicating that each one is a Python string.

>>> thing = creature(
...     name="yeti",
...     country="cn",
...     area="himalayas",
...     description="hirsute himalayan",
...     aka="abominable snowman")
>>> print("name is", thing.name)
name is yeti

Define the fake data in data.py:

from model import creature

_creatures: list[creature] = [
    creature(name="yeti",
             country="cn",
             area="himalayas",
             description="hirsute himalayan",
             aka="abominable snowman"
             ),
    creature(name="sasquatch",
             country="us",
             area="*",
             description="yeti's cousin eddie",
             aka="bigfoot")
]

def get_creatures() -> list[creature]:
    return _creatures

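This code imports the model.py that we just wrote. It does a little data hiding by naming its list of creature objects _creatures and providing the get_creatures() function to return them.

Define the web endpoint in web.py: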

from model import creature
from fastapi import FastAPI

app = FastAPI()

@app.get("/creature")
def get_all() -> list[creature]:
    from data import get_creatures
    return get_creatures()

if __name__ == "__main__":
    import uvicorn
    # "web:app" names the app object in web.py; reload restarts on code changes.
    uvicorn.run("web:app", reload=True)

Now start the server:

$ python web.py
INFO:     Will watch for changes in these directories: ['d:\\code\\fastapi-main\\example']
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [19124] using WatchFiles
INFO:     Started server process [22344]
INFO:     Waiting for application startup.
INFO:     Application startup complete.

Verify it:

$ http http://localhost:8000/creature
HTTP/1.1 200 OK
content-length: 211
content-type: application/json
date: Sat, 08 Jun 2024 02:20:40 GMT
server: uvicorn

[
    {
        "aka": "abominable snowman",
        "area": "himalayas",
        "country": "cn",
        "description": "hirsute himalayan",
        "name": "yeti"
    },
    {
        "aka": "bigfoot",
        "area": "*",
        "country": "us",
        "description": "yeti's cousin eddie",
        "name": "sasquatch"
    }
]

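5.5 Validating Types

Try assigning a value of the wrong type to one or more of the creature fields. Let's use a standalone test for this (Pydantic does not depend on any web code; this is a data matter).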

from model import creature

dragon = creature(
    name="dragon",
    description=["incorrect", "string", "list"],
    country="*"
    )

Run the test. The first output line comes from the print() call in model.py, which runs on import:

$ python 5-14.py
name is yeti
Traceback (most recent call last):
  File "d:\code\fastapi-main\example\5-14.py", line 3, in <module>
    dragon = creature(
  File "c:\users\xuron\appdata\roaming\python\python310\site-packages\pydantic\main.py", line 176, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 3 validation errors for creature
area
  Field required [type=missing, input_value={'name': 'dragon', 'descr...'list'], 'country': '*'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing
description
  Input should be a valid string [type=string_type, input_value=['incorrect', 'string', 'list'], input_type=list]
    For further information visit https://errors.pydantic.dev/2.7/v/string_type
aka
  Field required [type=missing, input_value={'name': 'dragon', 'descr...'list'], 'country': '*'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing
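
In application code you would normally catch the error rather than let the traceback end the program. A minimal sketch, assuming Pydantic 2.x: the creature model is repeated here so the block runs on its own, and the summary dictionary is just one way to present the result.

```python
from pydantic import BaseModel, ValidationError


class creature(BaseModel):
    name: str
    country: str
    area: str
    description: str
    aka: str


try:
    dragon = creature(
        name="dragon",
        description=["incorrect", "string", "list"],
        country="*",
    )
except ValidationError as err:
    # errors() returns one dict per failed field, with its location and error type.
    problems = {e["loc"][0]: e["type"] for e in err.errors()}
    print(problems)
    # {'area': 'missing', 'description': 'string_type', 'aka': 'missing'}
```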

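5.6 Validating Values

Even if a value's type matches its specification in the creature class, more checks may need to pass. Some restrictions can be placed on the value itself:

Integer (conint) or float:
- gt: greater than
- lt: less than
- ge: greater than or equal to
- le: less than or equal to
- multiple_of: an integer multiple of a value
String (constr):
- min_length: minimum length in characters (not bytes)
- max_length: maximum length in characters
- to_upper: convert to uppercase
- to_lower: convert to lowercase
- regex: match a Python regular expression (named pattern in Pydantic 2)
Tuple, list, or set:
- min_items: minimum number of elements (min_length in Pydantic 2)
- max_items: maximum number of elements (max_length in Pydantic 2)

These are specified in the type parts of the model.

Example: ensure that the name field is always at least two characters long. Otherwise, "" (an empty string) is a valid string.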

from pydantic import BaseModel, constr

class creature(BaseModel):
    name: constr(min_length=2)
    country: str
    area: str
    description: str
    aka: str

bad_creature = creature(name="!", description="it's a raccoon", area="your attic")

Run it:

Traceback (most recent call last):

  File d:\programdata\anaconda3\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File d:\code\test5.py:10
    bad_creature = creature(name="!", description="it's a raccoon", area="your attic")

  File ~\appdata\roaming\python\python310\site-packages\pydantic\main.py:176 in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)

ValidationError: 3 validation errors for creature
name
  String should have at least 2 characters [type=string_too_short, input_value='!', input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/string_too_short
country
  Field required [type=missing, input_value={'name': '!', 'descriptio...", 'area': 'your attic'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing
aka
  Field required [type=missing, input_value={'name': '!', 'descriptio...", 'area': 'your attic'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing

The following uses an alternative, the Pydantic Field specification:

from pydantic import BaseModel, Field

class creature(BaseModel):
    name: str = Field(..., min_length=2)
    country: str
    area: str
    description: str
    aka: str

bad_creature = creature(name="!", description="it's a raccoon", area="your attic")

Run it:

Traceback (most recent call last):

  File d:\programdata\anaconda3\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File d:\code\test5.py:10
    bad_creature = creature(name="!", description="it's a raccoon", area="your attic")

  File ~\appdata\roaming\python\python310\site-packages\pydantic\main.py:176 in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)

ValidationError: 3 validation errors for creature
name
  String should have at least 2 characters [type=string_too_short, input_value='!', input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/string_too_short
country
  Field required [type=missing, input_value={'name': '!', 'descriptio...", 'area': 'your attic'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing
aka
  Field required [type=missing, input_value={'name': '!', 'descriptio...", 'area': 'your attic'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.7/v/missing

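The ... argument to Field() means that a value is required and that there is no default value.

This is only a brief introduction to Pydantic. The main takeaway is that it lets you automate the validation of your data. You will see how useful this is when getting data from either the web layer or the data layer.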