06. Unions

Unions are validated differently from other types: the input is valid as long as it validates against one member of the union.

There are three validation approaches to choose from:

  1. left to right mode - each member of the union is tried in order and the first match is returned
  2. smart mode - similar to "left to right mode", members are tried in order; however, validation proceeds past the first match to look for a better one. This is the default mode for most union validation
  3. discriminated unions - only one member of the union is tried, based on a discriminator

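Validation Modes

Left to Right Mode

Because this mode often leads to unexpected validation results, it is not the default in Pydantic >=2; union_mode='smart' is the default instead.

Validation is attempted against each member in the order given, stopping at the first member that validates successfully.

If validation fails on all members, the validation error includes the errors from every member.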

union_mode='left_to_right' must be set as a Field parameter on union fields where you want to use it.

from typing import Union

from pydantic import BaseModel, Field, ValidationError


class User(BaseModel):
    id: Union[str, int] = Field(union_mode='left_to_right')


print(User(id=123))
#> id=123
print(User(id='hello'))
#> id='hello'

try:
    User(id=[])
except ValidationError as e:
    print(e)
    """
    2 validation errors for User
    id.str
      Input should be a valid string [type=string_type, input_value=[], input_type=list]
    id.int
      Input should be a valid integer [type=int_type, input_value=[], input_type=list]
    """

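The order of the union members is critical in this mode! With int listed first, the string '456' is coerced to the integer 456: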

from typing import Union

from pydantic import BaseModel, Field


class User(BaseModel):
    id: Union[int, str] = Field(union_mode='left_to_right')


print(User(id=123))
#> id=123
print(User(id='456'))
#> id=456

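Smart Mode

In this mode, pydantic scores a match of a union member into one of the following three groups (from highest score to lowest):

  • An exact type match, for example an int input to a float | int union validation is an exact type match for the int member
  • Validation would have succeeded in strict mode
  • Validation would have succeeded in lax mode

The member whose match scores highest is the one used for validation.

The steps for selecting the best match are as follows:

  1. Union members are attempted left to right, and every successful match falls into one of the three groups above.
  2. If validation succeeds with an exact type match, that member is returned immediately and the following members are not attempted.
  3. If at least one member validated as a "strict" match, the leftmost of those "strict" matches is returned.
  4. If at least one member validated in "lax" mode, the leftmost match is returned.
  5. If validation fails on all members, all the errors are returned.

from typing import Union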
from uuid import UUID

from pydantic import BaseModel


class User(BaseModel):
    id: Union[int, str, UUID]
    name: str


user_01 = User(id=123, name='John Doe')
print(user_01)
#> id=123 name='John Doe'
print(user_01.id)
#> 123

user_02 = User(id='1234', name='John Doe')
print(user_02)
#> id='1234' name='John Doe'
print(user_02.id)
#> 1234

user_03_uuid = UUID('cf57432e-809e-4353-adbd-9d5c0d733868')
user_03 = User(id=user_03_uuid, name='John Doe')
print(user_03)
#> id=UUID('cf57432e-809e-4353-adbd-9d5c0d733868') name='John Doe'
print(user_03.id)
#> cf57432e-809e-4353-adbd-9d5c0d733868
print(user_03_uuid.int)
#> 275603287559914445491632874575877060712
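
To make the difference between the two modes concrete, here is a minimal sketch that feeds the same inputs to a smart-mode field and a left-to-right field. The model names SmartUser and OrderedUser are made up for this illustration, and the comments describe the behavior expected from the steps listed above, assuming pydantic v2.

```python
from typing import Union

from pydantic import BaseModel, Field


class SmartUser(BaseModel):
    # Smart mode is the default, so no extra configuration is needed.
    id: Union[int, str]


class OrderedUser(BaseModel):
    # Same union, but members are tried in declaration order.
    id: Union[int, str] = Field(union_mode='left_to_right')


# A str input is an exact type match for the str member, so it is kept as-is.
smart_id = SmartUser(id='456').id
print(repr(smart_id))
#> '456'

# Left to right mode stops at int, which accepts '456' in lax mode.
ordered_id = OrderedUser(id='456').id
print(repr(ordered_id))
#> 456

# A float is neither an exact nor a strict match for int or str,
# so smart mode falls back to the leftmost lax match (int).
lax_id = SmartUser(id=7.0).id
print(repr(lax_id))
#> 7
```

If the stored type matters downstream (for example when ids are later compared as strings), smart mode is usually the less surprising choice.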

Discriminated Unions

Discriminated unions are sometimes referred to as "Tagged Unions".

We can use discriminated unions to more efficiently validate Union types, by choosing which member of the union to validate against.

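This makes validation more efficient and also avoids a proliferation of errors when validation fails.

Using a str discriminator

Frequently, in a Union of multiple models, all members share a common field that can be used to decide which member the data should be validated against; in OpenAPI this is called the "discriminator".

The discriminator field must be a Literal in every member model.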

from typing import Literal, Union

from pydantic import BaseModel, Field, ValidationError


class Cat(BaseModel):
    pet_type: Literal['cat']
    meows: int


class Dog(BaseModel):
    pet_type: Literal['dog']
    barks: float


class Lizard(BaseModel):
    pet_type: Literal['reptile', 'lizard']
    scales: bool


class Model(BaseModel):
    pet: Union[Cat, Dog, Lizard] = Field(..., discriminator='pet_type')
    n: int


print(Model(pet={'pet_type': 'dog', 'barks': 3.14}, n=1))
#> pet=Dog(pet_type='dog', barks=3.14) n=1
try:
    Model(pet={'pet_type': 'dog'}, n=1)
except ValidationError as e:
    print(e)
    """
    1 validation error for Model
    pet.dog.barks
      Field required [type=missing, input_value={'pet_type': 'dog'}, input_type=dict]
    """

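As a further sketch, the example below shows what happens when the discriminator value matches no member at all. Because only the tagged member is ever considered, a single union_tag_invalid error is expected rather than one error per member. The models mirror the ones above, with Lizard left out for brevity.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field, ValidationError


class Cat(BaseModel):
    pet_type: Literal['cat']
    meows: int


class Dog(BaseModel):
    pet_type: Literal['dog']
    barks: float


class Model(BaseModel):
    pet: Union[Cat, Dog] = Field(discriminator='pet_type')


errors = []
try:
    # 'fish' is not a tag of any member, so no member is validated.
    Model(pet={'pet_type': 'fish', 'meows': 1})
except ValidationError as e:
    errors = e.errors()

print(len(errors))
#> 1
print(errors[0]['type'])
#> union_tag_invalid
print(errors[0]['loc'])
#> ('pet',)
```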
Using a callable Discriminator

Sometimes there is no single uniform field that can serve as a discriminator. In that case, a callable Discriminator can be used.

from typing import Any, Literal, Union
from typing_extensions import Annotated
from pydantic import BaseModel, Discriminator, Tag

class Pie(BaseModel):
    time_to_cook: int
    num_ingredients: int

class ApplePie(Pie):
    fruit: Literal['apple'] = 'apple'

class PumpkinPie(Pie):
    filling: Literal['pumpkin'] = 'pumpkin'


def get_discriminator_value(v: Any) -> str:
    if isinstance(v, dict):
        return v.get('fruit', v.get('filling'))
    return getattr(v, 'fruit', getattr(v, 'filling', None))

class ThanksgivingDinner(BaseModel):
    dessert: Annotated[
        Union[
            Annotated[ApplePie, Tag('apple')],
            Annotated[PumpkinPie, Tag('pumpkin')],
        ],
        Discriminator(get_discriminator_value),
    ]


apple_variation = ThanksgivingDinner.model_validate(
    {'dessert': {'fruit': 'apple', 'time_to_cook': 60, 'num_ingredients': 8}}
)
print(repr(apple_variation))
"""
ThanksgivingDinner(dessert=ApplePie(time_to_cook=60, num_ingredients=8, fruit='apple'))
"""

pumpkin_variation = ThanksgivingDinner.model_validate(
    {
        'dessert': {
            'filling': 'pumpkin',
            'time_to_cook': 40,
            'num_ingredients': 6,
        }
    }
)
print(repr(pumpkin_variation))
"""
ThanksgivingDinner(dessert=PumpkinPie(time_to_cook=40, num_ingredients=6, filling='pumpkin'))
"""

A callable Discriminator also handles unions that mix primitive types and models:

from typing import Any, Union
from typing_extensions import Annotated
from pydantic import BaseModel, Discriminator, Tag, ValidationError

def model_x_discriminator(v: Any) -> str:
    if isinstance(v, int):
        return 'int'
    if isinstance(v, (dict, BaseModel)):
        return 'model'
    else:
        # return None if the discriminator value isn't found
        return None

class SpecialValue(BaseModel):
    value: int

class DiscriminatedModel(BaseModel):
    value: Annotated[
        Union[
            Annotated[int, Tag('int')],
            Annotated['SpecialValue', Tag('model')],
        ],
        Discriminator(model_x_discriminator),
    ]


model_data = {'value': {'value': 1}}
m = DiscriminatedModel.model_validate(model_data)
print(m)
#> value=SpecialValue(value=1)

int_data = {'value': 123}
m = DiscriminatedModel.model_validate(int_data)
print(m)
#> value=123

try:
    DiscriminatedModel.model_validate({'value': 'not an int or a model'})
except ValidationError as e:
    print(e)
    """
    1 validation error for DiscriminatedModel
    value
      Unable to extract tag using discriminator model_x_discriminator() [type=union_tag_not_found, input_value='not an int or a model', input_type=str]
    """

There are a few ways to set a discriminator for a field, all varying slightly in syntax.

For str discriminators:

some_field: Union[...] = Field(discriminator='my_discriminator')
some_field: Annotated[Union[...], Field(discriminator='my_discriminator')]

For callable Discriminators:

some_field: Union[...] = Field(discriminator=Discriminator(...))
some_field: Annotated[Union[...], Discriminator(...)]
some_field: Annotated[Union[...], Field(discriminator=Discriminator(...))]
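
As a quick check that the two str spellings are interchangeable, this sketch declares the same discriminated union both ways and validates identical input against each. The model names DefaultForm and AnnotatedForm are hypothetical and exist only for this illustration.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field
from typing_extensions import Annotated


class Cat(BaseModel):
    pet_type: Literal['cat']


class Dog(BaseModel):
    pet_type: Literal['dog']


class DefaultForm(BaseModel):
    # Discriminator passed through Field on the right-hand side.
    pet: Union[Cat, Dog] = Field(discriminator='pet_type')


class AnnotatedForm(BaseModel):
    # Discriminator attached as Annotated metadata.
    pet: Annotated[Union[Cat, Dog], Field(discriminator='pet_type')]


data = {'pet': {'pet_type': 'dog'}}
a = DefaultForm.model_validate(data)
b = AnnotatedForm.model_validate(data)

print(type(a.pet).__name__, type(b.pet).__name__)
#> Dog Dog
```

The Annotated spelling has the practical advantage that the annotated union can be bound to a name and reused across several fields.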

Discriminated unions cannot be used with only a single variant, such as Union[Cat]. Python changes Union[T] into T at interpretation time, so it is not possible for pydantic to distinguish fields of Union[T] from T.
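
The collapsing of a single-member Union can be observed with the standard library alone, without involving pydantic:

```python
from typing import Union

# typing simplifies a one-member Union to the member itself,
# so no Union object is left for pydantic to inspect.
collapsed = Union[int]
print(collapsed is int)
#> True
```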

Nested discriminators

Only one discriminator can be set per field, but sometimes you want to combine several. You can do this by creating nested Annotated types, for example:

from typing import Literal, Union
from typing_extensions import Annotated
from pydantic import BaseModel, Field, ValidationError


class BlackCat(BaseModel):
    pet_type: Literal['cat']
    color: Literal['black']
    black_name: str

class WhiteCat(BaseModel):
    pet_type: Literal['cat']
    color: Literal['white']
    white_name: str


Cat = Annotated[Union[BlackCat, WhiteCat], Field(discriminator='color')]


class Dog(BaseModel):
    pet_type: Literal['dog']
    name: str


Pet = Annotated[Union[Cat, Dog], Field(discriminator='pet_type')]


class Model(BaseModel):
    pet: Pet
    n: int


m = Model(pet={'pet_type': 'cat', 'color': 'black', 'black_name': 'felix'}, n=1)
print(m)
#> pet=BlackCat(pet_type='cat', color='black', black_name='felix') n=1
try:
    Model(pet={'pet_type': 'cat', 'color': 'red'}, n='1')
except ValidationError as e:
    print(e)
    """
    1 validation error for Model
    pet.cat
      Input tag 'red' found using 'color' does not match any of the expected tags: 'black', 'white' [type=union_tag_invalid, input_value={'pet_type': 'cat', 'color': 'red'}, input_type=dict]
    """
try:
    Model(pet={'pet_type': 'cat', 'color': 'black'}, n='1')
except ValidationError as e:
    print(e)
    """
    1 validation error for Model
    pet.cat.black.black_name
      Field required [type=missing, input_value={'pet_type': 'cat', 'color': 'black'}, input_type=dict]
    """