Type annotation for parsing a string into a range in a JSON file with Pydantic in Python

Posted on 2025-01-26 05:16:16

I've set up a Pydantic class that's intended to parse JSON files. The range attribute is parsed from a string of the form "11-34" (or more precisely from the regex shown):

    import re
    from typing import Annotated

    from pydantic import BaseModel, Field, validator  # pydantic v1 API

    RANGE_STRING_REGEX = r"^(?P<first>[1-6]+)(-(?P<last>[1-6]+))?$"

    class RandomTableEvent(BaseModel):
        name: str
        range: Annotated[str, Field(regex=RANGE_STRING_REGEX)]

        # Converts the validated string, e.g. "11-34", into range(11, 35)
        @validator("range", allow_reuse=True)
        def convert_range_string_to_range(cls, r) -> "range":
            match_groups = re.fullmatch(RANGE_STRING_REGEX, r).groupdict()
            first = int(match_groups["first"])
            last = int(match_groups["last"]) if match_groups["last"] else first
            return range(first, last + 1)

The generated schema works and the validation passes.

However, the type annotation for the range attribute in the class is strictly speaking not correct, as the range attribute is converted from a string (type annotation) to a range object in the validator function.

What would be the correct way of annotating this and still maintaining the schema generation?
Is there another way of dealing with this implicit type conversion (e.g. strings are automatically converted to int in Pydantic - is there something similar for custom types)?

Comments (1)

挽梦忆笙歌 2025-02-02 05:16:16

range is not a type that pydantic supports, and using it as a field type causes an error when the JSON schema is generated. However, pydantic supports Custom Data Types:

You can also define your own custom data types. There are several ways to achieve it.

Classes with __get_validators__

You use a custom class with a classmethod __get_validators__. It will be called to get validators to parse and validate the input data.

But this custom data type cannot inherit from range because it is final. So you could create a custom data type that uses a range internally and exposes the range methods: it will work like a range but it will not be a range (isinstance(..., range) will be False).
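The restriction on subclassing can be checked with a minimal, standard-library-only sketch. The names MyRange and RangeWrapper are hypothetical and used only for illustration; the wrapper delegates just two methods to keep the example short.

```python
# CPython refuses to create a subclass of the built-in range type.
try:
    class MyRange(range):  # hypothetical name, for illustration only
        pass
    subclassing_failed = False
except TypeError as exc:
    subclassing_failed = True
    print("cannot subclass range:", exc)


class RangeWrapper:
    """Hypothetical wrapper that delegates to an internal range object."""

    def __init__(self, r: range) -> None:
        self._range = r

    def __len__(self) -> int:
        return len(self._range)

    def __contains__(self, item: object) -> bool:
        return item in self._range


wrapped = RangeWrapper(range(11, 35))
# Behaves like a range for the delegated operations, but is not a range.
print(len(wrapped), 20 in wrapped, isinstance(wrapped, range))
```

The wrapper only supports the operations it explicitly forwards, which is why the full example delegates every range method it wants to expose.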

The same pydantic documentation shows how to use a __modify_schema__ method to customize the JSON schema of a custom data type.

Full example:

import re
from typing import Any, Callable, Dict, Iterator, SupportsIndex, Union

from pydantic import BaseModel


class Range:
    _RANGE_STRING_REGEX = r"^(?P<first>[1-6]+)(-(?P<last>[1-6]+))?$"

    @classmethod
    def __get_validators__(cls) -> Iterator[Callable[[Any], Any]]:
        yield cls.validate

    @classmethod
    def validate(cls, v: Any) -> "Range":
        if not isinstance(v, str):
            raise ValueError("expected string")

        match = re.fullmatch(cls._RANGE_STRING_REGEX, v)
        if not match:
            raise ValueError("invalid string")

        match_groups = match.groupdict()
        first = int(match_groups["first"])
        last = int(match_groups["last"]) if match_groups["last"] else first

        return cls(range(first, last + 1))

    def __init__(self, r: range) -> None:
        self._range = r

    @classmethod
    def __modify_schema__(cls, field_schema: Dict[str, Any]) -> None:
        # Customize the JSON schema as you want
        field_schema["pattern"] = cls._RANGE_STRING_REGEX
        field_schema["type"] = "string"

    # Implement the range methods and use self._range

    @property
    def start(self) -> int:
        return self._range.start

    @property
    def stop(self) -> int:
        return self._range.stop

    @property
    def step(self) -> int:
        return self._range.step

    def count(self, value: int) -> int:
        return self._range.count(value)

    def index(self, value: int) -> int:
        return self._range.index(value)

    def __len__(self) -> int:
        return self._range.__len__()

    def __contains__(self, o: object) -> bool:
        return self._range.__contains__(o)

    def __iter__(self) -> Iterator[int]:
        return self._range.__iter__()

    def __getitem__(self, key: Union[SupportsIndex, slice]) -> Union[int, range]:
        return self._range.__getitem__(key)

    def __reversed__(self) -> Iterator[int]:
        return self._range.__reversed__()

    def __repr__(self) -> str:
        return self._range.__repr__()


class RandomTableEvent(BaseModel):
    name: str
    range: Range


event = RandomTableEvent(name="foo", range="11-34")

print("event:", event)
print("event.range:", event.range)
print("schema:", event.schema_json(indent=2))
print("is instance of range:", isinstance(event.range, range))
print("event.range.start:", event.range.start)
print("event.range.stop:", event.range.stop)
print("event.range[0:5]", event.range[0:5])
print("last 3 elements:", list(event.range[-3:]))

Output:

event: name='foo' range=range(11, 35)
event.range: range(11, 35)
schema: {
  "title": "RandomTableEvent",
  "type": "object",
  "properties": {
    "name": {
      "title": "Name",
      "type": "string"
    },
    "range": {
      "title": "Range",
      "pattern": "^(?P<first>[1-6]+)(-(?P<last>[1-6]+))?$",
      "type": "string"
    }
  },
  "required": [
    "name",
    "range"
  ]
}
is instance of range: False
event.range.start: 11
event.range.stop: 35
event.range[0:5] range(11, 16)
last 3 elements: [32, 33, 34]
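
The parsing step inside validate can also be tried on its own, without pydantic. The following is a minimal sketch using only the standard library; the function name parse_range_string is an assumption made for illustration and is not part of the answer's code. It shows how the regex treats a single value, a full range, and a reversed range.

```python
import re

RANGE_STRING_REGEX = r"^(?P<first>[1-6]+)(-(?P<last>[1-6]+))?$"


def parse_range_string(value: str) -> range:
    """Hypothetical helper mirroring the parsing logic of Range.validate."""
    match = re.fullmatch(RANGE_STRING_REGEX, value)
    if match is None:
        raise ValueError(f"invalid range string: {value!r}")
    first = int(match["first"])
    # The "last" group is None when the string holds a single value.
    last = int(match["last"]) if match["last"] else first
    return range(first, last + 1)


print(parse_range_string("11-34"))       # range(11, 35)
print(parse_range_string("3"))           # range(3, 4)
print(len(parse_range_string("34-11")))  # 0: the regex does not enforce first <= last
```

Note that a reversed string such as "34-11" passes the regex and yields an empty range, so a check for first <= last may be worth adding if such input should be rejected.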