Creating a list of JSON strings from a list of pydantic objects

Posted on 2025-01-21 05:52:16

I am trying to get a list of json strings from a list of Pydantic objects and have done it in the following way:

raw_books = [json.dumps(x.dict()) for x in batch.books]

Everything works fine, but it seems to take a lot of time when the list holds thousands of elements. What would be another way to do this transformation more efficiently?

For example: 2000 list elements in batch.books result in execution times of around 5 seconds.

The pydantic model(s):

class Book(BaseModel):
    title: constr(min_length=1, max_length=128)
    authors: List[constr(min_length=1, max_length=128)]
    price: float
    year_of_publishing: int
    publisher: constr(min_length=1, max_length=128)


class BulkBook(BaseModel):
    books: List[Book]


半世晨晓 2025-01-28 05:52:16

2000 list elements in batch.books result in execution times of around 5 seconds

So we are talking about 5000 ms / 2000 ≈ 2.5 ms per element, right?

2.5 ms for converting a single element (on which you are calling json.dumps(x.dict())) is not such a long time, I guess.
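The back-of-envelope arithmetic above can be checked directly. Here is a small timing sketch (editor's addition) using plain dicts as stand-ins for the output of `Book.dict()`; the sample record is made up:

```python
import json
import timeit

# Hypothetical record standing in for one Book.dict() result.
record = {"title": "T", "authors": ["A"], "price": 9.99,
          "year_of_publishing": 2020, "publisher": "P"}
batch = [dict(record) for _ in range(2000)]

# Time one pass of the per-element serialization from the question.
seconds = timeit.timeit(lambda: [json.dumps(x) for x in batch], number=1)
ms_per_element = seconds * 1000 / len(batch)
print(f"{ms_per_element:.4f} ms per element")
```

On typical hardware, plain `json.dumps` on dicts this size runs in microseconds per element, which suggests the 2.5 ms figure in the question includes more than the `dumps` call itself.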

眼趣 2025-01-28 05:52:16
raw_books = json.dumps([x.dict() for x in batch.books])
raw_books = [json.dumps(x.dict()) for x in batch.books]

You are calling json.dumps inside the list comprehension, and that seems to be the problem. It would be expected to cause a big dip in performance, because dumps is called on every iteration of the loop.

The simplest solution would be to move json.dumps out of the list comprehension entirely:

raw_books = json.dumps([x.dict() for x in batch.books])
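One caveat worth noting (editor's note, not part of the original answer): moving `json.dumps` outside the comprehension changes the shape of the result — you get one JSON string encoding the whole list, not a list of per-element JSON strings. A stdlib-only sketch, with made-up dicts standing in for `Book.dict()` output:

```python
import json

# Hypothetical stand-ins for Book.dict() results.
books = [
    {"title": "A", "authors": ["X"], "price": 1.0},
    {"title": "B", "authors": ["Y"], "price": 2.0},
]

# Question's approach: N dumps calls, a list of JSON strings.
per_element = [json.dumps(b) for b in books]

# Answer's approach: ONE dumps call, a single JSON string of the whole list.
single = json.dumps(books)
```

Whether the second form is acceptable depends on the consumer: if each element must be sent or stored as its own JSON document, the per-element list is still required.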