Django - Time difference between big serializers and small serializers
I'm creating a music rating app and I'm using Django REST Framework to build the API. It's super easy, but I'm wondering whether there is a big difference in loading time between a big serializer and a small one.
By big and small I mean how much data the serializer returns. For instance, I have an album page where I need to use this serializer:
{
    "id": 2,
    "title": "OK Computer",
    "slug": "ok-computer",
    "created_at": "2022-02-22T21:51:52.528148Z",
    "artist": {
        "id": 13,
        "name": "Radiohead",
        "slug": "radiohead",
        "image": "http://127.0.0.1:8000/media/artist/images/radiohead.jpg",
        "background_image": "http://127.0.0.1:8000/media/artist/bg_images/radiohead.jpg",
        "created_at": "2022-02-22T00:00:00Z"
    },
    "art_cover": "http://127.0.0.1:8000/media/album/art_covers/ok-computer_cd5Vv6U.jpg",
    "genres": [
        "Alternative Rock",
        "Art Rock"
    ],
    "overall_score": null,
    "number_of_ratings": 0,
    "release_date": "1997-05-28",
    "release_type": "LP",
    "tracks": [
        {
            "position": 1,
            "title": "Airbag",
            "duration": "00:04:47"
        },
        {
            "position": 2,
            "title": "Paranoid Android",
            "duration": "00:06:27"
        }
    ],
    "links": [
        {
            "service_name": "spotify",
            "url": "https://open.spotify.com/album/6dVIqQ8qmQ5GBnJ9shOYGE?si=L_VNH3HeSMmGBqfiqKiGWA"
        }
    ],
    "aoty": null
}
This serializer is rather massive, and I only need all of this data on the album details page. I also pull the same data on the albums list page, where I list all my albums and almost none of it is used.
If I make another, less complex serializer and use it on the albums list page, will there be a drastic difference in load speed?
And if so, can I make a viewset that uses the less complex serializer when I access my /albums API URL and the more complex serializer when I access a more specific URL like /albums/1?
Answers (2)
Since you are concerned about how fast the objects load, there are other ways to improve performance as well.
One of them concerns validation: in a writable ModelSerializer, a lot of time is spent on validation, so you can make it faster by marking all fields as read-only.
Many articles have been written about serialization performance in Python. As expected, most of them focus on improving database access with techniques like select_related and prefetch_related. Both are valid ways to improve the overall response time of an API request, but they don't address the cost of serialization itself.
And yes, you should go for multiple serializers instead of one big nested one.
It depends. Usually, limiting the response to the data the frontend/users actually need is good practice. Of course, those needs will evolve across your views and pages on the frontend. One way to handle that is to provide different serializers for different views or query params, for instance by overriding the get_serializer_class method on your viewset, or by adapting the serializer itself based on the request object. Also, if I remember correctly, there is an extension that lets you choose which fields you want back.
A DRF serializer doesn't only do serialization, strictly speaking: you can redefine fields, you can have method fields, and obviously relational fields.
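As a rough illustration of the "one serializer per view" idea, here is a minimal plain-Python sketch. It does not use DRF: AlbumEndpoint, serialize_detail and serialize_list_item are hypothetical stand-ins, and the choice of fields for the list page is an assumption. In DRF the same decision would typically live in a get_serializer_class() override that checks self.action ("list" versus "retrieve").

```python
import json

# Abbreviated album record based on the example payload in the question.
ALBUM = {
    "id": 2,
    "title": "OK Computer",
    "slug": "ok-computer",
    "artist": {"id": 13, "name": "Radiohead", "slug": "radiohead"},
    "genres": ["Alternative Rock", "Art Rock"],
    "release_date": "1997-05-28",
    "release_type": "LP",
    "tracks": [
        {"position": 1, "title": "Airbag", "duration": "00:04:47"},
        {"position": 2, "title": "Paranoid Android", "duration": "00:06:27"},
    ],
}


def serialize_detail(album):
    """Full representation for /albums/<id>: every field, nested data included."""
    return dict(album)


def serialize_list_item(album):
    """Slim representation for /albums (field selection is an assumption)."""
    return {
        "id": album["id"],
        "title": album["title"],
        "slug": album["slug"],
        "artist": album["artist"]["name"],
    }


class AlbumEndpoint:
    """Hypothetical stand-in for a viewset; `action` mimics DRF's `self.action`."""

    def __init__(self, action):
        self.action = action

    def get_serializer(self):
        # Same decision a get_serializer_class() override would make in DRF.
        if self.action == "list":
            return serialize_list_item
        return serialize_detail

    def render(self, album):
        return json.dumps(self.get_serializer()(album))


list_body = AlbumEndpoint("list").render(ALBUM)
detail_body = AlbumEndpoint("retrieve").render(ALBUM)
print(len(list_body), len(detail_body))  # the list body is the smaller one
```

The saving in bytes per item is modest here. The larger win usually comes from the nested relations the slim version never has to load.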
Most of the time, fields like IntegerField, CharField, etc. cause no issues when serializing lots of data, because they are straightforward. But related fields (ForeignKey, ManyToMany, ...) can cause problems if you don't prefetch them: each relationship and nested relationship will trigger a new database query for every item.
For instance, in your example you have albums and tracks. If you don't prefetch the tracks when fetching the albums, you will issue one extra query for each album in your queryset! This is because the serializer serializes each field in turn, and when it reaches the tracks field, Django fetches the related objects from the database. This is noticeable and does not scale at all, even with a small dataset.
Another way to deal with these problems is pagination. It can still be slow sometimes, but it lets you return only a subset of your database, so only a limited number of items have to be serialized.
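The per-album query problem can be reproduced without Django. The sketch below uses the standard-library sqlite3 module with a made-up two-table schema (album, track) and a hypothetical run() helper that counts queries. It compares fetching tracks one album at a time with fetching them in a single batched query and grouping them in Python, which is roughly what prefetch_related does for a relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE track (id INTEGER PRIMARY KEY, album_id INTEGER, title TEXT);
""")
# 50 albums with 2 tracks each (made-up data).
for album_id in range(1, 51):
    conn.execute("INSERT INTO album VALUES (?, ?)", (album_id, f"Album {album_id}"))
    for pos in (1, 2):
        conn.execute(
            "INSERT INTO track (album_id, title) VALUES (?, ?)",
            (album_id, f"Track {pos}"),
        )
conn.commit()

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, so both strategies can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def albums_naive():
    # One query for the albums, then one more query per album (N+1).
    albums = run("SELECT id, title FROM album ORDER BY id")
    result = []
    for album_id, title in albums:
        rows = run(
            "SELECT title FROM track WHERE album_id = ? ORDER BY id", (album_id,)
        )
        result.append({"id": album_id, "title": title, "tracks": [t for (t,) in rows]})
    return result


def albums_prefetched():
    # One query for the albums, one for all their tracks, grouped in Python.
    albums = run("SELECT id, title FROM album ORDER BY id")
    ids = [album_id for album_id, _ in albums]
    placeholders = ",".join("?" * len(ids))
    rows = run(
        f"SELECT album_id, title FROM track WHERE album_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    by_album = {}
    for album_id, title in rows:
        by_album.setdefault(album_id, []).append(title)
    return [
        {"id": album_id, "title": title, "tracks": by_album.get(album_id, [])}
        for album_id, title in albums
    ]


query_count = 0
naive = albums_naive()
naive_queries = query_count

query_count = 0
prefetched = albums_prefetched()
prefetched_queries = query_count

print(naive_queries, prefetched_queries)  # 51 vs 2
```

Both functions return the same data, but the first one issues a query per album. In a Django viewset the equivalent fix would be calling prefetch_related on the queryset with the name of the tracks relation. That relation name is an assumption here, since the models are not shown in the question.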
To summarize: it depends on what you serialize. Slowness usually comes from data that was not prefetched or from method fields. Use pagination as much as you can, and yes, you can use a different serializer if you know it will return far less data (giving a faster response and less data on the wire).
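To make the pagination advice concrete, here is a small sketch of page-number pagination over an in-memory list. The paginate() helper and the page size of 20 are assumptions for illustration. The response envelope (count, next, previous, results) is modeled on the one DRF's PageNumberPagination returns, except that next and previous are plain page numbers here rather than URLs.

```python
import math


def paginate(items, page=1, page_size=20):
    """Return one page of `items` plus the metadata a client needs to navigate."""
    total = len(items)
    last_page = max(1, math.ceil(total / page_size))
    if page < 1 or page > last_page:
        raise ValueError(f"page must be between 1 and {last_page}")
    start = (page - 1) * page_size
    return {
        "count": total,
        "next": page + 1 if page < last_page else None,
        "previous": page - 1 if page > 1 else None,
        # Only this slice would need to be serialized for the response.
        "results": items[start:start + page_size],
    }


albums = [{"id": i, "title": f"Album {i}"} for i in range(1, 96)]  # 95 made-up albums
first = paginate(albums, page=1)
last = paginate(albums, page=5)
print(len(first["results"]), len(last["results"]))  # 20 15
```

In a real Django application the slicing should be applied to the queryset rather than to a fully loaded list, so that the database only returns the rows for the requested page.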
Note: when dealing with really complex objects where you don't want to maintain a large number of serializers, you could use GraphQL, where the client knows what it wants and asks the backend for just those fields.