Django Paginate: CPU time scales with the number of selected objects, not the number of displayed objects

Posted 2024-09-30 19:27:11

I have a simple database with about 3900 entries, and I am using a generic view (django.views.generic.list_detail.object_list) with its pagination (through paginate_by) to browse the data in the database, but some queries are very slow.

The weird thing is that, despite only showing 50 objects per page, the rendering time scales roughly linearly with how many objects are selected (and I do not do any sorting of objects). E.g., queries with ~3900, ~1800, ~900, and ~54 selected objects take ~8500 ms, ~4000 ms, ~2500 ms, and ~800 ms of CPU time respectively (according to django-debug-toolbar), while the SQL only takes ~50 ms, ~40 ms, ~35 ms, and ~30 ms, even though every page has exactly 50 objects. I have minimized the number of SQL queries using select_related, as suggested in the django optimization page.

According to profiling middleware, the vast majority of the time on long queries is spent in database-related code:

         735924 function calls (702255 primitive calls) in 11.950 CPU seconds

   Ordered by: internal time, call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
35546/3976    4.118    0.000    9.585    0.002 /usr/local/lib/python2.6/dist-packages/django/db/models/query.py:1120(get_cached_row)
    30174    3.589    0.000    3.991    0.000 /usr/local/lib/python2.6/dist-packages/django/db/models/base.py:250(__init__)

 ---- By file ----

      tottime
47.0%   3.669 /usr/local/lib/python2.6/dist-packages/django/db/models/base.py
 7.7%   0.601 /usr/local/lib/python2.6/dist-packages/django/db/models/options.py
 6.8%   0.531 /usr/local/lib/python2.6/dist-packages/django/db/models/query_utils.py
 6.6%   0.519 /usr/local/lib/python2.6/dist-packages/django/db/backends/sqlite3/base.py
 6.4%   0.496 /usr/local/lib/python2.6/dist-packages/django/db/models/sql/compiler.py
 5.0%   0.387 /usr/local/lib/python2.6/dist-packages/django/db/models/fields/__init__.py
 3.1%   0.244 /usr/local/lib/python2.6/dist-packages/django/db/backends/util.py
 2.9%   0.225 /usr/local/lib/python2.6/dist-packages/django/db/backends/__init__.py
 2.7%   0.213 /usr/local/lib/python2.6/dist-packages/django/db/models/query.py
 2.2%   0.171 /usr/local/lib/python2.6/dist-packages/django/dispatch/dispatcher.py
 1.7%   0.136 /usr/local/lib/python2.6/dist-packages/django/template/__init__.py
 1.7%   0.131 /usr/local/lib/python2.6/dist-packages/django/utils/datastructures.py
 1.1%   0.088 /usr/lib/python2.6/posixpath.py
 0.8%   0.066 /usr/local/lib/python2.6/dist-packages/django/db/utils.py
...
 ---- By group ---

      tottime
89.5%   6.988 /usr/local/lib/python2.6/dist-packages/django/db
 3.6%   0.279 /usr/local/lib/python2.6/dist-packages/django/utils
...

I can understand why the SQL query could scale with the number of selected entries. However, I don't see why the rest of the CPU time should be affected in any way. This is very counterintuitive, and I was wondering if anyone could help me with debugging/profiling tips.
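For anyone wanting to reproduce this kind of report outside a middleware, here is a minimal sketch using the standard library's cProfile and pstats, sorted the same way as the output above (internal time, then call count). `Row`, `build_rows`, and `profile_call` are hypothetical names made up for illustration; they are not part of Django or of the profiling middleware I used.

```python
import cProfile
import io
import pstats


class Row(object):
    """Hypothetical stand-in for an object built once per database row."""

    def __init__(self, pk):
        self.pk = pk


def build_rows(n):
    # Simulates per-row object construction, the kind of work that
    # dominated the profile above.
    return [Row(i) for i in range(n)]


def profile_call(func, *args):
    """Run func(*args) under cProfile and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args)
    finally:
        profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # "time" is internal time (tottime); "calls" is the call count.
    stats.sort_stats("time", "calls").print_stats(10)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_call(build_rows, 3900))
```

Comparing ncalls against the page size is the useful part: if a constructor runs thousands of times for a 50-object page, something is evaluating more than the page.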

I'm using django-1.2.3 with sqlite, python2.6, and apache2-prefork (switching to mpm-worker didn't significantly change things). Any tips/tricks would be greatly appreciated. Memory usage doesn't seem to be a factor either (the machine has 2 GB of RAM and free reports only 300 MB in use, plus 600 MB of cache), and the database is on the same machine as the web server.

Found my mistake. I checked the length of the original queryset to see if it was 1 (and went to object_detail if so). This evaluated the full queryset (the SQL still only took 5 ms according to django-debug-toolbar), but slowed everything down significantly.

Basically had something stupid like:

    if len(queryset) == 1:                                 
        return HttpResponseRedirect( fwd to object_detail url ...)
    return object_list(request, queryset=queryset, paginate_by=  ...)

which evaluated the full query, not the paginated one.
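The profile is consistent with this: the SQL itself is cheap, but calling len() on a queryset builds one model object per matching row, which is where base.py `__init__` and get_cached_row show up. In Django, `queryset.count()` asks the database for a `COUNT(*)` instead, and `len(queryset[:2])` fetches at most two rows via `LIMIT`. The sketch below imitates the three approaches with plain sqlite3 so it runs without a Django project; the `entry` table, the `Entry` class, and the helper names are assumptions made up for illustration, not my actual models.

```python
import sqlite3


class Entry(object):
    """Hypothetical stand-in for a model instance; counts constructions."""
    built = 0

    def __init__(self, pk, name):
        Entry.built += 1
        self.pk = pk
        self.name = name


def make_db(n):
    """Create an in-memory table with n rows (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO entry (name) VALUES (?)",
        [("entry %d" % i,) for i in range(n)],
    )
    return conn


def is_single_by_len(conn):
    # Analogous to len(queryset): every matching row becomes an object.
    rows = [Entry(*row) for row in conn.execute("SELECT id, name FROM entry")]
    return len(rows) == 1


def is_single_by_count(conn):
    # Analogous to queryset.count(): the database counts, no objects built.
    (total,) = conn.execute("SELECT COUNT(*) FROM entry").fetchone()
    return total == 1


def is_single_by_slice(conn):
    # Analogous to len(queryset[:2]): at most two rows are fetched.
    rows = [Entry(*row)
            for row in conn.execute("SELECT id, name FROM entry LIMIT 2")]
    return len(rows) == 1
```

With 3900 rows, the first helper constructs 3900 objects just to answer a yes/no question, while the other two construct zero and two respectively. The slice variant has the side benefit that the single object is already loaded if a redirect to its detail page is needed.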

Comments (1)

不喜欢何必死缠烂打 2024-10-07 19:27:11

When django does pagination it uses standard QuerySet slicing to get the results, which means the SQL will use LIMIT and OFFSET.

You can view the SQL the ORM generates by calling str() on the .query attribute of the QuerySet:

    print MyModel.objects.all().query
    print MyModel.objects.all()[50:100].query

You can then ask sqlite to EXPLAIN the query and see what the database is trying to do. I'm guessing you are sorting on some field that does not have an index. EXPLAIN QUERY PLAN will tell you which indices would be used; see the sqlite documentation at http://www.sqlite.org/lang_explain.html
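As a rough sketch of that check, the snippet below runs EXPLAIN QUERY PLAN on a paginated, sorted query through Python's sqlite3 module, before and after adding an index. The `entry` table, the `name` column, and the index name `idx_entry_name` are hypothetical; substitute the SQL printed by `.query` for your own model. The exact wording of the plan differs between SQLite versions, so treat the printed text as indicative only.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable detail strings of EXPLAIN QUERY PLAN."""
    # The detail text is the last column; the other columns vary by version.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, name TEXT)")

# Shape of a paginated query: second page of 50, sorted by name.
page_sql = "SELECT id, name FROM entry ORDER BY name LIMIT 50 OFFSET 50"

plan_without_index = query_plan(conn, page_sql)

conn.execute("CREATE INDEX idx_entry_name ON entry (name)")
plan_with_index = query_plan(conn, page_sql)

for label, plan in (("without index", plan_without_index),
                    ("with index", plan_with_index)):
    print(label)
    for detail in plan:
        print("  " + detail)
```

If the plan for your real query never mentions an index on the sort column, the database has to sort every selected row before it can apply LIMIT and OFFSET.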
