Algorithmic improvement for a seemingly simple PostgreSQL query

Posted on 2024-09-07 21:05:51

High-level: can I make this order by / group by on a sum any faster?
(PostgreSQL 8.4, FWIW, on a non-tiny table: think millions of rows.)

Suppose I had a table like this:

                                 Table "public.summary"
   Column    |       Type        |                      Modifiers
-------------+-------------------+------------------------------------------------------
 ts          | integer           | not null default nextval('summary_ts_seq'::regclass)
 field1      | character varying | not null
 otherfield  | character varying | not null
 country     | character varying | not null
 lookups     | integer           | not null


Indexes:
    "summary_pk" PRIMARY KEY, btree (ts, field1, otherfield, country)
    "ix_summary_country" btree (country)
    "ix_summary_field1" btree (field1)
    "ix_summary_otherfield" btree (otherfield)
    "ix_summary_ts" btree (ts)

And the query I want is:

select summary.field1,
    summary.country,
    summary.ts,
    sum(summary.lookups) as lookups
from summary
where summary.country = 'za' and
    summary.ts = 1275177600
group by summary.field1, summary.country, summary.ts
order by summary.ts, lookups desc, summary.field1
limit 100;

(In English: the top 100 field1 values for a particular (ts, country), where
'topness' is the sum of lookups over all matching rows, regardless of the
value of otherfield.)
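
To make the intended semantics concrete, here is a minimal sketch that runs the same query against a tiny in-memory table. It uses Python's sqlite3 module purely for illustration, so it says nothing about PostgreSQL 8.4 performance or query plans. The sample rows, the top_fields helper, and the parameter placeholders are assumptions that are not part of the original question, and the trailing comma after the sum(...) column is left out because it would be a syntax error.

```python
import sqlite3

# Hypothetical sample rows: (ts, field1, otherfield, country, lookups).
ROWS = [
    (1275177600, "a.example", "x", "za", 5),
    (1275177600, "a.example", "y", "za", 7),   # same field1, different otherfield
    (1275177600, "b.example", "x", "za", 10),
    (1275177600, "c.example", "x", "us", 99),  # other country, filtered out
    (1275264000, "a.example", "x", "za", 50),  # other ts, filtered out
]

# The query from the question, with the literals replaced by parameters.
QUERY = """
select summary.field1,
    summary.country,
    summary.ts,
    sum(summary.lookups) as lookups
from summary
where summary.country = ? and
    summary.ts = ?
group by summary.field1, summary.country, summary.ts
order by summary.ts, lookups desc, summary.field1
limit 100
"""


def top_fields(country, ts):
    """Return the top field1 rows for one (ts, country) pair (illustrative helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        # Simplified copy of the summary table; types are SQLite equivalents.
        conn.execute(
            "create table summary ("
            " ts integer not null,"
            " field1 text not null,"
            " otherfield text not null,"
            " country text not null,"
            " lookups integer not null,"
            " primary key (ts, field1, otherfield, country))"
        )
        conn.executemany("insert into summary values (?, ?, ?, ?, ?)", ROWS)
        return conn.execute(QUERY, (country, ts)).fetchall()
    finally:
        conn.close()


result = top_fields("za", 1275177600)
for row in result:
    print(row)
```

With these sample rows, the two 'a.example' rows collapse into a single group whose lookups are summed across otherfield, and that group sorts ahead of 'b.example' because its total is larger.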

Is there anything I can actually do to speed this up? Algorithmically,
this looks like it needs a full table scan, but I might be missing something.
