优化 PostgreSQL 中的查询
SELECT count(*)
FROM contacts_lists
JOIN plain_contacts
ON contacts_lists.contact_id = plain_contacts.contact_id
JOIN contacts
ON contacts.id = plain_contacts.contact_id
WHERE plain_contacts.has_email
AND NOT contacts.email_bad
AND NOT contacts.email_unsub
AND contacts_lists.list_id =67339
我怎样才能优化这个查询..你能解释一下吗...
SELECT count(*)
FROM contacts_lists
JOIN plain_contacts
ON contacts_lists.contact_id = plain_contacts.contact_id
JOIN contacts
ON contacts.id = plain_contacts.contact_id
WHERE plain_contacts.has_email
AND NOT contacts.email_bad
AND NOT contacts.email_unsub
AND contacts_lists.list_id =67339
how can i optimize this query.. could you please explain...
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
为了清晰起见,重新格式化您的查询计划:
两个部分索引可能消除序列扫描,具体取决于您的数据分布:
您可能会丢失外键索引:
最后但并非最不重要的一点是,如果您从未分析过数据,您需要运行:
如果完成所有操作后仍然很慢,则除了合并 plain_contacts 和联系人表外,您无能为力:尽管有上述索引,但获取上述查询计划意味着大多数/全部您的订阅者订阅了该特定列表 - 在这种情况下,上述查询计划是您获得的最快的查询计划。
Reformatting your query plan for clarity:
Two partial indexes might eliminate seq scans depending on your data distribution:
You might be missing an index on a foreign key:
Last but not least if you've never analyzed your data, you need to run:
If it's still slow once all that is done, there isn't much you can do short of merging your plain_contacts and your contacts tables: getting the above query plan in spite of the above indexes means most/all of your subscribers are subscribed to that particular list -- in which case the above query plan is the fastest you'll get.
这已经是一个非常简单的查询,只要统计信息是最新的,数据库将以最有效的方式运行,
因此就查询本身而言,没有太多可做的。
在数据库管理方面,您可以添加索引 - 数据库中应该有针对所有连接条件以及 where 子句中最具选择性的部分的索引(list_id、contact_id 作为 plain_contacts 和 contacts_lists 中的 FK)。这是提高该查询性能(数量级)的最重要机会。正如 SpliFF 所指出的,您可能已经拥有这些索引,因此请检查一下。
另外,postgres 有很好的 explain 命令,你应该学习和使用。它将有助于优化查询。
This is already a very simple query that the database will run in the most efficient way providing that statistics are up to date
So in terms of the query itself there's not much to do.
In terms of database administration you can add indexes - there should be indexes in the database for all the join conditions and also for the most selective part of the where clause (list_id, contact_id as FK in plain_contacts and contacts_lists). This is the most significant opportunity to improve performance of this query (orders of magnitude). Still as SpliFF notes, you probably already have those indexes, so check.
Also, postgres has good explain command that you should learn and use. It will help with optimizing queries.
由于您只想包含在连接表中设置了一些标志的行,因此我会将这些语句移至连接子句中:
我不确定这是否会对性能产生很大影响,但值得一试。为了获得最佳性能,您可能还应该在连接表上建立索引,如下所示:
Since you only want to inlude rows that has some flags set in the joined tables, I would move that statements into the join clause:
I'm not sure if this would make a great impact on performance, but worth a try. You should probably have indexes on the joined tables as well for optimal performance, like this:
您最近对数据库运行过 ANALYZE 吗? EXPLAIN 计划中的行计数看起来有意义吗? (看起来您只运行了 EXPLAIN。EXPLAIN ANALYZE 给出了估计时间和实际时间。)
Have you run ANALYZE on the database recently? Do the row counts in the EXPLAIN plan look like they make sense? (Looks like you ran only EXPLAIN. EXPLAIN ANALYZE gives both estimated and actual timings.)
您可以使用 SELECT count(1) ... ,但除此之外我认为它看起来不错。如果您真的很困难,您总是可以使用视图缓存查询的某些部分,或者在 contact_id 和 list_id 上放置索引(我假设您已经在 id 上有一个索引)。
You can use
SELECT count(1) ...
but other than that I'd say it looks fine. You could always cache some parts of the query using views or put indexes on contact_id and list_id if you're really struggling (I assume you have one on id already).