psql = fast, remote SQL = very slow
Okay, I appreciate that the question is a tad vague, but after a day of googling I'm getting nowhere. Any help will be appreciated, and I'm willing to try anything.
The issue is that we have a PostgreSQL db that has around 10-15 million rows in a particular table.
We're doing a select for all the columns, based on a DateTime field in the table. No joins, just a standard select with a where clause (time >= x AND time <= y). There is an index on the field as well...
When I run the SQL using psql on the local machine, it takes around 15-20 seconds and brings back about half a million rows; one of the columns is a text field holding a large amount of data per row (a program stack trace). When we run the same SQL through Npgsql, or pgAdmin III on Windows, it takes around 2 minutes.
This is leading me to think that it's a network issue. I've checked the machine while the query is running: it's not using a huge amount of memory or CPU, and the network throughput is negligible.
I've also gone through the recommendations on the Postgres site for the memory settings, including updating shmmax and shmall.
It's Ubuntu 10.04, PostgreSQL 8.4, 4GB RAM, 2.8GHz quad Xeon (virtual but dedicated resources). The machine has its Windows counterpart (2008 R2, SS2008) on there as well, but turned off. The query returns in around 10-15 seconds using SS with the same schema and data. I know this isn't a direct comparison, but I wanted to show that it isn't a disk performance issue.
So the question is... any suggestions? Are there any network settings I should be changing? Anything that I've missed? I can't give too much information about the database, but here is an obfuscated EXPLAIN ANALYZE...
Index Scan using "IDX_column1" on "table1" (cost=0.00..45416.20 rows=475130 width=148) (actual time=0.025..170.812 rows=482266 loops=1)
Index Cond: (("column1" >= '2011-03-14 00:00:00'::timestamp without time zone) AND ("column1" <= '2011-03-14 23:59:59'::timestamp without time zone))
Total runtime: 196.898 ms
Comments (1)
Try setting cursor_tuple_fraction to 1 in psql and see if it changes the results. If so, then it is likely that the optimiser is picking a better plan based on fetching only the top 10% or so of the results rather than the whole lot. As I recall, psql uses a cursor to fetch results piece by piece rather than the "firehose" executequery method.
If this is the case, it doesn't point directly to a solution, but you will need to tweak your planner settings, and at least if you can reproduce the behaviour in psql then it may be easier to see the differences and test changes.
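
A minimal sketch of how that comparison might be run in a psql session. The table and column names are the obfuscated ones from the EXPLAIN output above, and the cursor name c is just a placeholder. It assumes PostgreSQL 8.4 or later, where cursor_tuple_fraction exists and defaults to 0.1. This only shows whether the planner picks a different plan for a cursor under the two settings; it is not a fix in itself.

```sql
-- Show elapsed time for each statement (psql meta-command).
\timing on

-- Current value; the default is 0.1 (planner assumes ~10% of a cursor's rows are fetched).
SHOW cursor_tuple_fraction;

BEGIN;

-- Plan chosen for a cursor under the current setting.
EXPLAIN DECLARE c CURSOR FOR
  SELECT * FROM "table1"
  WHERE "column1" >= '2011-03-14 00:00:00'
    AND "column1" <= '2011-03-14 23:59:59';

-- Tell the planner to assume the whole result set will be fetched.
-- SET LOCAL lasts only until the end of this transaction.
SET LOCAL cursor_tuple_fraction = 1.0;

-- Plan chosen for the same cursor under the new setting.
EXPLAIN DECLARE c CURSOR FOR
  SELECT * FROM "table1"
  WHERE "column1" >= '2011-03-14 00:00:00'
    AND "column1" <= '2011-03-14 23:59:59';

ROLLBACK;
```

If the two plans are identical, the setting is probably not what separates the psql run from the Npgsql/pgAdmin run. To make psql itself fetch through a cursor in batches when timing the real query, its FETCH_COUNT variable can be set first, for example `\set FETCH_COUNT 1000`.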