Postgres 不使用带有“ORDER BY”的索引。
- 我有一张非常大的桌子。
- 我在列 col1 上有一个索引。
- 我想获取按 col1 排序的数据。
- 从查询计划我可以说它不使用索引。
- 当我添加“LIMIT”时,它开始使用索引
- 。对于“LIMIT”的大值,它停止使用索引。
有什么线索吗?
PS 我想获取按 col1 值聚类的数据(不一定排序),以及除“ORDER BY”之外的任何建议。
谢谢 !!
- I have a very large table.
- I have an index on columns col1.
- I would like to get the data ordered by col1.
- From the query plan I can say it doesn't use the index.
- When I add "LIMIT", it starts to use the index
- For large value for "LIMIT", it stops using the index.
Any clue?
P.S. I would like to get the data clustered by values of col1 (not necessarily sorted), any suggestions other than "ORDER BY".
THANKS !!
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
如果返回表中的所有行,索引扫描将比表扫描慢。
您认为为什么需要索引使用?
您可能会尝试使用
set enable_seqscan = false
来禁用顺序扫描,但我确信这会比顺序扫描慢。
ORDER BY 是对数据进行排序的唯一方法。您可能看到的任何其他排序纯属巧合
编辑
澄清一下:我不建议关闭 seq 扫描。我刚刚发布此文章是为了表明 seq 扫描确实比索引扫描更快。一旦关闭,使用索引扫描的执行计划很可能会比显示 OP 不需要索引扫描的 seq 扫描慢。
If your return all rows from the table an index scan will be slower than a table scan.
Why do you think you need the index usage?
You might try to use
set enable_seqscan = false
in order to disable the sequential scan, but I'm sure that will be slower than with the sequential scan.
ORDER BY is the only method to sort your data. Any other sorting you might see is pure coincidence
Edit
To clear things up: I do not recommend to turn seq scan off. I just posted this as a way to show that the seq scan is indeed faster than the index scan. Once turned off the execution plan using the index scan will most probably be slower than the seq scan showing the OP that there is no need for an index scan.
除了a_horse_with_no_name的答案:
使用索引实际上是两个不同的操作:首先在索引中查找您想要的值。索引中是完整记录的地址,然后将其取消引用。对于特定查询来说,这两种操作都非常快。
如果您无论如何都打算使用全部或大部分记录,那么这种好处就会消失。如果您想要所有记录并遍历索引,则需要更长的时间,因为每个记录都有两次查找。在没有索引的情况下运行整个表会更容易,因为这需要每列一次搜索(是的,我知道,实际上它比这要少,因为读取了整个块等......我只是想保持简单)。
In addition to the answer of a_horse_with_no_name:
Using an index is actually two distinct operations: First the value you desire is looked for in the index. In the index is the address of the complete record which gets then dereferenced. Both operations are very fast for specific queries.
If you intend to use all or most records anyway, the benefit goes away. If you want all records and you go through the index, it takes longer because for every record there are two seeks. It's easier to just run over the whole table without the index as this takes one seek per column (yes, I know, actually it's less than that because whole blocks are read etc... I just want to keep it simple).