How does Select Distinct on a primary key work in SQL?
How does the SQL optimizer handle SELECT DISTINCT on a primary key? For example:
Employee table: (EmpId int primary key, EmpName)
select distinct EmpId from Employee
Would the optimizer reject the DISTINCT keyword, considering EmpId is already distinct?
Comments (5)
Try it both with and without the DISTINCT keyword and compare execution plans.
SQL Server will produce identical plans for both. YMMV for other RDBMSs.
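The comparison suggested above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses SQLite through Python's standard sqlite3 module as a stand-in (SQLite is not mentioned in the thread), the sample rows are made up, and the helper name query_plan is hypothetical. On SQL Server or Oracle you would use that product's own plan viewer, and the plans will look different.

```python
import sqlite3

def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]

# Assumption: an in-memory SQLite database standing in for a real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmpId INTEGER PRIMARY KEY, EmpName TEXT)")
conn.executemany(
    "INSERT INTO Employee (EmpId, EmpName) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Ann")],  # made-up sample data
)

plain_sql = "SELECT EmpId FROM Employee"
distinct_sql = "SELECT DISTINCT EmpId FROM Employee"

# Fetch both plans and print them side by side for comparison.
plain_plan = query_plan(conn, plain_sql)
distinct_plan = query_plan(conn, distinct_sql)
print("without DISTINCT:", plain_plan)
print("with DISTINCT:   ", distinct_plan)
print("identical plans: ", plain_plan == distinct_plan)

# Whatever the plans look like, the result sets must match,
# because a primary key cannot contain duplicates.
plain_rows = sorted(conn.execute(plain_sql).fetchall())
distinct_rows = sorted(conn.execute(distinct_sql).fetchall())
```

Whether the two plans come out identical depends on the engine and its version, so the script prints the comparison rather than assuming the answer.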
I tried this under Oracle 11g, and the queries with and without DISTINCT produced different execution plans.
USERID is the primary key on TPM_USER, in case that wasn't clear.
UPDATE:
I was somewhat irked that Oracle could be this stupid, so I tried the same queries on our production server, which has about a thousand times more data. This time, the two query plans were identical (the index was used both times, and no full table scan was done). This leads me to believe the planner takes the table statistics into account when deciding which route to go.
There can't be a fixed rule for your question; each database can handle this in its own way. You must look at your database's execution plan for the query.
But I believe modern production databases such as Oracle, MySQL, and Postgres simply ignore the DISTINCT keyword and iterate over the index (not the table).
SQL Server doesn't produce identical plans for me.
Querying with distinct on a non-indexed column uses a hash match, whereas querying on the primary key doesn't bother to do that.
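To see the same kind of contrast between a key column and a non-indexed column, here is a minimal sketch. It again assumes SQLite via Python's sqlite3 module rather than SQL Server, so it will not show a hash match operator; it only shows how to put the two plans next to each other. The table contents and the helper name plan_details are hypothetical.

```python
import sqlite3

# Assumption: SQLite in memory, used only to illustrate the comparison.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmpId INTEGER PRIMARY KEY, EmpName TEXT)")
conn.executemany(
    "INSERT INTO Employee (EmpId, EmpName) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Ann")],  # EmpName has a duplicate on purpose
)

def plan_details(sql):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# DISTINCT on the primary key: uniqueness is already guaranteed.
key_plan = plan_details("SELECT DISTINCT EmpId FROM Employee")
# DISTINCT on a non-indexed column: the engine has to deduplicate somehow.
name_plan = plan_details("SELECT DISTINCT EmpName FROM Employee")

print("DISTINCT EmpId:  ", key_plan)
print("DISTINCT EmpName:", name_plan)

# The non-key query really does collapse duplicates in the result.
names = sorted(r[0] for r in conn.execute("SELECT DISTINCT EmpName FROM Employee"))
print(names)
```

If the engine needs an extra deduplication step for EmpName, it should appear in the second plan but not the first.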
I'm not sure what you mean by rejected, but given that primary keys are, by definition, non-NULL and unique, I'm sure most RDBMSs simply skip the sorting/grouping operation as unnecessary when generating the execution plan.