实践“存在(从...中选择 1)”在哪里?来自?

发布于 2024-11-09 20:14:08 字数 1077 浏览 1 评论 0原文

绝大多数人支持我自己的观点,即两者之间没有区别以下陈述:

SELECT * FROM tableA WHERE EXISTS (SELECT * FROM tableB WHERE tableA.x = tableB.y)
SELECT * FROM tableA WHERE EXISTS (SELECT y FROM tableB WHERE tableA.x = tableB.y)
SELECT * FROM tableA WHERE EXISTS (SELECT 1 FROM tableB WHERE tableA.x = tableB.y)
SELECT * FROM tableA WHERE EXISTS (SELECT NULL FROM tableB WHERE tableA.x = tableB.y)

然而今天,在我们的内部开发人员会议上,我遇到了相反的说法,提倡 select 1 是正确的方法,而 select *选择所有(不必要的)数据,从而损害性能。

我似乎记得有一些旧版本的 Oracle 或其他版本,但我找不到对此的引用。所以,我很好奇 - 这种做法是如何诞生的?这个神话从何而来?

添加:因为有些人坚持认为有证据表明这确实是一个错误的信念,这里 - a google 查询 显示了很多人们这么说。如果你太懒了,请检查这个直接链接,其中有人甚至比较执行计划以发现它们是等效的。

The overwhelming majority of people support my own view that there is no difference between the following statements:

SELECT * FROM tableA WHERE EXISTS (SELECT * FROM tableB WHERE tableA.x = tableB.y)
SELECT * FROM tableA WHERE EXISTS (SELECT y FROM tableB WHERE tableA.x = tableB.y)
SELECT * FROM tableA WHERE EXISTS (SELECT 1 FROM tableB WHERE tableA.x = tableB.y)
SELECT * FROM tableA WHERE EXISTS (SELECT NULL FROM tableB WHERE tableA.x = tableB.y)

Yet today I came face-to-face with the opposite claim when in our internal developer meeting it was advocated that select 1 is the way to go and select * selects all the (unnecessary) data, hence hurting performance.

I seem to remember that there was some old version of Oracle or something where this was true, but I cannot find references to that. So, I'm curious - how was this practice born? Where did this myth originate from?

Added: Since some people insist on having evidence that this is indeed a false belief, here - a google query which shows plenty of people saying it so. If you're too lazy, check this direct link where one guy even compares execution plans to find that they are equivalent.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

黄昏下泛黄的笔记 2024-11-16 20:14:08

你的问题的主要部分是——“这个神话是从哪里来的?”

因此,为了回答这个问题,我猜想人们通过 sql 学到的第一个性能提示是 select * 在大多数情况下效率很低。事实上,在这种特定情况下,它并不是效率低下,因此有些违反直觉。因此,人们对此表示怀疑也就不足为奇了。但一些简单的研究或实验应该足以消除大多数神话。尽管人类历史表明神话很难消除。

The main part of your question is - "where did this myth come from?"

So to answer that, I guess one of the first performance hints people learn with sql is that select * is inefficient in most situations. The fact that it isn't inefficient in this specific situation is hence somewhat counter intuitive. So its not surprising that people are skeptical about it. But some simple research or experiments should be enough to banish most myths. Although human history kinda shows that myths are quite hard to banish.

秋叶绚丽 2024-11-16 20:14:08

作为演示,尝试这些

SELECT * FROM tableA WHERE EXISTS (SELECT 1/0 FROM tableB WHERE tableA.x = tableB.y)
SELECT * FROM tableA WHERE EXISTS (SELECT CAST('bollocks' as int) FROM tableB WHERE tableA.x = tableB.y)

现在阅读 ANSI 标准。 ANSI-92,第 191 页,案例 3a

If the <select list> "*" is simply contained in a <subquery>
          that is immediately contained in an <exists predicate>, then
          the <select list> is equivalent to a <value expression> that
          is an arbitrary <literal>.

最后,大多数 RDBMS 上的行为应忽略 EXISTS 子句中的 *。根据昨天的这个问题( Sql Server 2005 - 如果不存在则插入 )这在 SQL Server 2000 上不起作用,但我知道它在 SQL Server 2005+ 上起作用

As a demo, try these

SELECT * FROM tableA WHERE EXISTS (SELECT 1/0 FROM tableB WHERE tableA.x = tableB.y)
SELECT * FROM tableA WHERE EXISTS (SELECT CAST('bollocks' as int) FROM tableB WHERE tableA.x = tableB.y)

Now read the ANSI standard. ANSI-92, page 191, case 3a

If the <select list> "*" is simply contained in a <subquery>
          that is immediately contained in an <exists predicate>, then
          the <select list> is equivalent to a <value expression> that
          is an arbitrary <literal>.

Finally, the behaviour on most RDBMS should ignore THE * in the EXISTS clause. As per this question yesterday ( Sql Server 2005 - Insert if not exists ) this doesn't work on SQL Server 2000 but I know it does on SQL Server 2005+

咿呀咿呀哟 2024-11-16 20:14:08

对于 SQL Server 来自 Conor Cunningham查询优化器团队解释了他通常使用 SELECT 1 的原因

QP 将采用并扩展所有 * 的
在管道的早期并将它们绑定到
对象(在这种情况下,列表
列)。然后它将删除
由于性质而不需要的列
查询。

对于一个简单的 EXISTS 子查询来说
这个:

从存在的 MyTable 中选择 col1
(从表 2 中选择 *,其中
MyTable.col1=Table2.col2)* 将是
扩展到一些潜在的大
列列表,然后它将是
确定了语义
EXISTS 不需要任何这些
列,所以基本上所有的列都可以
被删除。

“SELECT 1”将避免
检查任何不需要的元数据
查询编译期间的表。

但是,在运行时,这两种形式
查询将是相同的并且将
具有相同的运行时间。

编辑:但是,自从发布此答案以来,我对此进行了一些详细研究,并得出结论:SELECT 1确实不会避免此列扩展。 完整详情请参见此处

For SQL Server Conor Cunningham from the Query Optimiser team explains why he typically uses SELECT 1

The QP will take and expand all *'s
early in the pipeline and bind them to
objects (in this case, the list of
columns). It will then remove
unneeded columns due to the nature of
the query.

So for a simple EXISTS subquery like
this:

SELECT col1 FROM MyTable WHERE EXISTS
(SELECT * FROM Table2 WHERE
MyTable.col1=Table2.col2)The * will be
expanded to some potentially big
column list and then it will be
determined that the semantics of the
EXISTS does not require any of those
columns, so basically all of them can
be removed.

"SELECT 1" will avoid having to
examine any unneeded metadata for that
table during query compilation.

However, at runtime the two forms of
the query will be identical and will
have identical runtimes.

Edit: However I have looked at this in some detail since posting this answer and come to the conclusion that SELECT 1 does not avoid this column expansion. Full details here.

南城旧梦 2024-11-16 20:14:08

这个问题有一个答案,说是 MS Access 的某个版本实际上没有忽略 SELECT 子句的字段。我做过一些 Access 开发,并且听说 SELECT 1 是最佳实践,因此在我看来这很可能是“神话”的根源。

SQL EXISTS 使用变体的性能

This question has an answer that says it was some version of MS Access that actually did not ignore the field of the SELECT clause. I have done some Access development, and I have heard that SELECT 1 is best practice, so this seems very likely to me to be the source of the "myth."

Performance of SQL EXISTS usage variants

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文