SQL - 表别名范围
我学会了使用“存在”而不是“在”。
不好
select * from table where nameid in (
select nameid from othertable where otherdesc = 'SomeDesc' )
好
select * from table t where exists (
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeDesc' )
解释是:“这样做更好的原因是因为只会返回匹配的值,而不是构建大量可能结果的列表”。 这是否意味着第一个子查询可能返回 900 个结果,而第二个子查询将仅返回 1?
我曾遇到 RDBMS 抱怨:“只能检索前 1000 行”。 第二种方法可以解决这个问题吗?
第二个子查询中别名的范围是什么? 别名是否只存在于括号中?
例如
从存在的表 t 中选择 * ( 从 othertable o 中选择 nameid,其中 t.nameid = o.nameid 且 otherdesc = 'SomeDesc' )
和
从 othertable o 中选择 nameid,其中 t.nameid = o.nameid 且 otherdesc = 'SomeOtherDesc' )
也就是说,如果我在第二个
exist
中使用相同的别名(o
表示table
和othertable
) > 它会对第一个存在
带来任何问题吗? 或者他们是独立的?
这仅与 Oracle 相关还是对大多数 RDBMS 有效?
I learned to use "exists" instead of "in".
BAD
select * from table where nameid in (
select nameid from othertable where otherdesc = 'SomeDesc' )
GOOD
select * from table t where exists (
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeDesc' )
The explanation was: "The reason why this is better is because only the matching values will be returned instead of building a massive list of possible results". Does that mean that while the first subquery might return 900 results the second will return only 1?
I have had the RDBMS complain: "only the first 1000 rows might be retrieved". Would this second approach solve that problem?
What is the scope of the alias in the second subquery? Does the alias only live in the parenthesis?
For example
select * from table t where exists ( select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeDesc' )
and
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeOtherDesc' )
That is, if I use the same alias (
o
fortable
andothertable
) in the secondexist
will it present any problem with the firstexist
? Or are they independent?
Is this only Oracle-related or it is valid for most RDBMSs?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
它特定于每个 DBMS,并且取决于查询优化器。 一些优化器检测 IN 子句并对其进行翻译。
在我测试的所有 DBMS 中,别名仅在 ( ) 内有效
BTW,您可以将查询重写为:
并且,回答您的问题:
It's specific to each DBMS and depends on the query optimizer. Some optimizers detect IN clause and translate it.
In all DBMSes I tested, alias is only valid inside the ( )
BTW, you can rewrite the query as:
And, to answer your questions:
您正在踏入复杂的领域,称为“相关子查询”。 由于我们没有有关您的表和关键结构的详细信息,因此某些答案只能是“也许”。
在您的初始 IN 查询中,无论 OtherTable 是否包含列 NameID (实际上,OtherDesc 是否作为 Table 或 OtherTable 中的列存在,该符号都有效 - 这在您的任何示例中都不清楚,但大概是一列)其他表)。 这种行为使得相关子查询成为相关子查询。 当人们第一次遇到它时,它也是一种常见的焦虑来源——总是偶然的。 由于 SQL 标准规定,如果子查询中提到的表中没有具有相关名称的列,但存在具有以下名称的列,则将子查询中的名称解释为引用外部查询中的列。外部(主)查询中提到的表中的相关名称,想要声称符合(这一点)SQL 标准的产品不会做任何不同的事情。
Q1 的答案是“这取决于”,但考虑到合理的假设(NameID 作为两个表中的一列存在;OtherDesc 仅存在于 OtherTable 中),就返回的数据集而言,结果应该相同,但可能不同在性能方面相当。
Q2 的答案是,过去您使用的是劣质的 DBMS(如果不是有缺陷的话)。 如果它支持 EXISTS,那么 DBMS 可能仍然会抱怨结果的基数。
应用于第一个 EXISTS 查询的 Q3 的答案是“t 在整个语句中可用作别名,但 o 只能用作括号内的别名”。 应用于您的第二个示例框 - AND 连接两个子选择(当我查看它时,第二个子选择缺少左括号),然后“t 在整个语句中可用作别名,并指代相同的表,但有两个不同的别名都标记为“o”,每个子查询一个”。 请注意,如果 OtherTable 中给定的 NameID 值的 OtherDesc 是唯一的,则查询可能不会返回任何数据; 否则,它需要 OtherTable 中具有相同 NameID 的两行以及具有该 NameID 值的 Table 中的每行的两个 OtherDesc 值。
You are treading into complicated territory, known as 'correlated sub-queries'. Since we don't have detailed information about your tables and the key structures, some of the answers can only be 'maybe'.
In your initial IN query, the notation would be valid whether or not OtherTable contains a column NameID (and, indeed, whether OtherDesc exists as a column in Table or OtherTable - which is not clear in any of your examples, but presumably is a column of OtherTable). This behaviour is what makes a correlated sub-query into a correlated sub-query. It is also a routine source of angst for people when they first run into it - invariably by accident. Since the SQL standard mandates the behaviour of interpreting a name in the sub-query as referring to a column in the outer query if there is no column with the relevant name in the tables mentioned in the sub-query but there is a column with the relevant name in the tables mentioned in the outer (main) query, no product that wants to claim conformance to (this bit of) the SQL standard will do anything different.
The answer to your Q1 is "it depends", but given plausible assumptions (NameID exists as a column in both tables; OtherDesc only exists in OtherTable), the results should be the same in terms of the data set returned, but may not be equivalent in terms of performance.
The answer to your Q2 is that in the past, you were using an inferior if not defective DBMS. If it supported EXISTS, then the DBMS might still complain about the cardinality of the result.
The answer to your Q3 as applied to the first EXISTS query is "t is available as an alias throughout the statement, but o is only available as an alias inside the parentheses". As applied to your second example box - with AND connecting two sub-selects (the second of which is missing the open parenthesis when I'm looking at it), then "t is available as an alias throughout the statement and refers to the same table, but there are two different aliases both labelled 'o', one for each sub-query". Note that the query might return no data if OtherDesc is unique for a given NameID value in OtherTable; otherwise, it requires two rows in OtherTable with the same NameID and the two OtherDesc values for each row in Table with that NameID value.
就我个人而言,我会使用联接,而不是子查询。
Personally I would use a join, rather than a subquery for this.
很难一概而论,EXISTS 总是比 IN 更好。 从逻辑上来说,如果是这种情况,那么 SQL 社区就会用 EXISTS 替换 IN...
另外,请注意 IN 和 EXISTS 并不相同,当您使用两者时,结果可能会不同...
使用 IN,通常是对内表进行一次全表扫描,而不删除 NULL(因此,如果您的文件中有 NULL)内部表,IN 默认情况下不会删除 NULL)...而 EXISTS 会删除 NULL,并且在相关子查询的情况下,它会对外部查询中的每一行运行内部查询。
假设没有 NULL 并且它是一个简单的查询(没有相关性),如果您找到的行不是最后一行,则 EXIST 可能会执行得更好。 如果它恰好是最后一行,则 EXISTS 可能需要像 IN 一样扫描到最后......所以性能相似......
但是 IN 和 EXISTS 不能互换......
It is difficult to generalize that EXISTS is always better than IN. Logically if that is the case, then SQL community would have replaced IN with EXISTS...
Also, please note that IN and EXISTS are not same, the results may be different when you use the two...
With IN, usually its a Full Table Scan of the inner table once without removing NULLs (so if you have NULLs in your inner table, IN will not remove NULLS by default)... While EXISTS removes NULL and in case of correlated subquery, it runs inner query for every row from outer query.
Assuming there are no NULLS and its a simple query (with no correlation), EXIST might perform better if the row you are finding is not the last row. If it happens to be the last row, EXISTS may need to scan till the end like IN.. so similar performance...
But IN and EXISTS are not interchangeable...