SQL WHERE 子句是否短路评估?

发布于 2024-07-18 16:58:45 字数 391 浏览 8 评论 0原文

SQL WHERE 子句中的布尔表达式是否被短路计算 ?

例如:

SELECT * 
FROM Table t 
WHERE @key IS NULL OR (@key IS NOT NULL AND @key = t.Key) 

如果 @key IS NULL 计算结果为 true,则 @key IS NOT NULL AND @key = t.Key 是否被计算?

如果不是,为什么不呢?

如果有的话,有保证吗? 它是 ANSI SQL 的一部分还是特定于数据库的?

如果是特定数据库,SQLServer? 甲骨文? MySQL?

Are boolean expressions in SQL WHERE clauses short-circuit evaluated
?

For example:

SELECT * 
FROM Table t 
WHERE @key IS NULL OR (@key IS NOT NULL AND @key = t.Key) 

If @key IS NULL evaluates to true, is @key IS NOT NULL AND @key = t.Key evaluated?

If no, why not?

If yes, is it guaranteed? Is it part of ANSI SQL or is it database specific?

If database specific, SQLServer? Oracle? MySQL?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(16

つ低調成傷 2024-07-25 16:58:45

ANSI SQL 草案 2003 5WD-01-Framework-2003-09.pdf

6.3.3.3 规则评估顺序

[...]

优先级不是由格式或由
括号中,表达式的有效求值通常是
从左到右执行。 然而,它是
依赖于实现表达式是否实际上是从左到右计算的,特别是当操作数或运算符可能
导致引发条件或者如果表达式的结果
无需完全评估所有部分即可确定
表达。

ANSI SQL Draft 2003 5WD-01-Framework-2003-09.pdf

6.3.3.3 Rule evaluation order

[...]

Where the precedence is not determined by the Formats or by
parentheses, effective evaluation of expressions is generally
performed from left to right. However, it is
implementation-dependent whether expressions are actually evaluated left to right, particularly when operands or operators might
cause conditions to be raised or if the results of the expressions
can be determined without completely evaluating all parts of the
expression.

神魇的王 2024-07-25 16:58:45

综上所述,短路实际上是不可用的。

如果您需要它,我建议使用 Case 语句:

Where Case when Expr1 then Expr2 else Expr3 end = desiredResult

始终评估 Expr1,但每行仅评估 Expr2Expr3 之一。

From the above, short circuiting is not really available.

If you need it, I suggest a Case statement:

Where Case when Expr1 then Expr2 else Expr3 end = desiredResult

Expr1is always evaluated, but only one of Expr2 and Expr3 will be evaluated per row.

·深蓝 2024-07-25 16:58:45

我认为这是我将其写成没有短路的情况之一,原因有三个。

  1. 因为对于 MSSQL,它不是通过在明显的地方查看 BOL 来解决的,所以对我来说,这使得它典型地含糊不清。

    因为对于

  2. 因为至少我知道我的代码可以工作。 更重要的是,那些追随我的人也会如此,所以我不会让他们一遍又一遍地担心同一个问题。

  3. 我经常为多个 DBMS 产品编写内容,如果可以轻松解决这些差异,我不想记住这些差异。

    我经常为

I think this is one of the cases where I'd write it as if it didn't short-circuit, for three reasons.

  1. Because for MSSQL, it's not resolved by looking at BOL in the obvious place, so for me, that makes it canonically ambiguous.

  2. because at least then I know my code will work. And more importantly, so will those who come after me, so I'm not setting them up to worry through the same question over and over again.

  3. I write often enough for several DBMS products, and I don't want to have to remember the differences if I can work around them easily.

感性 2024-07-25 16:58:45

我不相信 SQL Server (2005) 中的短路是可以保证的。 SQL Server 通过其优化算法运行查询,该算法会考虑很多因素(索引、统计信息、表大小、资源等)以得出有效的执行计划。 经过这样的评估,你不能肯定地说你的短路逻辑得到了保证。

我自己不久前也遇到过同样的问题,但我的研究确实没有给我明确的答案。 您可以编写一个小查询来证明它有效,但是您可以确定随着数据库负载的增加,表变得越来越大,并且数据库中的内容得到优化和更改,该结论将抓住。 我不能,因此出于谨慎的考虑,在 WHERE 子句中使用了 CASE 来确保短路。

I don't believe that short circuiting in SQL Server (2005) is guaranteed. SQL Server runs your query through its optimization algorithm that takes into account a lot of things (indexes, statistics, table size, resources, etc) to come up with an effective execution plan. After this evaluation, you can't say for sure that your short circuit logic is guaranteed.

I ran into the same question myself sometime ago and my research really did not give me a definitive answer. You may write a small query to give you a sense of proof that it works but can you be sure that as the load on your database increases, the tables grow to be bigger, and things get optimized and changed in the database, that conclusion will hold. I could not and therefore erred on the side of caution and used CASE in WHERE clause to ensure short circuit.

对你的占有欲 2024-07-25 16:58:45

您必须记住数据库的工作原理。 给定参数化查询,数据库将基于该查询构建执行计划,而无需参数值。 每次运行查询时都会使用此查询,无论实际提供的值是什么。 查询是否与某些值短路对于执行计划并不重要。

You have to keep in mind how databases work. Given a parameterized query the db builds an execution plan based on that query without the values for the parameters. This query is used every time the query is run regardless of what the actual supplied values are. Whether the query short-circuits with certain values will not matter to the execution plan.

断爱 2024-07-25 16:58:45

我通常将其用于可选参数。 这和短路一样吗?

SELECT  [blah]
FROM    Emp
WHERE  ((@EmpID = -1) OR (@EmpID = EmpID))

这使我可以选择传递 -1 或其他值来解释属性的可选检查。 有时这涉及连接多个表,或者最好是一个视图。

非常方便,但不完全确定它给数据库引擎带来的额外工作。

I typically use this for optional parameters. Is this the same as short circuiting?

SELECT  [blah]
FROM    Emp
WHERE  ((@EmpID = -1) OR (@EmpID = EmpID))

This gives me the option to pass in -1 or whatever to account for optional checking of an attribute. Sometimes this involves joining on multiple tables, or preferably a view.

Very handy, not entirely sure of the extra work that it gives to the db engine.

假装爱人 2024-07-25 16:58:45

刚刚偶然发现了这个问题,并且已经找到了这个博客条目: http://rusanu.com/2009/09/13/on-sql-server-boolean-operator-short- Circuit/

SQL Server 可以自由地在她认为合适的任何地方优化查询,因此在博客文章中给出的示例中,您不能依赖短路。

然而,显然有一个案例被记录下来以书面顺序进行评估 - 检查该博客文章的评论。

Just stumbled over this question, and had already found this blog-entry: http://rusanu.com/2009/09/13/on-sql-server-boolean-operator-short-circuit/

The SQL server is free to optimize a query anywhere she sees fit, so in the example given in the blog post, you cannot rely on short-circuiting.

However, a CASE is apparently documented to evaluate in the written order - check the comments of that blog post.

錯遇了你 2024-07-25 16:58:45

对于 SQL Server,我认为这取决于版本,但我对 SQL Server 2000 的经验是,即使 @key 为 null,它仍然会评估 @key = t.Key。 换句话说,它在评估 WHERE 子句时不会进行有效的短路。

我看到人们推荐像您的示例这样的结构作为进行灵活查询的方式,用户可以在其中输入或不输入各种条件。 我的观察是,当 @key 为 null 时,Key 仍然参与查询计划,并且如果 Key 已建立索引,则它不会有效地使用索引。

这种具有不同条件的灵活查询可能是动态创建 SQL 真正最佳方法的一种情况。 如果 @key 为 null,那么您根本就不将其包含在查询中。

For SQL Server, I think it depends on the version but my experience with SQL Server 2000 is that it still evaluates @key = t.Key even when @key is null. In other words, it does not do efficient short circuiting when evaluating the WHERE clause.

I've seen people recommending a structure like your example as a way of doing a flexible query where the user can enter or not enter various criteria. My observation is that Key is still involved in the query plan when @key is null and if Key is indexed then it does not use the index efficiently.

This sort of flexible query with varying criteria is probably one case where dynamically created SQL is really the best way to go. If @key is null then you simply don't include it in the query at all.

寻找我们的幸福 2024-07-25 16:58:45

短路求值的主要特点是一旦确定结果就停止对表达式求值。 这意味着表达式的其余部分可以被忽略,因为无论是否计算结果都会相同。

二元布尔运算符是可交换的,这意味着:

a AND b == b AND a
a OR  b == b OR  a
a XOR b == b XOR a

因此不能保证计算顺序。 评估顺序将由查询优化器确定。

在带有对象的语言中,有时您可以编写只能通过短路求值来求值的布尔表达式。 您的示例代码构造通常用于此类语言(C#、Delphi、VB)。 例如:

if(someString == null | someString.Length == 0 )
  printf("no text in someString");

如果 someString == null,此 C# 示例将导致异常,因为它将被完全求值。 在短路评估中,它每次都会起作用。

SQL 仅对无法初始化的标量变量(无对象)进行操作,因此无法编写无法计算的布尔表达式。 如果你有一些 NULL 值,任何比较都将返回 false。

这意味着在 SQL 中你不能编写根据使用短路或完整评估而进行不同评估的表达式。

如果 SQL 实现使用短路评估,它只能有望加快查询执行速度。

Main characteristic of short circuit evaluation is that it stops evaluating the expression as soon as the result can be determined. That means that rest of expression can be ignored because result will be same regardless it is evaluated or not.

Binary boolean operators are comutative, meaning:

a AND b == b AND a
a OR  b == b OR  a
a XOR b == b XOR a

so there is no guarantee on order of evaluation. Order of evaluation will be determined by query optimizer.

In languages with objects there can be situations where you can write boolean expressions that can be evaluated only with short circuit evaluation. Your sample code construction is often used in such languages (C#, Delphi, VB). For example:

if(someString == null | someString.Length == 0 )
  printf("no text in someString");

This C# example will cause exception if someString == null because it will be fully evaluated. In short circuit evaluation, it will work every time.

SQL operates only on scalar variables (no objects) that cannot be uninitialized, so there is no way to write boolean expression that cannot be evaluated. If you have some NULL value, any comparison will return false.

That means that in SQL you cannot write expression that is differently evaluated depending on using short circuit or full evaluation.

If SQL implementation uses short circuit evaluation, it can only hopefully speed up query execution.

回忆躺在深渊里 2024-07-25 16:58:45

我不知道短路,但我也会将其写为 if-else 语句

if (@key is null)
begin

     SELECT * 
     FROM Table t 

end
else
begin

     SELECT * 
     FROM Table t 
     WHERE t.Key=@key

end

,变量应该始终位于等式的右侧。 这使得它可以被控制。

http://en.wikipedia.org/wiki/Sargable

i don't know about short circuting, but i'd write it as an if-else statement

if (@key is null)
begin

     SELECT * 
     FROM Table t 

end
else
begin

     SELECT * 
     FROM Table t 
     WHERE t.Key=@key

end

also, variables should always be on the right side of the equation. this makes it sargable.

http://en.wikipedia.org/wiki/Sargable

甜是你 2024-07-25 16:58:45

下面是对 SQL Server 2008 R2 的快速而肮脏的测试:

SELECT *
FROM table
WHERE 1=0
AND (function call to complex operation)

它立即返回,没有记录。 存在某种短路行为。

然后尝试这个:

SELECT *
FROM table
WHERE (a field from table) < 0
AND (function call to complex operation)

知道没有记录会满足这个条件:

(a field from table) < 0

这花了几秒钟,表明短路行为不再存在,并且正在为每个记录评估复杂的操作。

希望这对大家有帮助。

Below a quick and dirty test on SQL Server 2008 R2:

SELECT *
FROM table
WHERE 1=0
AND (function call to complex operation)

This returns immediately with no records. Kind of short circuit behavior was present.

Then tried this:

SELECT *
FROM table
WHERE (a field from table) < 0
AND (function call to complex operation)

knowing no record would satisfy this condition:

(a field from table) < 0

This took several seconds, indicating the short circuit behavior was not there any more and the complex operation was being evaluated for every record.

Hope this helps guys.

绝影如岚 2024-07-25 16:58:45

下面是一个演示,证明MySQL确实执行了WHERE子句短路

http://rextester .com/GVE4880

这将运行以下查询:

SELECT myint FROM mytable WHERE myint >= 3 OR myslowfunction('query #1', myint) = 1;
SELECT myint FROM mytable WHERE myslowfunction('query #2', myint) = 1 OR myint >= 3;

这些查询之间的唯一区别是 OR 条件中操作数的顺序。

myslowfunction 故意休眠一秒钟,并具有每次运行时向日志表添加条目的副作用。 以下是运行上述两个查询时记录的结果:

myslowfunction called for query #1 with value 1
myslowfunction called for query #1 with value 2
myslowfunction called for query #2 with value 1
myslowfunction called for query #2 with value 2
myslowfunction called for query #2 with value 3
myslowfunction called for query #2 with value 4

上面显示,当另一个操作数并不总是 true(由于短)时,慢速函数出现在 OR 条件的左侧时会执行更多次-电路)。

Here is a demo to prove that MySQL does perform WHERE clause short-circuiting:

http://rextester.com/GVE4880

This runs the following queries:

SELECT myint FROM mytable WHERE myint >= 3 OR myslowfunction('query #1', myint) = 1;
SELECT myint FROM mytable WHERE myslowfunction('query #2', myint) = 1 OR myint >= 3;

The only difference between these is the order of operands in the OR condition.

myslowfunction deliberately sleeps for a second and has the side effect of adding an entry to a log table each time it is run. Here are the results of what is logged when running the above two queries:

myslowfunction called for query #1 with value 1
myslowfunction called for query #1 with value 2
myslowfunction called for query #2 with value 1
myslowfunction called for query #2 with value 2
myslowfunction called for query #2 with value 3
myslowfunction called for query #2 with value 4

The above shows that a slow function is executed more times when it appears on the left side of an OR condition when the other operand isn't always true (due to short-circuiting).

落墨 2024-07-25 16:58:45

这在查询分析器中需要额外的 4 秒,所以从我所看到的 IF 甚至没有短路......

SET @ADate = NULL

IF (@ADate IS NOT NULL)
BEGIN
    INSERT INTO #ABla VALUES (1)
        (SELECT bla from a huge view)
END

如果有一个有保证的方法那就太好了!

This takes an extra 4 seconds in query analyzer, so from what I can see IF is not even shorted...

SET @ADate = NULL

IF (@ADate IS NOT NULL)
BEGIN
    INSERT INTO #ABla VALUES (1)
        (SELECT bla from a huge view)
END

It would be nice to have a guaranteed way!

碍人泪离人颜 2024-07-25 16:58:45

它似乎是

select null 其中 1/0=0 或 1=1
-> 除以 0 失败 err

select null where 1=1 1/0=0
-> 成功

it appears to be

select null where 1/0=0 or 1=1
-> fails with a divide by 0 err

select null where 1=1 1/0=0
-> succeeds

鱼窥荷 2024-07-25 16:58:45

简单的回答是:“短路”行为是未记录的实现。
这是一篇很好的文章,解释了这个主题。

了解 T-SQL 表达式短路

The quick answer is: The "short-circuit" behavior is undocumented implementation.
Here's an excellent article that explains this very topic.

Understanding T-SQL Expression Short-Circuiting

暮年慕年 2024-07-25 16:58:45

很明显,MS Sql 服务器支持短路理论,通过避免不必要的检查来提高性能,

支持示例:

SELECT 'TEST'
WHERE 1 = 'A'

SELECT 'TEST'
WHERE 1 = 1 OR 1 = 'A'

这里,第一个示例在将 varchar 值“A”转换为数据类型 int 时会导致错误“转换失败”。 '

而第二个条件很容易运行,因为条件 1 = 1 评估为 TRUE,因此第二个条件根本不运行。

此外,

SELECT 'TEST'
WHERE 1 = 0 OR 1 = 'A'

这里第一个条件将评估为 false,因此 DBMS 将执行第二个条件,并且您将再次得到转换错误,如上例所示。

注意:我写下错误条件只是为了了解该条件是执行还是短路
如果查询结果出错,则表示条件已执行,否则为短路。

简单解释

考虑,

WHERE 1 = 1 OR 2 = 2

当第一个条件被评估为TRUE时,评估第二个条件是没有意义的,因为它的评估值是任何值
根本不会影响结果,因此 Sql Server 有机会通过跳过不必要的条件检查或评估来节省查询执行时间。

“OR”的情况下,如果第一个条件评估为TRUE,则由“OR”连接的整个链将被视为评估为true,而不评估其他的。

condition1 OR condition2 OR ..... OR conditionN

如果条件 1 评估为真,则其余所有条件,直到条件 N 被跳过。
一般来说,在确定第一个 TRUE 时,将跳过由 OR 链接的所有其他条件。

考虑第二个条件

WHERE 1 = 0 AND 1 = 1

,因为第一个条件被评估为FALSE,评估第二个条件是没有意义的,因为它的评估值是任何值
根本不会影响结果,因此 Sql Server 再次有机会通过跳过不必要的条件检查或评估来节省查询执行时间。

“AND”的情况下,如果第一个条件评估为FALSE,则与“AND”连接的整个链将被视为评估为FALSE,而无需评价别人。

condition1 AND condition2 AND ..... conditionN

如果条件1的计算结果为FALSE,则保留所有条件,直到条件N被跳过。
一般来说,在确定第一个FALSE时,将跳过由AND链接的所有其他条件。

因此,明智的程序员应该始终以首先评估成本较低或最具消除性的条件的方式对条件链进行编程,
或者以能够获得最大短路效益的方式安排条件

It is but obvious that MS Sql server supports Short circuit theory, to improve the performance by avoiding unnecessary checking,

Supporting Example:

SELECT 'TEST'
WHERE 1 = 'A'

SELECT 'TEST'
WHERE 1 = 1 OR 1 = 'A'

Here, the first example would result into error 'Conversion failed when converting the varchar value 'A' to data type int.'

While the second runs easily as the condition 1 = 1 evaluated to TRUE and thus the second condition doesn't ran at all.

Further more

SELECT 'TEST'
WHERE 1 = 0 OR 1 = 'A'

here the first condition would evaluate to false and hence the DBMS would go for the second condition and again you will get the error of conversion as in above example.

NOTE: I WROTE THE ERRONEOUS CONDITION JUST TO REALIZE WEATHER THE CONDITION IS EXECUTED OR SHORT-CIRCUITED
IF QUERY RESULTS IN ERROR MEANS THE CONDITION EXECUTED, SHORT-CIRCUITED OTHERWISE.

SIMPLE EXPLANATION

Consider,

WHERE 1 = 1 OR 2 = 2

as the first condition is getting evaluated to TRUE, its meaningless to evaluate the second condition because its evaluation in whatever value
would not affect the result at all, so its good opportunity for Sql Server to save Query Execution time by skipping unnecessary condition checking or evaluation.

in case of "OR" if first condition is evaluated to TRUE the entire chain connected by "OR" would considered as evaluated to true without evaluating others.

condition1 OR condition2 OR ..... OR conditionN

if the condition1 is evaluated to true, rest all of the conditions till conditionN would be skipped.
In generalized words at determination of first TRUE, all other conditions linked by OR would be skipped.

Consider the second condition

WHERE 1 = 0 AND 1 = 1

as the first condition is getting evalutated to FALSE its meaningless to evaluate the second condition because its evaluation in whatever value
would not affect the result at all, so again its good opportunity for Sql Server to save Query Execution time by skipping unnecessary condition checking or evaluation.

in case of "AND" if first condition is evaluated to FALSE the entire chain connected with the "AND" would considered as evaluated to FALSE without evaluating others.

condition1 AND condition2 AND ..... conditionN

if the condition1 is evaluated to FALSE, rest all of the conditions till conditionN would be skipped.
In generalized words at determination of first FALSE, all other conditions linked by AND would be skipped.

THEREFOR, A WISE PROGRAMMER SHOULD ALWAYS PROGRAM THE CHAIN OF CONDITIONS IN SUCH A WAY THAT, LESS EXPENSIVE OR MOST ELIMINATING CONDITION GETS EVALUATED FIRST,
OR ARRANGE THE CONDITION IN SUCH A WAY THAT CAN TAKE MAXIMUM BENEFIT OF SHORT CIRCUIT

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文