简化(别名)T-SQL CASE 语句。 有什么可以改进的吗?
如您所见,这很糟糕。 还有其他选择吗? 我尝试在 group by 子句中使用列别名但无济于事。
select count(callid) ,
case
when callDuration > 0 and callDuration < 30 then 1
when callDuration >= 30 and callDuration < 60 then 2
when callDuration >= 60 and callDuration < 120 then 3
when callDuration >= 120 and callDuration < 180 then 4
when callDuration >= 180 and callDuration < 240 then 5
when callDuration >= 240 and callDuration < 300 then 6
when callDuration >= 300 and callDuration < 360 then 7
when callDuration >= 360 and callDuration < 420 then 8
when callDuration >= 420 and callDuration < 480 then 9
when callDuration >= 480 and callDuration < 540 then 10
when callDuration >= 540 and callDuration < 600 then 11
when callDuration >= 600 then 12
end as duration
from callmetatbl
where programid = 1001 and callDuration > 0
group by case
when callDuration > 0 and callDuration < 30 then 1
when callDuration >= 30 and callDuration < 60 then 2
when callDuration >= 60 and callDuration < 120 then 3
when callDuration >= 120 and callDuration < 180 then 4
when callDuration >= 180 and callDuration < 240 then 5
when callDuration >= 240 and callDuration < 300 then 6
when callDuration >= 300 and callDuration < 360 then 7
when callDuration >= 360 and callDuration < 420 then 8
when callDuration >= 420 and callDuration < 480 then 9
when callDuration >= 480 and callDuration < 540 then 10
when callDuration >= 540 and callDuration < 600 then 11
when callDuration >= 600 then 12
end
编辑: 我真的想问如何拥有单个案例源,但无论如何都欢迎案例修改(尽管不太有用,因为间隔可能会被修改,甚至可能会自动生成)。
正如某些人所考虑的那样, callDuration 确实是一个浮点数,因此通过将值排除在间隔之外,某些列出的解决方案对于我的用例无效。
经验教训:
寻找 case 表达式中的模式,以在可能且值得的情况下减少它
<前><代码>案例 当通话持续时间> 0 AND 通话持续时间 < 30 然后 1 当通话持续时间> 600 然后 12 else 楼层(callDuration/60) + 2 结束 结束为持续时间使用内联视图获得 case 的单一来源
选择计数(d.callid), d.duration 从 ( 选择卡利德 , 案件 当通话持续时间> 0 AND 通话持续时间 < 30 然后 1 当通话持续时间> 600 然后 12 else 楼层(callDuration/60) + 2 结束 结束为持续时间 来自 callmetatbl 其中programid = 1001 并且通话持续时间> 0 ) d 按 d.duration 分组
或使用公用表表达式
,其中uration_case为( 选择卡利德, 案件 当通话持续时间> 0 AND 通话持续时间 < 30 然后 1 当通话持续时间> 600 然后 12 else 楼层(callDuration/60) + 2 结束 结束为持续时间 来自 callmetatbl 其中programid = 1001并且callDuration > 0) 选择计数(callid)、持续时间 来自持续时间案例 按持续时间分组
或者使用用户定义的函数(目前还没有示例 :-) )
或者使用查找表和联接
声明@t TABLE(durationFrom float,durationTo float,结果INT) --用值填充表以便查询有效 选择计数(callid),COALESCE(t.结果,12) from callmetatbl JOIN @t AS t ON callDuration >= t.durationFrom AND 通话持续时间 < t.durationTo 其中programid = 1001并且callDuration > 0
感谢大家,我很难选择一个可接受的答案,因为许多人涵盖了问题的不同部分(我在那里认为这是一个简单的问题和简单的答案:-),很抱歉造成混乱)。
As you can see, this sucks big time. Any alternative? I've tried using the column alias in the group by clause to no avail.
select count(callid) ,
case
when callDuration > 0 and callDuration < 30 then 1
when callDuration >= 30 and callDuration < 60 then 2
when callDuration >= 60 and callDuration < 120 then 3
when callDuration >= 120 and callDuration < 180 then 4
when callDuration >= 180 and callDuration < 240 then 5
when callDuration >= 240 and callDuration < 300 then 6
when callDuration >= 300 and callDuration < 360 then 7
when callDuration >= 360 and callDuration < 420 then 8
when callDuration >= 420 and callDuration < 480 then 9
when callDuration >= 480 and callDuration < 540 then 10
when callDuration >= 540 and callDuration < 600 then 11
when callDuration >= 600 then 12
end as duration
from callmetatbl
where programid = 1001 and callDuration > 0
group by case
when callDuration > 0 and callDuration < 30 then 1
when callDuration >= 30 and callDuration < 60 then 2
when callDuration >= 60 and callDuration < 120 then 3
when callDuration >= 120 and callDuration < 180 then 4
when callDuration >= 180 and callDuration < 240 then 5
when callDuration >= 240 and callDuration < 300 then 6
when callDuration >= 300 and callDuration < 360 then 7
when callDuration >= 360 and callDuration < 420 then 8
when callDuration >= 420 and callDuration < 480 then 9
when callDuration >= 480 and callDuration < 540 then 10
when callDuration >= 540 and callDuration < 600 then 11
when callDuration >= 600 then 12
end
EDIT:
I really meant to ask how to have a single case source, but case modifications are welcome anyway (although less useful because the intervals probably will be modified and might even be automatically generated).
As has been considered by some people, callDuration is indeed a float so some listed solutions are not valid for my use case, by leaving values out of the intervals.
Lessons:
Look for patterns in the case expression to reduce it if possible and worthwhile
case when callDuration > 0 AND callDuration < 30 then 1 when callDuration > 600 then 12 else floor(callDuration/60) + 2 end end as duration
Use inline views to have a single source of the case
select count(d.callid), d.duration from ( select callid , case when callDuration > 0 AND callDuration < 30 then 1 when callDuration > 600 then 12 else floor(callDuration/60) + 2 end end as duration from callmetatbl where programid = 1001 and callDuration > 0 ) d group by d.duration
Or use common table expressions
with duration_case as ( select callid , case when callDuration > 0 AND callDuration < 30 then 1 when callDuration > 600 then 12 else floor(callDuration/60) + 2 end end as duration from callmetatbl where programid = 1001 and callDuration > 0 ) select count(callid), duration from duration_case group by duration
Or use an user defined function (no example so far :-) )
Or use a lookup table and a join
DECLARE @t TABLE(durationFrom float, durationTo float, result INT) --populate table with values so the query works select count(callid) , COALESCE(t.result, 12) from callmetatbl JOIN @t AS t ON callDuration >= t.durationFrom AND callDuration < t.durationTo where programid = 1001 and callDuration > 0
Thanks to everybody and I'm having a very difficult time choosing an accepted answer, as many covered different parts of the question (and I was there thinking it was a simple question with a straightforward answer :-), sorry for the confusion).
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(11)
问:如何获取要在 GROUP BY 子句中使用的别名
一种方法是使用内联视图。 [编辑] Remus Rusanu(+1!)的答案给出了一个公共表表达式的示例来完成同样的事情。 [/编辑]
内联视图为复杂表达式提供一个简单的“别名”,然后您可以在外部查询的 GROUP BY 子句中引用它:
让我们将其解压。
d
),我们可以引用别名duration
来自d
这应该足以回答您的问题。 如果您正在寻找等效的替换表达式,那么 tekBlues (+1 !) 中的表达式是正确的答案(它适用于边界和非整数。 )
使用 tekBlues 的替换表达式(+1!):(
这应该足以回答您的问题。)
[更新:]示例用户定义函数(内联 CASE 表达式的替换)
注意:请注意,用户定义的函数会增加开销,并且(当然)会添加对另一个数据库对象的依赖关系。
此示例函数与原始表达式等效。 OP CASE 表达式没有任何间隙,但它确实引用每个“断点”两次,我更喜欢仅测试下限。 (满足条件时 CASE 返回。反向执行测试会让未处理的情况(<=0 或 NULL)在没有测试的情况下失败,
ELSE NULL
不是必需的,但可以添加 。其他详细信息
(请务必检查性能和优化器计划,以确保它与原始版本相同(或没有明显差于)。过去,我在获取性能方面遇到了问题 谓词推入内联视图,看起来在您的情况下不会出现问题。)
存储视图
请注意,内联视图也可以存储为视图定义但是,除了从语句中“隐藏”复杂表达式之外,没有理由这样做。
简化复杂表达式
另一种使复杂表达式“更简单”的方法是使用但是用户定义的函数有其自身的一系列问题(包括性能下降)。
添加数据库“查找”表
一些答案建议向数据库添加“查找”表。 我不认为这真的有必要。 当然可以这样做,如果您希望能够从
callDuration
动态地派生不同的duration
值,无需 必须修改查询并且无需运行任何DDL 语句(例如更改视图定义或修改用户定义的函数)。通过联接到“查找”表,一个好处是您只需在“查找”表上执行 DML 操作即可使查询返回不同的结果集。
但同样的优势实际上也可能是一个缺点。
仔细考虑好处是否真的大于坏处。 考虑新表对单元测试的影响,如何验证查找表的内容有效且未更改(任何重叠?任何间隙?),对代码持续维护的影响(由于额外的复杂性)。
一些重大假设
这里给出的很多答案似乎都假设
callDuration
是一个INTEGER数据类型。 看来他们忽略了它不是整数的可能性,但也许我错过了问题中的那个金块。这是一个相当简单的测试用例来证明:
不等于
Q: how to get an alias to use in the GROUP BY clause
One approach is to use an inline view. [EDIT] The answer from Remus Rusanu (+1!) gives an example of a Common Table Expression to accomplish the same thing. [/EDIT]
The inline view gets you a simple "alias" for the complex expression which you can then reference in a GROUP BY clause in an outer query:
Let's unpack that.
d
)duration
fromd
That should be sufficient to answer your question. If you're looking for an equivalent replacement expression, the one from tekBlues (+1 !) is the right answer (it works on the boundary and for non-integers.)
With the replacement expression from tekBlues (+1!):
(This should be sufficient to answer your question.)
[UPDATE:] sample user defined function (a replacement for inline CASE expression)
NOTES: be aware that the user defined function will add overhead, and (of course) add a dependency on another database object.
This example function is equivalent to the original expression. The OP CASE expression doesn't have any gaps, but it does reference each "breakpoint" twice, I prefer to test only the lower bound. (CASE returns when a condition is satisfied. Doing the tests in reverse lets the unhandled case (<=0 or NULL) fall through without test, an
ELSE NULL
is not necessary, but could be added for completeness.ADDITIONAL DETAILS
(Be sure to check the performance and the optimizer plan, to make sure it's the same as (or not significantly worse than) the original. In the past, I've had problems getting predicates pushed into the inline view, doesn't look like it will be a problem in your case.)
stored view
Note that the inline view could also be stored as view definition in the database. But there's no reason to do that, other than to "hide" the complex expression from your statement.
simplifying the complex expression
Another way to make a complex expression "simpler" is to use a user defined function. But a user defined function comes with its own set of issues (including degraded performance.)
add database "lookup" table
Some answers recommend adding a "lookup" table to the database. I don't see that this is really necessary. It could be done of course, and could make sense if you want to be able to derive different values for
duration
fromcallDuration
, on the fly, without having to modify your query and without having to run any DDL statements (e.g. to alter a view definition, or modify a user defined function).With a join to a "lookup" table, one benefit is that you could make the query return different result sets by just performing DML operations on the "lookup" table.
But that same advantage may actually be a drawback as well.
Consider carefully if the benefit actually outweighs the downside. Consider the impact that new table will have on unit testing, how to verify the contents of the lookup table are valid and not changed (any overlaps? any gaps?), impact on ongoing maintenance to the code (due to the additional complexity).
some BIG assumptions
A lot of the answers given here seem to assume that
callDuration
is an INTEGER datatype. It seems they have overlooked the possibility that it's not an integer, but maybe I missed that nugget in the question.It's fairly simple test case to demonstrate that:
is NOT equivalent to
您有什么原因不使用
Between
吗? 案例陈述本身看起来并不算太糟糕。 如果您真的讨厌它,您可以将所有这些放入表格中并绘制出来。等等...
编辑:或者,在使用浮动并且
Between 变得很麻烦的情况下。
Is there any reason you're not using
between
? The case statements themselves don't look too bad. If you really hate it you could throw all this into a table and map it.etc...
EDIT: Or, in a case where floats are being used and
between
becomes cumbersome.这种情况可以这样写:
不需要,将其替换为“where callduration> 0”
我喜欢之前给出的翻译表答案! 这是最好的解决方案
the case can be written like this:
The having is not needed, replace it by a "where callduration>0"
I like the translate table answer given before! that's the best solution
您需要将 CASE 进一步推向查询树,以便其投影对 GROUP BY 可见。 这可以通过两种方式实现:
使用通用表表达式
两种解决方案在各方面都是等效的。 我发现 CTE 更具可读性,有些人更喜欢派生表,因为它更便携。
You need to push the CASE further down the query tree so that its projection is visible to the GROUP BY. This can be achieve in two ways:
Use a common table expressions
Both solutions are equivalent in every respect. I find CTEs more readable, some prefer derived tables as more portable.
将
callDuration
除以 60:请注意,
Between
包含边界,并且我假设 callDuration 将被视为整数。更新:
将其与其他一些答案结合起来,您可以将整个查询归结为:
Divide
callDuration
by 60:Note that
between
is inclusive of the bounds, and I'm assuming callDuration will be treated as an integer.Update:
Combine this with some of the other answers, and you can get the entire query down to this:
未经测试:
Untested:
将所有案例添加到表变量中并进行外连接
Add all the cases into a table variable and do an outer join
这是我的尝试。 您需要的所有组件都可以直接使用 SQL 完成。
这是我用于测试的 SQL。 (我没有自己的个人 callmetatbl ;)
SQL 输出如下所示,为每个持续时间(存储桶)1 到 12 计数 2 条记录。
以下是“foo”子项的结果-查询:
干杯。
Here's my shot at it. All of the components you need can be done in straight SQL.
Here's the SQL I used for testing. (I don't have my own personal callmetatbl ;)
The SQL output is shown below, as 2 records counted for each duration (bucket) 1 through 12.
Here are the results from the "foo" sub-query:
Cheers.
这里的用户定义函数有什么问题吗? 您可以通过这种方式直观地清理代码并集中功能。 就性能而言,我认为打击不会太可怕,除非您在上述 UDF 中做了一些真正落后的事情。
What's so wrong with a User Defined Function here? You could both visually clean up the code and centralize the functionality that way. Performance-wise, I can't see the hit being too horrible unless you are doing something really retarded within said UDF.
创建
持续时间
的查找表使用查找表也会加快
SELECT
语句的速度。这是使用查找表的最终结果。
这是查找表。
Create a lookup table for
duration
Using a look up table will speed up the
SELECT
statement as well.Here is the end result of how it will look with lookup table.
Here is the look up table.