计算 SQL 公用表表达式中列的中位数
在 MSSQL2008 中,我尝试使用经典中值查询从公共表表达式计算一列数字的中值,如下所示:
WITH cte AS
(
SELECT number
FROM table
)
SELECT cte.*,
(SELECT
(SELECT (
(SELECT TOP 1 cte.number
FROM
(SELECT TOP 50 PERCENT cte.number
FROM cte
ORDER BY cte.number) AS medianSubquery1
ORDER BY cte.number DESC)
+
(SELECT TOP 1 cte.number
FROM
(SELECT TOP 50 PERCENT cte.number
FROM cte
ORDER BY cte.number DESC) AS medianSubquery2
ORDER BY cte.number ASC) ) / 2)) AS median
FROM cte
ORDER BY cte.number
我得到的结果集如下:
NUMBER MEDIAN
x1 x1
x1 x1
x1 x1
x2 x2
x3 x3
换句话说,“中值”列是相同的当我期望中位数列一直向下为“x1”时,作为“数字”列。我使用类似的表达式来计算模式,并且它在相同的公用表表达式上运行良好。
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
这是一种稍微不同的方法:
Here's a slightly different way to do it:
您的查询的问题是您正在执行
SELECT TOP 1 cte.number FROM... ,
但它与子查询不相关,它与外部查询相关,因此子查询不相关。这就解释了为什么你最终会得到相同的值。删除
cte.
(如下)得出 CTE 的中值。这是一个恒定值。你想做什么?退货
The problem with your query is that you are doing
SELECT TOP 1 cte.number FROM...
but it isn't correlated with the sub query it is correlated with the Outer query so the subquery is irrelevant. Which explains why you simply end up with the same value all the way down. Removing the
cte.
(as below) gives the median of the CTE. Which is a constant value. What are you trying to do?Returns
这并不是一个全新的答案,因为它主要扩展了 Mark Byer 的答案,但有几个选项可以进一步简化查询。
首先是真正利用 CTE。您不仅可以有多个 CTE,而且它们可以互相引用。考虑到这一点,我们可以创建一个额外的 CTE 来根据第一个结果计算中值。这封装了中值计算,并让实际的 SELECT 只做它需要做的事情。请注意,ROW_NUMBER() 必须移至第一个 CTE。
为了进一步降低复杂性,您“可以”使用自定义 CLR 聚合来处理中位数(例如 http://www.SQLsharp.com/ [我是其作者])。
This is not an entirely new answer as it mostly expands on Mark Byer's answer, but there are a couple of options for simplifying the query even further.
The first thing is to really make use of CTE's. Not only can you have multiple CTE's, but they can refer to each other. With this in mind, we can create an additional CTE to compute the median based on the results of the first. This encapsulates the median computation and leaves the actual SELECT to do only what it needs to do. Note that the ROW_NUMBER() had to be moved into the first CTE.
And to further reduce complexity, you "could" use a custom CLR Aggregate to handle the Median (such as the one provided in the free SQL# library at http://www.SQLsharp.com/ [which I am the author of]).