条件排名
我正在尝试按客户的首次付款日期对客户的帐户进行排名。有时,他们开设的第一个帐户永远不会存入资金,因此“First_pay_date”的日期为“1900-01-01”。我想保留该行信息,但不想将其包含在排名中。
当前结果及代码:
CUST | ACCT | FIRST_PAY_DATE | RANK_BY_FIRST_PAY |
---|---|---|---|
JOHN H | JOHNH1 | 1900-01-01 | NULL |
JOHN H | JOHNH2 | 2000-02-25 | 2 |
JOHN H | JOHNH3 | 2001-03-21 | 3 |
JOHN H | JOHNH4 | 2002-12-01 | 4 |
所需结果:
CUST | ACCT | FIRST_PAY_DATE | RANK_BY_FIRST_PAY |
---|---|---|---|
JOHN H | JOHNH1 | 1900-01-01 | 0 |
JOHN H | JOHNH2 | 2000-02-25 | 1 |
JOHN H | JOHNH3 | 2001-03-21 | 2 |
JOHN H | JOHNH4 | 2002-12-01 | 3 |
SELECT
cust,
acct,
first_pay_date,
CASE
WHEN first_pay_date <> '1900-01-01' THEN RANK() OVER(PARTITION BY cust
ORDER BY
first_pay_date)
END rank_by_first_pay
FROM
acct_table
ORDER BY
first_pay_date ASC
I'm trying to rank accounts of a customer by their first payment date. Sometimes the first account they open is never funded and thus has a date '1900-01-01' for "First_pay_date". I want to keep that row of info but do not want to include it in the rank.
Current outcome with code:
CUST | ACCT | FIRST_PAY_DATE | RANK_BY_FIRST_PAY |
---|---|---|---|
JOHN H | JOHNH1 | 1900-01-01 | NULL |
JOHN H | JOHNH2 | 2000-02-25 | 2 |
JOHN H | JOHNH3 | 2001-03-21 | 3 |
JOHN H | JOHNH4 | 2002-12-01 | 4 |
Desired Result:
CUST | ACCT | FIRST_PAY_DATE | RANK_BY_FIRST_PAY |
---|---|---|---|
JOHN H | JOHNH1 | 1900-01-01 | 0 |
JOHN H | JOHNH2 | 2000-02-25 | 1 |
JOHN H | JOHNH3 | 2001-03-21 | 2 |
JOHN H | JOHNH4 | 2002-12-01 | 3 |
SELECT
cust,
acct,
first_pay_date,
CASE
WHEN first_pay_date <> '1900-01-01' THEN RANK() OVER(PARTITION BY cust
ORDER BY
first_pay_date)
END rank_by_first_pay
FROM
acct_table
ORDER BY
first_pay_date ASC
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
在示例输出中,对于不需要的“1900”日期,您的 RANK 函数为 0,但这会导致简单的
-1
答案。因此,更合乎逻辑的解决方案是制作
ADJUISTED_FIRST_PAY
您可以使用 NULLIF 并使用 ORDER NULLS LAST,然后使用 IFF 跳过该 VALUE长格式:
短格式:
可以是推到零,例如:
或者只是将 DATE 分区为您想要忽略的值,并忽略结果,或者强制它们为零/空
并清零
In your example output, your RANK function is 0 for the unwanted "1900" date, but what which leads to the simple
-1
answer.So a more logical solution is to making an
ADJUISTED_FIRST_PAY
you can use NULLIF and use the ORDER NULLS LAST, and then us an IFF to skip that VALUElong form:
short form:
which can be pushed to zero like:
OR just partition on DATE being the value you want to ignore, and ignore the results, or force them to zero/null
and zero'ed out
除非我遗漏了某些内容,否则
rank()... - 1
应该返回您想要的结果,不需要case
表达式:并且取决于您想要的行为如果出现平局,您可能需要使用
dense_rank()
而不是rank()
。Unless I'm missing something,
rank()... - 1
should return the result you want, no need for thecase
expression:And depending on the behaviour you want in case of ties, you might want to use
dense_rank()
instead ofrank()
.另一种选择是将“1900-01-01”推入空值并使用上面建议的 DENSE_RANK() 。这样做的好处是,如果帐户没有初始 1900-01-01 数据点,则它不会丢失。 1900-01-01 也很好地放置在排名的末尾 - 如果有多个实例,您会看到它们(而不是默默地隐藏数据错误)。
Another option would be to push the '1900-01-01' to null value and use DENSE_RANK() as recommended above. The benefit is if an account doesn't have the initial 1900-01-01 data point it's not lost. The 1900-01-01 are also nicely placed at the end of the ranking - and if there's more than one instance you'll see them (instead of silently hiding the data error).