条件排名

发布于 2025-01-11 23:59:08 字数 1408 浏览 0 评论 0原文

我正在尝试按客户的首次付款日期对客户的帐户进行排名。有时，他们开设的第一个帐户永远不会存入资金，因此“First_pay_date”的日期为“1900-01-01”。我想保留该行信息，但不想将其包含在排名中。

当前结果及代码：

CUST	ACCT	FIRST_PAY_DATE	RANK_BY_FIRST_PAY
JOHN H	JOHNH1	1900-01-01	NULL
JOHN H	JOHNH2	2000-02-25	2
JOHN H	JOHNH3	2001-03-21	3
JOHN H	JOHNH4	2002-12-01	4

所需结果：

CUST	ACCT	FIRST_PAY_DATE	RANK_BY_FIRST_PAY
JOHN H	JOHNH1	1900-01-01	0
JOHN H	JOHNH2	2000-02-25	1
JOHN H	JOHNH3	2001-03-21	2
JOHN H	JOHNH4	2002-12-01	3

SELECT
       cust,
       acct,
       first_pay_date,
       CASE
             WHEN first_pay_date <> '1900-01-01' THEN RANK() OVER(PARTITION BY cust
       ORDER BY
             first_pay_date)
       END rank_by_first_pay
FROM
       acct_table
ORDER BY
       first_pay_date ASC

原文

I'm trying to rank accounts of a customer by their first payment date. Sometimes the first account they open is never funded and thus has a date '1900-01-01' for "First_pay_date". I want to keep that row of info but do not want to include it in the rank.

Current outcome with code:

CUST	ACCT	FIRST_PAY_DATE	RANK_BY_FIRST_PAY
JOHN H	JOHNH1	1900-01-01	NULL
JOHN H	JOHNH2	2000-02-25	2
JOHN H	JOHNH3	2001-03-21	3
JOHN H	JOHNH4	2002-12-01	4

Desired Result:

CUST	ACCT	FIRST_PAY_DATE	RANK_BY_FIRST_PAY
JOHN H	JOHNH1	1900-01-01	0
JOHN H	JOHNH2	2000-02-25	1
JOHN H	JOHNH3	2001-03-21	2
JOHN H	JOHNH4	2002-12-01	3

SELECT
       cust,
       acct,
       first_pay_date,
       CASE
             WHEN first_pay_date <> '1900-01-01' THEN RANK() OVER(PARTITION BY cust
       ORDER BY
             first_pay_date)
       END rank_by_first_pay
FROM
       acct_table
ORDER BY
       first_pay_date ASC

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

海夕 2025-01-18 23:59:09

在示例输出中，对于不需要的“1900”日期，您的 RANK 函数为 0，但这会导致简单的 -1 答案。

因此，更合乎逻辑的解决方案是制作 ADJUISTED_FIRST_PAY 您可以使用 NULLIF 并使用 ORDER NULLS LAST，然后使用 IFF 跳过该 VALUE

长格式：

SELECT 
    cust 
    ,acct
    ,NULLIF(first_pay_date,'1900-01-01`) AS clean_date
    ,RANK() OVER (PARTITION BY cust ORDER BY clean_date NULLS LAST) AS ranked
    ,IFF(first_pay_date='1900-01-01`, NULL, ranked) AS answer
FROM
    sample_data

短格式：

SELECT 
    cust 
    ,acct
    ,IFF(first_pay_date='1900-01-01`, 
        NULL, 
        RANK() OVER (PARTITION BY cust ORDER BY NULLIF(first_pay_date,'1900-01-01`) NULLS LAST)) AS answer
FROM
    sample_data

可以是推到零，例如：

SELECT 
    cust 
    ,acct
    ,IFF(first_pay_date='1900-01-01`, 
        0, 
        RANK() OVER (PARTITION BY cust ORDER BY NULLIF(first_pay_date,'1900-01-01`) NULLS LAST)) AS answer
FROM
    sample_data

或者只是将 DATE 分区为您想要忽略的值，并忽略结果，或者强制它们为零/空

RANK() OVER (PARTITION BY cust, year(first_pay_date)=1900 ORDER BY first_pay_date)

并清零

SELECT 
    cust 
    ,acct
    ,year(first_pay_date)=1900, 
        0, 
        RANK() OVER (PARTITION BY cust, year(first_pay_date)=1900 ORDER BY first_pay_date)
    ) AS answer
FROM
    sample_data

In your example output, your RANK function is 0 for the unwanted "1900" date, but what which leads to the simple -1 answer.

So a more logical solution is to making an ADJUISTED_FIRST_PAY you can use NULLIF and use the ORDER NULLS LAST, and then us an IFF to skip that VALUE

long form:

SELECT 
    cust 
    ,acct
    ,NULLIF(first_pay_date,'1900-01-01`) AS clean_date
    ,RANK() OVER (PARTITION BY cust ORDER BY clean_date NULLS LAST) AS ranked
    ,IFF(first_pay_date='1900-01-01`, NULL, ranked) AS answer
FROM
    sample_data

short form:

SELECT 
    cust 
    ,acct
    ,IFF(first_pay_date='1900-01-01`, 
        NULL, 
        RANK() OVER (PARTITION BY cust ORDER BY NULLIF(first_pay_date,'1900-01-01`) NULLS LAST)) AS answer
FROM
    sample_data

which can be pushed to zero like:

SELECT 
    cust 
    ,acct
    ,IFF(first_pay_date='1900-01-01`, 
        0, 
        RANK() OVER (PARTITION BY cust ORDER BY NULLIF(first_pay_date,'1900-01-01`) NULLS LAST)) AS answer
FROM
    sample_data

OR just partition on DATE being the value you want to ignore, and ignore the results, or force them to zero/null

RANK() OVER (PARTITION BY cust, year(first_pay_date)=1900 ORDER BY first_pay_date)

and zero'ed out

SELECT 
    cust 
    ,acct
    ,year(first_pay_date)=1900, 
        0, 
        RANK() OVER (PARTITION BY cust, year(first_pay_date)=1900 ORDER BY first_pay_date)
    ) AS answer
FROM
    sample_data

回复收藏 0 原文

叹沉浮 2025-01-18 23:59:08

除非我遗漏了某些内容，否则 rank()... - 1 应该返回您想要的结果，不需要 case 表达式：

SELECT
       cust,
       acct,
       first_pay_date,
       RANK() OVER(PARTITION BY cust
       ORDER BY
             first_pay_date) - 1
       rank_by_first_pay
FROM
       acct_table
ORDER BY
       first_pay_date ASC

并且取决于您想要的行为如果出现平局，您可能需要使用 dense_rank() 而不是 rank()。

Unless I'm missing something, rank()... - 1 should return the result you want, no need for the case expression:

SELECT
       cust,
       acct,
       first_pay_date,
       RANK() OVER(PARTITION BY cust
       ORDER BY
             first_pay_date) - 1
       rank_by_first_pay
FROM
       acct_table
ORDER BY
       first_pay_date ASC

And depending on the behaviour you want in case of ties, you might want to use dense_rank() instead of rank().

回复收藏 0 原文

烟织青萝梦 2025-01-18 23:59:08

另一种选择是将“1900-01-01”推入空值并使用上面建议的 DENSE_RANK() 。这样做的好处是，如果帐户没有初始 1900-01-01 数据点，则它不会丢失。 1900-01-01 也很好地放置在排名的末尾 - 如果有多个实例，您会看到它们（而不是默默地隐藏数据错误）。

with sample_data as (
  select 'JOHN H'CUST ,'JOHNH1' ACCT,'1900-01-01'::date FIRST_PAY_DATE   
union select 'JOHN H'CUST,'JOHNH2' as ACCT,'2000-02-25' as FIRST_PAY_DATE   
union select 'JOHN H'CUST,'JOHNH3' as ACCT,'2001-03-21' as FIRST_PAY_DATE   
union select 'JOHN H'CUST,'JOHNH4' as ACCT,'2002-12-01' as FIRST_PAY_DATE   
union select 'JOBE' CUST,'JOBE1' ACCT,'2000-01-01' FIRST_PAY_DATE    
union select 'JOBE' CUST,'JOBE2' as ACCT,'2000-02-25' as FIRST_PAY_DATE 
union select 'JOBE' CUST,'JOBE3' as ACCT,'2001-03-21' as FIRST_PAY_DATE 
union select 'JOBE' CUST,'JOBE4' as ACCT,'2001-03-21' as FIRST_PAY_DATE )

select 
    CUST 
    ,ACCT
    ,REPLACE(FIRST_PAY_DATE,'1900-01-01','NULL') ADJUSTED_FIRST_PAY_DATE 
    ,DENSE_RANK() OVER (PARTITION BY CUST ORDER BY ADJUSTED_FIRST_PAY_DATE) RANKED_EX_1900
    ,DENSE_RANK() OVER (PARTITION BY CUST ORDER BY FIRST_PAY_DATE)-1 RANKED_MINUS_ONE
FROM
    SAMPLE_DATA

Another option would be to push the '1900-01-01' to null value and use DENSE_RANK() as recommended above. The benefit is if an account doesn't have the initial 1900-01-01 data point it's not lost. The 1900-01-01 are also nicely placed at the end of the ranking - and if there's more than one instance you'll see them (instead of silently hiding the data error).

with sample_data as (
  select 'JOHN H'CUST ,'JOHNH1' ACCT,'1900-01-01'::date FIRST_PAY_DATE   
union select 'JOHN H'CUST,'JOHNH2' as ACCT,'2000-02-25' as FIRST_PAY_DATE   
union select 'JOHN H'CUST,'JOHNH3' as ACCT,'2001-03-21' as FIRST_PAY_DATE   
union select 'JOHN H'CUST,'JOHNH4' as ACCT,'2002-12-01' as FIRST_PAY_DATE   
union select 'JOBE' CUST,'JOBE1' ACCT,'2000-01-01' FIRST_PAY_DATE    
union select 'JOBE' CUST,'JOBE2' as ACCT,'2000-02-25' as FIRST_PAY_DATE 
union select 'JOBE' CUST,'JOBE3' as ACCT,'2001-03-21' as FIRST_PAY_DATE 
union select 'JOBE' CUST,'JOBE4' as ACCT,'2001-03-21' as FIRST_PAY_DATE )

select 
    CUST 
    ,ACCT
    ,REPLACE(FIRST_PAY_DATE,'1900-01-01','NULL') ADJUSTED_FIRST_PAY_DATE 
    ,DENSE_RANK() OVER (PARTITION BY CUST ORDER BY ADJUSTED_FIRST_PAY_DATE) RANKED_EX_1900
    ,DENSE_RANK() OVER (PARTITION BY CUST ORDER BY FIRST_PAY_DATE)-1 RANKED_MINUS_ONE
FROM
    SAMPLE_DATA

回复收藏 0 原文

~没有更多了~