根据服务日期识别会员月份数据的差距
有两个数据集:
数据集 1:
成员 ID | MIN_SRVC_DT | NEW OR CURRENT |
---|---|---|
111 | 202104 | NEW |
222 | 201801 | NEW |
333 | 202002 | NEW |
数据集 2:
成员 ID | COVERAGE_MONTH |
---|---|
111 | 202102 |
111 | 202103 |
111 | 202104 |
111 | 202105 |
111 | 202106 |
222 | 202101 |
222 | 202102 |
222 | 202103 |
222 | 202104 |
222 | 202105 |
333 | 201909 |
333 | 201910 |
333 | 202101 |
333 | 202102 |
333 | 202103 |
333 | 202104 |
对于每个成员,如果他们在数据集 1 中被标记为“新”,但在数据集 1 中的 min_srvc_dt 之前在数据集 2 中没有连续两个成员月份,则翻转为“当前”。
最终结果表应如下所示:
数据集 1:
成员 ID | MIN_SRVC_DT | NEW OR CURRENT |
---|---|---|
111 | 202104 | NEW |
222 | 201801 | CURRENT |
333 | 202105 | CURRENT |
成员 111 保持为 NEW,因为它们在数据集 2 中具有覆盖月份 202102 和 202103。成员 222 切换到当前,因为最早报道他们从 202101 开始的月份,但他们的 min_srvc_dt 是 201801,因此他们需要 201801 之前 2 个月的覆盖月份(即 201712 和 201711)。成员 333 更改为当前,因为覆盖范围存在差距(他们的覆盖范围为 201909 和 201910,然后是 2021 年,但他们需要 202001 和 201912 的覆盖范围)
我如何在 SQL Server 中对此进行编码?如何使用数据集 1 中的 min_srvc_dt 找到数据集 2 中覆盖范围的差距?
我已经尝试过这个:
select memberID, coverage_month
, previous_month = lag(coverage_month) over (partition by memberID order by memberID, coverage_month)
into #temp
from #database2
select *
from #dataset1
where memberID not in (
select distinct memberID
from #dataset2
where min_srvc_dt between previous_month and coverage_month
)
但这只给出了我正在寻找的成员222。
然后我尝试了这个来尝试找出差距,但它并没有真正给出我正在寻找的东西......
select memberID, cvg_month, previous_month, cvg_month-previous_month as Month_Gap
from #temp
where CVG_MONTH-previous_month > 0 and CVG_MONTH-previous_month >= 1
There's two data sets:
Data set 1:
Member ID | MIN_SRVC_DT | NEW OR CURRENT |
---|---|---|
111 | 202104 | NEW |
222 | 201801 | NEW |
333 | 202002 | NEW |
Data set 2:
Member ID | COVERAGE_MONTH |
---|---|
111 | 202102 |
111 | 202103 |
111 | 202104 |
111 | 202105 |
111 | 202106 |
222 | 202101 |
222 | 202102 |
222 | 202103 |
222 | 202104 |
222 | 202105 |
333 | 201909 |
333 | 201910 |
333 | 202101 |
333 | 202102 |
333 | 202103 |
333 | 202104 |
For each member, if they are tagged as "NEW" in dataset 1, but do not have two consecutive member months in dataset 2 PRIOR to the min_srvc_dt in dataset 1, then flip to "CURRENT".
The ending result table should look like this:
Data set 1:
Member ID | MIN_SRVC_DT | NEW OR CURRENT |
---|---|---|
111 | 202104 | NEW |
222 | 201801 | CURRENT |
333 | 202105 | CURRENT |
Member 111 stays NEW because they have coverage months 202102 and 202103 in Dataset 2. Member 222 switches to current because the earliest coverage month they have starts in 202101 but their min_srvc_dt is 201801 so they needed a coverage month 2 months prior to 201801 (i.e. 201712 and 201711). Member 333 changes to current because there is a gap in coverage (they have coverage 201909 and 201910 and then in 2021, but they needed coverage in 202001 and 201912)
How can I code this in SQL Server? How can I find gaps in coverage in dataset 2 using min_srvc_dt in data set 1?
I tried this already:
select memberID, coverage_month
, previous_month = lag(coverage_month) over (partition by memberID order by memberID, coverage_month)
into #temp
from #database2
select *
from #dataset1
where memberID not in (
select distinct memberID
from #dataset2
where min_srvc_dt between previous_month and coverage_month
)
But this only gave what I was looking for member 222.
Then I tried this to try to identify gaps but it didn't really give me what I was looking for...
select memberID, cvg_month, previous_month, cvg_month-previous_month as Month_Gap
from #temp
where CVG_MONTH-previous_month > 0 and CVG_MONTH-previous_month >= 1
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
根据您的示例数据,您可以使用此查询来更新您的表,我使用
COVERAGE_MONTH
作为 int,您可以将它们用作日期。based on your sample data, you can use this query to update your table, am using
COVERAGE_MONTH
as an int you can use them as dates.