检索每个日期之前 n 天的值

发布于 2024-10-12 06:05:56 字数 460 浏览 5 评论 0原文

我有一个数据集如下,我需要检索两件事:1)每个日期的(date-1)和(date-3)之间的VALUE总和,2)在5天内是否有>= VALUE 为 0 的两天。我认为应该使用 PROC SQL,但我不确定如何实现这一点。 输入数据集:

ID   DATE      VALUE
1   20110101     0
1   20110102     0
1   20110103     1
1   20110104     2
2   20110101     1
2   20110102     2
2   20110103     3
2   20110104     4 

对于 ID1,20110104,输出应为 1) 1 (0+0+1);对于 ID2,20110104,输出应为 6 (1+2+3)。 2) ID1、20110104 的标记,因为有 2 天3 天窗口内的值为 0。

非常感谢任何帮助!

I have a dataset as follows and I need to retrieve two things: 1) the sum of VALUE between (date-1) and (date-3) for each date and 2) whether, during the 5 days, there are >= two days where the VALUE is 0. I think PROC SQL should be used but I'm not sure how to implement this.
INPUT DATASET:

ID   DATE      VALUE
1   20110101     0
1   20110102     0
1   20110103     1
1   20110104     2
2   20110101     1
2   20110102     2
2   20110103     3
2   20110104     4 

Output should be 1) 1 (0+0+1) for ID1, 20110104 and 6 (1+2+3) for ID2, 20110104. and 2) a mark for ID1, 20110104, since there are 2 days with a value of 0 during the 3-day window.

Any help is greatly appreciated!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

厌味 2024-10-19 06:05:56

这两个问题都可以通过类似的 SQL 查询来解决。你的第二个问题有点令人困惑,因为你曾经提到过 5 天的周期和一次 3 天的窗口。我对这两个查询使用了相同的 3 天窗口,因此如果您需要另一个窗口,请修改开始日期和结束日期。

1)

proc sql;
 select t1.id, t1.date, sum(t2.value) as totalvalue
 from _input t1
 left join _input t2
 on t1.date-4 lt t2.date
 and t1.date gt t2.date
 and t1.id = t2.id
 group by t1.id, t1.date;
quit;

2)

proc sql;
 select t1.id, t1.date
 from _input t1
 left join _input t2
 on t1.date-4 lt t2.date
 and t1.date gt t2.date
 and t1.id = t2.id
 and t2.value = 0
 group by t1.id, t1.date
 having count(*) ge 2
;
quit;

Both problems can be solved with a similar SQL query. Your second question is a bit confusing, because you once mention a 5 day periode and once a 3 day window. I used the same 3 day window for both queries, so modify the start and end date if you need another window.

1)

proc sql;
 select t1.id, t1.date, sum(t2.value) as totalvalue
 from _input t1
 left join _input t2
 on t1.date-4 lt t2.date
 and t1.date gt t2.date
 and t1.id = t2.id
 group by t1.id, t1.date;
quit;

2)

proc sql;
 select t1.id, t1.date
 from _input t1
 left join _input t2
 on t1.date-4 lt t2.date
 and t1.date gt t2.date
 and t1.id = t2.id
 and t2.value = 0
 group by t1.id, t1.date
 having count(*) ge 2
;
quit;
匿名。 2024-10-19 06:05:56

这是仅使用数据步骤的另一种方法。我假设您不希望少于三个记录的范围的总和和标记,因此数据步骤将它们显式设置为未定义。

proc sort data=sample;
    by id date;
run;

data result(drop=k count);
    retain count;
    set sample;
    by id;

    if first.id then count=0;
    sum=lag1(value) + lag2(value) + lag3(value);
    if count<3 then sum=.;

    k=0;
    if lag1(value)=0 then k=k+1;
    if lag2(value)=0 then k=k+1;
    if lag3(value)=0 then k=k+1;
    if k ge 2 then mark=1;

    count=count+1;

run;

proc print data=result;
run;

Here is an alternate way that just uses a data step. I'm assuming that you don't want sums and marks for ranges of less than three records so the data step explicitly sets them to undefined.

proc sort data=sample;
    by id date;
run;

data result(drop=k count);
    retain count;
    set sample;
    by id;

    if first.id then count=0;
    sum=lag1(value) + lag2(value) + lag3(value);
    if count<3 then sum=.;

    k=0;
    if lag1(value)=0 then k=k+1;
    if lag2(value)=0 then k=k+1;
    if lag3(value)=0 then k=k+1;
    if k ge 2 then mark=1;

    count=count+1;

run;

proc print data=result;
run;
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文