How to get every hundredth row per ID in a dataset? SAS

Posted on 2025-02-02 19:35:35

I have a list of 10000 observations, 1000 per code number (ID):

proc surveyselect data=code_nums out=bootstrapped_code_names (keep=code_num quantity replicate) sampsize=1825 method=urs outhits noprint seed=0 rep=1000;
strata code_num;
run;

proc sql noprint;
create table test as
select code_num
,replicate
,sum(quantity) as total_qty
from bootstrapped_code_names
group by code_num, replicate
order by code_num, total_qty
;
quit;

To get this:

data work.code;
input code_num $9. replicate total_qty;
datalines;
123456780 87   0
123456780 34   0
123456780 837  2
123456780 475  4
123456780 74   5
123456780 507  9
123456780 28   9
123456788 76   3
;
run;

I would like to obtain a dataset that captures every hundredth observation for every ID, to get something like this:

id replicate qty
12345 174 54
12345 56 76
12345 876 98
12345 354 102
12345 7 106
12345 45 113
12345 128 123
12345 23 156
12345 948 168
12347 465 24
12347 374 36
12347 58 55
12347 38 78
12347 11 109
12347 283 117
12347 90 133
12347 23 145
12347 948 160

and so on for each ID.

I tried this code, but it doesn't work per distinct ID:

%macro test;
  %do j=10 %to 90 %by 10;
    %global row&j.;
    %let row&j. = %sysevalf(1000*&j./100, floor);
  %end;
%mend test;
%test

data get_rows;
set code;
if _N_ in(&row10., &row20., &row30., &row40., &row50., &row60., &row70., &row80., &row90.);
run;

data _null_;
set get_rows;
row = _N_*10;
call symputx('row' ||strip(row), replicate, 'G');
run;

How can I get every hundredth row per ID? Each ID has 1000 rows.

Comments (2)

小红帽 2025-02-09 19:35:35

Not sure I follow your full problem description, but here is simple code to select the first, 101st, 201st, etc. observation per ID value. The source data must already be sorted by the ID variable(s). To select the 100th, 200th, etc. observation instead, test for 0=mod(_n_,100).

data want;
  do _n_=1 by 1 until(last.id);
    set have;
    by id;
    if 1=mod(_n_,100) then output;
  end;
run;

挽心 2025-02-09 19:35:35

Modifying the code from yesterday to use the MOD function, which checks whether the per-ID counter is divisible by 100. Change the divisor in the MOD call to select rows at a different interval.

https://documentation.sas.com/doc/en/vdmmlcdc/8.1/ds2ref/n0t9j8b09x4uphn1kl1i70x63z19.htm

data work.code;
input code_num $9. replicate total_qty;
datalines;
123456780 87   0
123456780 34   0
123456780  837  2
123456780 475  4
123456780 74   5
123456780 507  9
123456780 28   9
123456780 87   0
123456780 34   0
123456780  837  2
123456780 475  4
123456780 74   5
123456780 507  9
123456780 28   9
123456788 76   3
123456788 76   3
123456788 76   3
123456788 76   3
123456788 76   3
;;;;


data code_counter;
set code;
by code_num;

if first.code_num then count=1;
else count+1;
run;

data code100;
set code_counter;
if mod(count, 100)=0 then output;
run;
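
The two steps above could probably be merged into one. The following is only a sketch, assuming the same `code` dataset, already sorted by `code_num`, and reusing the `count` variable name from above. With the short sample data it selects nothing, since no ID reaches 100 rows.

```sas
data code100;
  set code;
  by code_num;                       /* assumes code is sorted by code_num */
  if first.code_num then count = 0;  /* restart the counter for each ID */
  count + 1;                         /* sum statement: count is retained across rows */
  if mod(count, 100) = 0;            /* subsetting IF keeps rows 100, 200, ... of each ID */
  drop count;
run;
```

Dropping `count` keeps the output limited to the original columns; leave the `drop` statement out if the row position within each ID is useful downstream.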