How can I get every hundredth row per ID in a dataset? SAS

Posted on 2025-02-02 19:35:35


I have a list of 10000 observations, 1000 per code number (ID):

proc surveyselect data=code_nums out=bootstrapped_code_names (keep=code_num quantity replicate) sampsize=1825 method=urs outhits noprint seed=0 rep=1000;
strata code_num;
run;

proc sql noprint;
create table test as 
select code_num
,replicate
,sum(quantity) as total_qty
from bootstrapped_code_names
group by code_num, replicate
order by total_qty;
quit;

To get this:

data work.code;
input code_num $9. replicate total_qty;
datalines;
123456780 87   0
123456780 34   0
12345678  837  2
123456780 475  4
123456780 74   5
123456780 507  9
123456780 28   9
123456788 76   3
;
run;

I would like to obtain a dataset that captures every hundredth observation for every ID, to get something like this:

id	replicate	qty
12345	174	54
12345	56	76
12345	876	98
12345	354	102
12345	7	106
12345	45	113
12345	128	123
12345	23	156
12345	948	168
12347	465	24
12347	374	36
12347	58	55
12347	38	78
12347	11	109
12347	283	117
12347	90	133
12347	23	145
12347	948	160

and so on for each ID.

I tried this code, but it does not work separately for each distinct ID:

%macro test;
  %do j=10 %to 90 %by 10;
    %global row&j.;
    %let row&j. = %sysevalf(1000*&j./100, floor);
  %end;
%mend test;
%test

data get_rows;
set code;
if _N_ in(&row10., &row20., &row30., &row40., &row50., &row60., &row70., &row80., &row90.);
run;

data _null_;
set get_rows;
row = _N_*10;
call symputx('row' ||strip(row), replicate, 'G');
run;

How can I get every hundredth row per ID? Each ID has 1000 rows.

小红帽 2025-02-09 19:35:35

Not sure I follow your full problem description, but here is simple code that selects the first, 101st, 201st, etc. observation for each ID value. The source data must already be sorted by the ID variable(s).

data want;
  do _n_=1 by 1 until(last.id);
    set have;
    by id;
    if 1=mod(_n_,100) then output;
  end;
run;
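
The question asks for rows 100, 200, 300 and so on within each ID rather than rows 1, 101, 201. A minimal sketch of that variant follows, assuming the input is the asker's work.code with code_num as the ID variable and total_qty as the sort key within each ID. The output names work.code_sorted and work.every_100th are made up for illustration.

```sas
/* Assumption: work.code holds code_num, replicate, total_qty as in the question. */
/* BY-group processing needs the data sorted by the ID variable first.           */
proc sort data=work.code out=work.code_sorted;
  by code_num total_qty;
run;

data work.every_100th;
  /* The loop runs once per BY group, so _n_ restarts at 1 for every code_num. */
  do _n_ = 1 by 1 until (last.code_num);
    set work.code_sorted;
    by code_num;
    /* Keep rows 100, 200, ... within the current ID. */
    if mod(_n_, 100) = 0 then output;
  end;
run;
```

With 1000 rows per ID this keeps ten rows per ID, including row 1000. If only rows 100 through 900 are wanted, the condition can be extended with `and _n_ < 1000`.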


挽心 2025-02-09 19:35:35

This modifies the code from yesterday by using the MOD function to check whether the per-ID counter is divisible by 100. Change the divisor in the MOD function to select rows at a different interval.

https://documentation.sas.com/doc/en/vdmmlcdc/8.1/ds2ref/n0t9j8b09x4uphn1kl1i70x63z19.htm

data work.code;
input code_num $9. replicate total_qty;
datalines;
123456780 87   0
123456780 34   0
123456780  837  2
123456780 475  4
123456780 74   5
123456780 507  9
123456780 28   9
123456780 87   0
123456780 34   0
123456780  837  2
123456780 475  4
123456780 74   5
123456780 507  9
123456780 28   9
123456788 76   3
123456788 76   3
123456788 76   3
123456788 76   3
123456788 76   3
;;;;


data code_counter;
set code;
by code_num;

if first.code_num then count=1;
else count+1;
run;

data code100;
set code_counter;
if mod(count, 100)=0 then output;
run;
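
The counter and the filter above can also be combined into one DATA step. This is only a sketch under the same assumptions as the answer: work.code has the ID variable code_num and can be sorted by it. The output names work.code_sorted and work.code100_onestep are hypothetical.

```sas
/* Assumption: sorting by code_num, then total_qty, matches the intended row order. */
proc sort data=work.code out=work.code_sorted;
  by code_num total_qty;
run;

data work.code100_onestep;
  set work.code_sorted;
  by code_num;
  /* Reset the counter at the start of each ID; the sum statement retains it. */
  if first.code_num then count = 0;
  count + 1;
  /* Subsetting IF: keep only rows whose position within the ID is a multiple of 100. */
  if mod(count, 100) = 0;
run;
```

Note that the small sample data in this answer has fewer than 100 rows per ID, so the filter returns no rows for it. Using a smaller divisor such as 5 makes it easier to check the logic on the sample.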
