将包含年份的行转换为列

发布于 2024-10-03 09:12:27 字数 305 浏览 7 评论 0原文

为了将数据输入 SAS,需要采用以下格式:

Country Year Indicator_1
Belgium 1900 x1
Belgium 1901 x2
...
Belarus 1901 x1

但是,我的大部分数据采用以下格式:

Country 1900 1901 1902 ... etc
Belgium x1____x2___x3  ...etc
Belarus x1____x2___x3  ...etc

是否有简单的宏或 VBA 脚本可以提供帮助?

In order to input data into SAS it is required to be in the following format:

Country Year Indicator_1
Belgium 1900 x1
Belgium 1901 x2
...
Belarus 1901 x1

However, most of my data comes in the following format:

Country 1900 1901 1902 ... etc
Belgium x1____x2___x3  ...etc
Belarus x1____x2___x3  ...etc

Is there an easy Macro or VBA script that can help?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

沙与沫 2024-10-10 09:12:27

将指示符字符串解析为年份变量,

假设有超过 3 年的数据,您需要调整引用 Y1900-Y1902 的格式和数组。

data original;
    infile datalines;
    format Country $20. YearIndicator $50.;
    input Country YearIndicator;

    format Y1900-Y1902 $4.;
    array y(*) y1900-y1902;
    do i = 1 to dim(y);
        y[i] = scan(YearIndicator,i,'_');
    end;
    drop i;
datalines;
Belgium x1____x2___x3
Belarus x1____x2___x3
run;

使宽表变高

proc transpose data=original out=talldata(rename=(_NAME_=CYear COL1=Indicator));
    by country notsorted;
    var y1900-y1902;
run;

使年份变量为数字,而不是字符

data talldata;
    format Country $20. Year 4. Indicator $4.;
    set talldata;
    year=input(compress(cyear,,'kd'),4.);
    drop cyear;
run;

视图结果

proc print data=talldata; run;

输出

Obs    Country                 Year    Indicator

 1     Belgium                 1900      x1
 2     Belgium                 1901      x2
 3     Belgium                 1902      x3
 4     Belarus                 1900      x1
 5     Belarus                 1901      x2
 6     Belarus                 1902      x3

parse apart the indicator string into year variables

assuming there will be more than 3 years worth of data, you'd need to adjust the format and array referencing Y1900-Y1902.

data original;
    infile datalines;
    format Country $20. YearIndicator $50.;
    input Country YearIndicator;

    format Y1900-Y1902 $4.;
    array y(*) y1900-y1902;
    do i = 1 to dim(y);
        y[i] = scan(YearIndicator,i,'_');
    end;
    drop i;
datalines;
Belgium x1____x2___x3
Belarus x1____x2___x3
run;

make wide table tall

proc transpose data=original out=talldata(rename=(_NAME_=CYear COL1=Indicator));
    by country notsorted;
    var y1900-y1902;
run;

make year variable numeric, not character

data talldata;
    format Country $20. Year 4. Indicator $4.;
    set talldata;
    year=input(compress(cyear,,'kd'),4.);
    drop cyear;
run;

view results

proc print data=talldata; run;

output

Obs    Country                 Year    Indicator

 1     Belgium                 1900      x1
 2     Belgium                 1901      x2
 3     Belgium                 1902      x3
 4     Belarus                 1900      x1
 5     Belarus                 1901      x2
 6     Belarus                 1902      x3
乱世争霸 2024-10-10 09:12:27

您可以使用联合查询:

SELECT Country, 1900 As SYear, [1900] As Indicator FROM Table
UNION ALL
SELECT Country, 1901 As SYear, [1901] As Indicator FROM Table

<..>

UNION ALL
SELECT Country, 2010 As SYear, [2010] As Indicator FROM Table

如果无法导出查询,您可以使用它来创建表。

You can use a Union query:

SELECT Country, 1900 As SYear, [1900] As Indicator FROM Table
UNION ALL
SELECT Country, 1901 As SYear, [1901] As Indicator FROM Table

<..>

UNION ALL
SELECT Country, 2010 As SYear, [2010] As Indicator FROM Table

You can use this to create a table if it is not possible to export a query.

网名女生简单气质 2024-10-10 09:12:27

如果输入的原始数据足够规则,那么可以通过如下简单的数据步骤非常轻松地完成此操作。

   data one;
     infile cards dlm=" _" missover;
     input country :$20. @;
     do year = 1900 to 1902;
       input indicator $ @;
       output;
     end;
   cards;
   Belgium x1____x2___x3
   Belarus x4____x5___x6
   ;
   run;

   /* check */
   proc print data=one;
   run;
   /* on lst
   Obs    country    year    indicator
    1     Belgium    1900       x1
    2     Belgium    1901       x2
    3     Belgium    1902       x3
    4     Belarus    1900       x4
    5     Belarus    1901       x5
    6     Belarus    1902       x6
   */

If the input raw data are regular enough then this can be done very easily with a simple data step as below.

   data one;
     infile cards dlm=" _" missover;
     input country :$20. @;
     do year = 1900 to 1902;
       input indicator $ @;
       output;
     end;
   cards;
   Belgium x1____x2___x3
   Belarus x4____x5___x6
   ;
   run;

   /* check */
   proc print data=one;
   run;
   /* on lst
   Obs    country    year    indicator
    1     Belgium    1900       x1
    2     Belgium    1901       x2
    3     Belgium    1902       x3
    4     Belarus    1900       x4
    5     Belarus    1901       x5
    6     Belarus    1902       x6
   */
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文