按数字范围合并

发布于 2024-11-08 13:46:24 字数 467 浏览 3 评论 0原文

我想根据范围而不是明确的数字将一个人的名字分配给一个数字。可以使用格式来完成此操作，但由于我在数据集中有名称，所以我希望避免手动编写 proc 格式。

data names;                      
   input low high name $;  
   datalines;
1   10  John
11  20  Paul
21  30  George
31  40  Ringo
;

data numbers;
    input number;
    datalines;
33
21
17
5
;

所需的输出是：

data output;
    input number name $;
    datalines;
33  Ringo
21  George
17  Paul
5   John
;

感谢您的帮助。

原文

I'd like to assign a person's name to a number based on a range rather than an explicit number. It's possible to do this using formats, but as I have the names in a dataset I'd prefer to avoid the manual process of writing the proc format.

data names;                      
   input low high name $;  
   datalines;
1   10  John
11  20  Paul
21  30  George
31  40  Ringo
;

data numbers;
    input number;
    datalines;
33
21
17
5
;

The desired output is:

data output;
    input number name $;
    datalines;
33  Ringo
21  George
17  Paul
5   John
;

Thanks for any help.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

天涯离梦残月幽梦 2024-11-15 13:46:24

您可以使用 PROC SQL 执行此操作：

proc sql;
 create table output as
 select numbers.number, names.name
 from numbers left join names
  on numbers.number ge names.low
  and numbers.number le names.high
;
quit;

You can do it like this using PROC SQL:

proc sql;
 create table output as
 select numbers.number, names.name
 from numbers left join names
  on numbers.number ge names.low
  and numbers.number le names.high
;
quit;

回复收藏 0 原文

关于从前 2024-11-15 13:46:24

proc format 的一项便利功能是能够使用数据集来创建格式，而不是手动输入。您的场景似乎是此功能的完美场景。

在您给出的示例中，对“名称”数据集进行一些小更改会将其置于可以通过 proc 格式读取的形式。

例如，如果我像这样修改名称数据集......

data names;
   retain fmtname "names" type "N";
   input start end label $; 
   datalines;
1   10  John
11  20  Paul
21  30  George
31  40  Ringo
;

然后我可以发出此命令来基于它构建格式。

proc format cntlin=names;run;

现在我可以使用这种格式，就像使用任何其他格式一样。例如，要创建一个包含基于数字的所需“名称”的新列，您可以执行以下操作：

data numbers;
    input number;
    number_formatted=put(number,names.);
    datalines;
33
21
17
5
;

输出如下：

             number_
  number    formatted

    33       Ringo
    21       George
    17       Paul
     5       John

更新以解决问题：

没有从文本文件中读取所需的编码有很大差异。我们只需对其进行设置，以便输出数据集具有 proc 格式所需的特定变量名称（fmtname、type、start、end 和 label）。

例如，如果我有一个名为“names.csv”的外部逗号分隔文件，如下所示：

1,10,John
11,20,Paul
21,30,George
31,40,Ringo

然后我只需更改创建“names”数据集的代码，使其如下所示：

data names;
   retain fmtname "names" type "N";
   infile "<path to file>/names.csv" dsd;
   input start end label $;
run;

现在我可以运行 proc像我之前一样使用 cntlin 选项格式化：

proc format cntlin=names;run;

One handy feature of proc format is the ability to use a data set to create the format, instead of typing it in by hand. Your scenario seems like a perfect scenario for this feature.

In the example you give, a few small changes to the "names" data set will put it in a form that can be read by proc format.

For example, if I modify the names data set like so..

data names;
   retain fmtname "names" type "N";
   input start end label $; 
   datalines;
1   10  John
11  20  Paul
21  30  George
31  40  Ringo
;

I can then issue this command to build the format based on it.

proc format cntlin=names;run;

Now I can use this format just like you would with any other format. For example, to create a new column that contains the desired "name" based on the number, you could do this:

data numbers;
    input number;
    number_formatted=put(number,names.);
    datalines;
33
21
17
5
;

Here is what the output would look like:

             number_
  number    formatted

    33       Ringo
    21       George
    17       Paul
     5       John

Update to address question:

There isn't much difference in coding needed to read from a text file. We just need to set it up so that the output data set has the particular variable names that proc format expects (fmtname, type, start, end , and label).

For example, if I have an external comma-seperated file called "names.csv" that looks like this:

1,10,John
11,20,Paul
21,30,George
31,40,Ringo

Then I simply can change the code that creates the "names" data set so that it looks like this:

data names;
   retain fmtname "names" type "N";
   infile "<path to file>/names.csv" dsd;
   input start end label $;
run;

Now I can run proc format with the cntlin option like I did before:

proc format cntlin=names;run;

回复收藏 0 原文

新雨望断虹 2024-11-15 13:46:24

我认为 SQL 确实更简洁，但如果您不是它的忠实粉丝并且数字以已知的增量出现，您可以尝试类似的操作：

data ranges;
    set names;
    do number = low to high; /* by ... */
        output;
    end;
proc sort;
    by number;
run;

data output;
    merge ranges
        numbers ( in = innum )
    ;
    by number;
    keep number name;

    if innum;
run;

再次，它要求数字以预定的增量出现，例如整数。

I think SQL is more succinct indeed, but if you aren't big fan of it and the numbers come in known increments, you may try something like:

data ranges;
    set names;
    do number = low to high; /* by ... */
        output;
    end;
proc sort;
    by number;
run;

data output;
    merge ranges
        numbers ( in = innum )
    ;
    by number;
    keep number name;

    if innum;
run;

Again, it requires numbers to come in predetermined increments, e.g. integers.

回复收藏 0 原文

~没有更多了~