按数字范围合并
我想根据范围而不是明确的数字将一个人的名字分配给一个数字。可以使用格式来完成此操作,但由于我在数据集中有名称,所以我希望避免手动编写 proc 格式
。
data names;
input low high name $;
datalines;
1 10 John
11 20 Paul
21 30 George
31 40 Ringo
;
data numbers;
input number;
datalines;
33
21
17
5
;
所需的输出是:
data output;
input number name $;
datalines;
33 Ringo
21 George
17 Paul
5 John
;
感谢您的帮助。
I'd like to assign a person's name to a number based on a range rather than an explicit number. It's possible to do this using formats, but as I have the names in a dataset I'd prefer to avoid the manual process of writing the proc format
.
data names;
input low high name $;
datalines;
1 10 John
11 20 Paul
21 30 George
31 40 Ringo
;
data numbers;
input number;
datalines;
33
21
17
5
;
The desired output is:
data output;
input number name $;
datalines;
33 Ringo
21 George
17 Paul
5 John
;
Thanks for any help.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
![扫码二维码加入Web技术交流群](/public/img/jiaqun_03.jpg)
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您可以使用 PROC SQL 执行此操作:
You can do it like this using PROC SQL:
proc format
的一项便利功能是能够使用数据集来创建格式,而不是手动输入。您的场景似乎是此功能的完美场景。在您给出的示例中,对“名称”数据集进行一些小更改会将其置于可以通过 proc 格式读取的形式。
例如,如果我像这样修改名称数据集......
然后我可以发出此命令来基于它构建格式。
现在我可以使用这种格式,就像使用任何其他格式一样。例如,要创建一个包含基于数字的所需“名称”的新列,您可以执行以下操作:
输出如下:
更新以解决问题:
没有从文本文件中读取所需的编码有很大差异。我们只需对其进行设置,以便输出数据集具有 proc 格式所需的特定变量名称(fmtname、type、start、end 和 label)。
例如,如果我有一个名为“names.csv”的外部逗号分隔文件,如下所示:
然后我只需更改创建“names”数据集的代码,使其如下所示:
现在我可以运行 proc像我之前一样使用 cntlin 选项格式化:
One handy feature of
proc format
is the ability to use a data set to create the format, instead of typing it in by hand. Your scenario seems like a perfect scenario for this feature.In the example you give, a few small changes to the "names" data set will put it in a form that can be read by proc format.
For example, if I modify the names data set like so..
I can then issue this command to build the format based on it.
Now I can use this format just like you would with any other format. For example, to create a new column that contains the desired "name" based on the number, you could do this:
Here is what the output would look like:
Update to address question:
There isn't much difference in coding needed to read from a text file. We just need to set it up so that the output data set has the particular variable names that proc format expects (fmtname, type, start, end , and label).
For example, if I have an external comma-seperated file called "names.csv" that looks like this:
Then I simply can change the code that creates the "names" data set so that it looks like this:
Now I can run proc format with the cntlin option like I did before:
我认为 SQL 确实更简洁,但如果您不是它的忠实粉丝并且数字以已知的增量出现,您可以尝试类似的操作:
再次,它要求数字以预定的增量出现,例如整数。
I think SQL is more succinct indeed, but if you aren't big fan of it and the numbers come in known increments, you may try something like:
Again, it requires numbers to come in predetermined increments, e.g. integers.