Coding variable values into classes using R
I have a set of data in which I need to code values of certain variables (numeric) into 3 classes.
My data set is similar to this but has 60 more variables:
anim <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
wt <- c(181,179,180.5,201,201.5,245,246.4,189.3,301,354,369,205,199,394,231.3)
data <- data.frame(anim,wt)
> data
   anim    wt
1     1 181.0
2     2 179.0
3     3 180.5
4     4 201.0
5     5 201.5
6     6 245.0
7     7 246.4
8     8 189.3
9     9 301.0
10   10 354.0
11   11 369.0
12   12 205.0
13   13 199.0
14   14 394.0
15   15 231.3
I need to code the values of the variable "wt" into 3 classes: (wt >= 179 & wt < 200) = 1; (wt >= 200 & wt < 300) = 2; (wt > 300) = 3
This should give me the following:
> data2
   anim    wt SWT
1     1 181.0   1
2     2 179.0   1
3     3 180.5   1
4     4 201.0   2
5     5 201.5   2
6     6 245.0   2
7     7 246.4   2
8     8 189.3   1
9     9 301.0   3
10   10 354.0   3
11   11 369.0   3
12   12 205.0   2
13   13 199.0   1
14   14 394.0   3
15   15 231.3   2
Comments (5)
The cut() approach outlined by @Greg is probably what you want here. Note that cut() returns a factor by default; supply labels = FALSE to get the integer codes instead.
Alternatively, if your classes do not lend themselves to natural breaks, you can use ifelse(). Calls to ifelse() can be nested, much like IF in Excel, and wrapping them in with() cuts down on the typing needed.
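The code from this answer did not survive on this page, so here is a minimal sketch of both approaches using the sample data from the question. The break points and the column name SWT come from the question; the column name SWT2 is made up for illustration, and treating a weight of exactly 300 as class 3 is an assumption, because the question does not say where that value belongs.

```r
# Sample data from the question
anim <- 1:15
wt <- c(181, 179, 180.5, 201, 201.5, 245, 246.4, 189.3,
        301, 354, 369, 205, 199, 394, 231.3)
data <- data.frame(anim, wt)

# cut() with labels = FALSE returns integer codes instead of a factor.
# right = FALSE closes each interval on the left:
# [179,200), [200,300), [300,Inf)
data$SWT <- cut(data$wt, breaks = c(179, 200, 300, Inf),
                labels = FALSE, right = FALSE)

# The same classes with nested ifelse(); with() avoids repeating "data$".
# Anything below 179 (no such value in this sample) becomes NA.
data$SWT2 <- with(data,
  ifelse(wt >= 179 & wt < 200, 1,
  ifelse(wt >= 200 & wt < 300, 2,
  ifelse(wt >= 300, 3, NA))))

data
```

The cut() version scales better when there are many classes, while the ifelse() version can express conditions that are not simple consecutive intervals.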
You can try cut().
EDIT: fixed the groups by setting right = FALSE, and removed the split example.
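The original code for this answer is missing here, so the following is only a sketch of what a cut() call with right = FALSE could look like for the classes in the question. The upper break of Inf and the label values are assumptions chosen for illustration.

```r
wt <- c(181, 179, 180.5, 201, 201.5, 245, 246.4, 189.3,
        301, 354, 369, 205, 199, 394, 231.3)
breaks <- c(179, 200, 300, Inf)

# right = FALSE gives left-closed intervals [179,200), [200,300), [300,Inf),
# which matches the ">= lower and < upper" rules in the question.
# Without labels = FALSE the result is a factor.
SWT <- cut(wt, breaks = breaks, right = FALSE, labels = c("1", "2", "3"))
table(SWT)

# Why right = FALSE matters: with the default right = TRUE the intervals are
# (179,200], (200,300], ..., so the minimum value 179 falls outside every
# interval (NA) and 200 is placed in the first class instead of the second.
cut(c(179, 200), breaks = breaks, labels = FALSE)
cut(c(179, 200), breaks = breaks, labels = FALSE, right = FALSE)
```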
I think Greg's answer covers the "standard operating procedure", but I find many uses for the findInterval() function as well. It naturally returns a number identifying which interval, as defined by the second argument, each value falls into.
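As a rough illustration of this idea (the answer's own code is not preserved here), findInterval() can be applied to the question's weights with the lower bound of each class as the second argument. Values at or above the last break, including exactly 300, land in class 3 under this choice of breaks.

```r
wt <- c(181, 179, 180.5, 201, 201.5, 245, 246.4, 189.3,
        301, 354, 369, 205, 199, 394, 231.3)

# For each value x, findInterval(x, vec) returns the index i such that
# vec[i] <= x < vec[i + 1]. Values below vec[1] get 0, and values at or
# above the last element get length(vec).
SWT <- findInterval(wt, c(179, 200, 300))
SWT
```

The intervals are left-closed by default, so no extra argument is needed to match the ">= lower, < upper" rules in the question.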
Just to show an alternative method from the car package (similar to recode in SPSS).
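The answer's code is missing from this page. A possible sketch with recode() from car follows; it assumes the car package is installed, and the recode string is my own reconstruction rather than the answerer's. Ranges in recode() include both end points, and when a value matches several specifications the first one listed applies, which is why the classes are listed from highest to lowest.

```r
wt <- c(181, 179, 180.5, 201, 201.5, 245, 246.4, 189.3,
        301, 354, 369, 205, 199, 394, 231.3)

# Assumes the car package is available; the block is skipped otherwise.
if (requireNamespace("car", quietly = TRUE)) {
  # "lo" and "hi" stand for the minimum and maximum of the variable.
  # Listing 300:hi first sends exactly 300 to class 3, and listing
  # 200:300 before lo:200 sends exactly 200 to class 2.
  SWT <- car::recode(wt, "300:hi=3; 200:300=2; lo:200=1")
  print(data.frame(wt, SWT))
}
```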
Just for completeness and information: the classInt package (on CRAN) is another handy way to sort numbers into classes.
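For readers who want to try it, a minimal sketch with classInt might look like the following. It assumes the classInt package is installed, and the upper break of 400 is an arbitrary value chosen to cover the sample data, not something given in the question.

```r
wt <- c(181, 179, 180.5, 201, 201.5, 245, 246.4, 189.3,
        301, 354, 369, 205, 199, 394, 231.3)

# Assumes the classInt package is available; the block is skipped otherwise.
if (requireNamespace("classInt", quietly = TRUE)) {
  # style = "fixed" uses the supplied breaks as class boundaries;
  # intervalClosure = "left" makes the classes [179,200), [200,300), [300,400].
  ci <- classInt::classIntervals(wt, n = 3, style = "fixed",
                                 fixedBreaks = c(179, 200, 300, 400),
                                 intervalClosure = "left")
  # findCols() returns the class index of each original value.
  SWT <- classInt::findCols(ci)
  print(data.frame(wt, SWT))
}
```

classInt is mainly aimed at choosing breaks automatically (quantiles, Jenks and so on), so for fixed, known breaks the base R functions are usually simpler.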