Convert continuous numeric values into discrete categories defined by intervals

Posted on 2025-02-06 15:30:07

I have a data frame with a continuous numeric variable, age in months (age_mnth). I want to create a new discrete variable with age categories based on age intervals.

# Some example data
rota2 <- data.frame(age_mnth = 1:170)

I've written a procedure based on nested ifelse calls (below), but I believe a more elegant solution is possible.

rota2$age_gr<-ifelse(rota2$age_mnth < 6, rr2 <- "0-5 mnths",
   ifelse(rota2$age_mnth > 5 & rota2$age_mnth < 12, rr2 <- "6-11 mnths",
          ifelse(rota2$age_mnth > 11 & rota2$age_mnth < 24, rr2 <- "12-23 mnths",
                 ifelse(rota2$age_mnth > 23 & rota2$age_mnth < 60, rr2 <- "24-59 mnths",
                        ifelse(rota2$age_mnth > 59 & rota2$age_mnth < 167, rr2 <- "5-14 yrs",
                              rr2 <- "adult")))))

I know there is a cut function, but I couldn't get it to work for discretizing / categorizing my data.

Comments (3)

恋你朝朝暮暮 2025-02-13 15:30:07

If there is a reason you don't want to use cut, I don't see what it is. cut will work fine for what you want to do:

# Some example data
rota2 <- data.frame(age_mnth = 1:170)
# Your way of doing things to compare against
rota2$age_gr<-ifelse(rota2$age_mnth<6,rr2<-"0-5 mnths",
                     ifelse(rota2$age_mnth>5&rota2$age_mnth<12,rr2<-"6-11 mnths",
                            ifelse(rota2$age_mnth>11&rota2$age_mnth<24,rr2<-"12-23 mnths",
                                   ifelse(rota2$age_mnth>23&rota2$age_mnth<60,rr2<-"24-59 mnths",
                                          ifelse(rota2$age_mnth>59&rota2$age_mnth<167,rr2<-"5-14 yrs",
                                                 rr2<-"adult")))))

# Using cut
rota2$age_grcut <- cut(rota2$age_mnth, 
                       breaks = c(-Inf, 6, 12, 24, 60, 167, Inf), 
                       labels = c("0-5 mnths", "6-11 mnths", "12-23 mnths", "24-59 mnths", "5-14 yrs", "adult"), 
                       right = FALSE)
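
As a quick check (a sketch added for illustration, not part of the original answer), the two columns can be compared directly. With right = FALSE each interval is closed on the left and open on the right, so the breaks 6, 12, 24, 60 and 167 reproduce the `<` comparisons of the ifelse chain. The names `labs`, `m` and `same` are illustrative only.

```r
# Rebuild the example so this block runs on its own
rota2 <- data.frame(age_mnth = 1:170)
labs <- c("0-5 mnths", "6-11 mnths", "12-23 mnths",
          "24-59 mnths", "5-14 yrs", "adult")
m <- rota2$age_mnth

# Reference result: nested ifelse with the thresholds from the question
rota2$age_gr <- ifelse(m < 6, labs[1],
                ifelse(m < 12, labs[2],
                ifelse(m < 24, labs[3],
                ifelse(m < 60, labs[4],
                ifelse(m < 167, labs[5], labs[6])))))

# cut with left-closed intervals: [-Inf, 6), [6, 12), ..., [167, Inf)
rota2$age_grcut <- cut(m,
                       breaks = c(-Inf, 6, 12, 24, 60, 167, Inf),
                       labels = labs,
                       right = FALSE)

# cut returns a factor, so compare it as character
same <- all(as.character(rota2$age_grcut) == rota2$age_gr)
print(same)
table(rota2$age_grcut)
```

Because cut returns a factor whose levels follow the order of labels, the groups also sort and tabulate in age order rather than alphabetically.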

只为一人 2025-02-13 15:30:07

rota2$age_gr<-c( "0-5 mnths", "6-11 mnths", "12-23 mnths", "24-59 mnths", "5-14 yrs",
                 "adult")[
           findInterval(rota2$age_mnth , c(-Inf, 5.5, 11.5, 23.5, 59.5, 166.5, Inf) ) ]
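
A short sketch of why this works, assuming ages are whole months: findInterval returns, for each value, the index of the interval it falls into, and that index selects the label. The half-month breaks sidestep the question of which side of an interval is closed. Integer breaks give the same result because findInterval treats intervals as closed on the left by default. The names `age`, `labs`, `idx_half` and `idx_int` are illustrative and not part of the answer.

```r
# A few ages around each boundary
age <- c(1, 5, 6, 11, 12, 59, 60, 166, 167, 170)
labs <- c("0-5 mnths", "6-11 mnths", "12-23 mnths",
          "24-59 mnths", "5-14 yrs", "adult")

# Half-month breaks, as in the answer: index runs from 1 to 6
idx_half <- findInterval(age, c(-Inf, 5.5, 11.5, 23.5, 59.5, 166.5, Inf))

# Integer breaks: findInterval returns 0 below the first break, so add 1
idx_int <- findInterval(age, c(6, 12, 24, 60, 167)) + 1

# Both index vectors agree, and the index picks the label
print(all(idx_half == idx_int))
data.frame(age = age, group = labs[idx_half])
```

Note that the result here is a character vector; wrapping it in factor(..., levels = labs) would keep the groups in age order.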

虚拟世界 2025-02-13 15:30:07

While I appreciate this might be an old topic, I will leave my 5 cents here with a more concise solution to the problem raised above.

Input data for bucketing:

rotaDF <- data.frame(age_mnth = 1:170)

To avoid nested, repetitive ifelse calls, I create a static data frame holding the boundaries of each group:

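# stringsAsFactors = FALSE keeps the labels as character in R versions before 4.0
rotaBounds <- data.frame(buckets = c("0-5 mnths", "6-11 mnths", "12-23 mnths", "24-59 mnths", "5-14 yrs", "Adult"),
                         lowerBound = c(0, 6, 12, 24, 60, 168),
                         upperBound = c(6, 12, 24, 60, 168, Inf),
                         stringsAsFactors = FALSE)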

Then I assign each element to its bucket using the boundaries:

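# Store the result in rotaDF (one value per age), not in the 6-row rotaBounds.
# lowerBound is inclusive and upperBound exclusive, so 6 falls in "6-11 mnths".
rotaDF$age_gr <- sapply(rotaDF$age_mnth, 
                        function(x) rotaBounds[x >= rotaBounds$lowerBound & x < rotaBounds$upperBound,'buckets'])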
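
The lookup-table idea also works without a per-row sapply. The following is a sketch of an alternative, not the answer's own method: it assumes the lower bounds are sorted in ascending order and that every age is at least the first lower bound, and it passes the lowerBound column to findInterval so the whole column is assigned in one vectorized call. It keeps this answer's 168-month threshold for "Adult".

```r
# Input data and boundary table, rebuilt so the block runs on its own
rotaDF <- data.frame(age_mnth = 1:170)
rotaBounds <- data.frame(
  buckets = c("0-5 mnths", "6-11 mnths", "12-23 mnths",
              "24-59 mnths", "5-14 yrs", "Adult"),
  lowerBound = c(0, 6, 12, 24, 60, 168),
  stringsAsFactors = FALSE
)

# findInterval gives the row of the last lower bound that is <= each age;
# that row index selects the bucket label
rotaDF$age_gr <- rotaBounds$buckets[findInterval(rotaDF$age_mnth,
                                                 rotaBounds$lowerBound)]

# Spot-check the rows around two boundaries
rotaDF[c(5, 6, 167, 168), ]
```

Only the lower bounds are needed here, because each bucket ends where the next one begins.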