How do I exclude a character variable from recipes::step_dummy()?

Posted on 2025-01-12 03:59:19

How do I keep a character ID variable, PERSON_ID, unchanged in a recipe? I tried update_role(PERSON_ID, new_role = "id variable") and also tried excluding it from step_dummy with step_dummy(all_nominal_predictors(), -all_numeric_predictors(), -all_outcomes(), -has_role(match = "id variable")). Neither works: PERSON_ID is still converted to a factor. Any suggestions?


Comments (1)

念﹏祤嫣 2025-01-19 03:59:19

This one is confusing. According to the recipes documentation, step_factor2string() should convert factors to strings.

However, when you glimpse() the prepped recipe, PERSON_ID is still reported as "fct". Conversely, if you set strings_as_factors = FALSE, prep() raises an error stating that PERSON_ID is not a factor:

library(tibble)
library(tidymodels)

# Toy data: PERSON_ID is a character ID column that should stay a character.
data_input <- tibble(
  target    = rep(1, 9),
  num_var   = rep(2, 9),
  char      = c(rep("a", 6), rep("b", 3)),
  PERSON_ID = as.character(c(rep("W", 3), rep("D", 6))),
  logi      = rep(c(TRUE, FALSE, FALSE), 3),
  fac       = as.factor(c(rep("1", 6), rep("2", 3)))
)

recipe_spec <- recipe(target ~ ., data = data_input) %>%
  update_role(PERSON_ID, new_role = "id variable") %>%
  step_dummy(all_nominal_predictors(), -all_numeric_predictors(), -all_outcomes(), -has_role(match = "id variable")) %>%
  step_factor2string(PERSON_ID)

# Default prep(): PERSON_ID is still shown as <fct>.
recipe_spec %>% prep() %>% juice() %>% glimpse()

# strings_as_factors = FALSE: fails because PERSON_ID is not a factor.
recipe_spec %>% prep(strings_as_factors = FALSE) %>% juice() %>% glimpse()
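
A possible workaround, offered as a sketch rather than a definitive fix: as far as I understand the step_factor2string() documentation, with the default strings_as_factors = TRUE the strings produced by that step are turned back into factors after all steps have been prepped, which would explain the "fct" above. The version below drops step_factor2string(), converts the character predictor explicitly with step_string2factor(), and preps with strings_as_factors = FALSE. The name recipe_fixed and the use of bake(new_data = NULL) in place of juice() are my own choices, not part of the question. Depending on your recipes version, passing strings_as_factors to prep() may produce a deprecation warning in favor of setting it in recipe().

```r
library(tidymodels)

# Same toy data as in the example above.
data_input <- tibble(
  target    = rep(1, 9),
  num_var   = rep(2, 9),
  char      = c(rep("a", 6), rep("b", 3)),
  PERSON_ID = c(rep("W", 3), rep("D", 6)),
  logi      = rep(c(TRUE, FALSE, FALSE), 3),
  fac       = as.factor(c(rep("1", 6), rep("2", 3)))
)

# recipe_fixed is a hypothetical name used only for this sketch.
recipe_fixed <- recipe(target ~ ., data = data_input) %>%
  # PERSON_ID stops being a predictor, so all_nominal_predictors() skips it.
  update_role(PERSON_ID, new_role = "id variable") %>%
  # Convert the character predictor by hand, since prep() will not do it
  # once strings_as_factors is FALSE.
  step_string2factor(char) %>%
  step_dummy(all_nominal_predictors())

result <- recipe_fixed %>%
  prep(strings_as_factors = FALSE) %>%
  bake(new_data = NULL)

glimpse(result)
```

With this setup the extra negative selectors in step_dummy() should not be needed, because update_role() already removes PERSON_ID from the predictor set.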

 
        