Apply log to all numeric columns with values greater than 0 (across() with two .cols conditions)

Posted on 2025-01-13 14:19:54 · 1048 words · 1 view · 0 comments

I am trying to take the log of several variables in a data frame that also contains non-numeric variables, and I would like to apply the function only to numeric variables that contain no zeros or negative values.

Here is what I have so far:

library(dplyr)

# create a data frame with numeric, character and factor variables
a <- c(3, -1, 0, 5, 2)
b <- c(1, 3, 2, 1, 4)
c <- c(9, -2, 3, -5, 1)
d <- c(3, 0, 6, 1, 5)
e <- c("red", "blu", "yellow", "green", "white")
f <- c(0, 1, 1, 0, 0)
g <- c(3, 1, 1, 4, 2)

df <- data.frame(a, b, c, d, e, f, g) %>%
  mutate(f = factor(f))

# apply the transformation to all numeric variables
df.log <- df %>%
  as_tibble() %>%
  mutate(across(
    .cols = where(is.numeric), # ideally I would add the condition "and all values > 0" here, but it doesn't work
    .fns = list(log = log),
    .names = "{.col}_{.fn}"))

With the code above I get NaN for negative values and -Inf for zeros. I could then drop the columns containing those values, but I'd like to find a clean way to do it all at once.
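For reference, the NaN and -Inf values come from base R's log() itself rather than from across(). A minimal sketch, using a made-up vector x for illustration:

```r
# x is a hypothetical vector with one negative, one zero and one positive value
x <- c(-1, 0, 1)

# log() of a negative number warns ("NaNs produced") and returns NaN;
# suppressWarnings() only keeps the console quiet here
y <- suppressWarnings(log(x))

is.nan(y[1])   # the negative input becomes NaN
y[2] == -Inf   # the zero input becomes -Inf
y[3] == 0      # log(1) is 0
```

So any column containing a value <= 0 will carry at least one NaN or -Inf after the transformation, which is why the column has to be filtered out before log() is applied.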

Another idea was to first remove the columns with values <= 0, as follows:

df.skim <- df %>% 
  select_if(is.numeric)

df.skim <- df.skim[,sapply(df.skim, min)>0]

and then apply the log to the remaining columns, but this way I also drop the key column and cannot easily merge the data back.
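One way to keep this idea without losing the other columns is to compute only the names of the qualifying columns and leave the data frame intact. The following is a base R sketch under that assumption; pos_cols is a name introduced here for illustration and is not part of the original post:

```r
# Same example data as above, rebuilt so the block is self-contained
df <- data.frame(
  a = c(3, -1, 0, 5, 2),
  b = c(1, 3, 2, 1, 4),
  c = c(9, -2, 3, -5, 1),
  d = c(3, 0, 6, 1, 5),
  e = c("red", "blu", "yellow", "green", "white"),
  f = factor(c(0, 1, 1, 0, 0)),
  g = c(3, 1, 1, 4, 2)
)

# Names of columns that are numeric AND strictly positive everywhere.
# && short-circuits, so the comparison is never run on non-numeric columns.
pos_cols <- names(df)[vapply(
  df,
  function(x) is.numeric(x) && all(x > 0),
  logical(1)
)]

# Append the logged versions as new columns; all original columns stay in place
df[paste0(pos_cols, "_log")] <- lapply(df[pos_cols], log)

names(df)
```

Because nothing is dropped, there is no need to merge anything back afterwards.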

Comments (2)

剩一世无双 2025-01-20 14:19:54

You can create a small predicate function and pass it to across() inside where():

numeric_no_zero <- function(x) {
  if(!is.numeric(x)) return(FALSE)
  if(any(x <= 0)) return(FALSE)
  TRUE
}

Use it like this:

df %>% 
  as_tibble() %>% 
  mutate(across(
    .cols = where(numeric_no_zero),
    .fns = list(log = log),
    .names = "{.col}_{.fn}"))
#> # A tibble: 5 x 9
#>       a     b     c     d e      f         g b_log g_log
#>   <dbl> <dbl> <dbl> <dbl> <chr>  <fct> <dbl> <dbl> <dbl>
#> 1     3     1     9     3 red    0         3 0     1.10 
#> 2    -1     3    -2     0 blu    1         1 1.10  0    
#> 3     0     2     3     6 yellow 1         1 0.693 0    
#> 4     5     1    -5     1 green  0         4 0     1.39 
#> 5     2     4     1     5 white  0         2 1.39  0.693

Created on 2022-03-10 by the reprex package (v2.0.1)
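One caveat worth noting: if a column contains NA, a comparison such as x <= 0 yields NA, and if (NA) raises an error in R, so the predicate above would likely fail on data with missing values. A hedged variant that ignores missing values is sketched below; the helper name numeric_all_positive and the small df_na table are made up for illustration:

```r
library(dplyr)

# Hypothetical helper: TRUE only for numeric columns whose
# non-missing values are all strictly positive
numeric_all_positive <- function(x) {
  is.numeric(x) && all(x > 0, na.rm = TRUE)
}

# Made-up example data with a missing value in an otherwise positive column
df_na <- tibble(
  b = c(1, 3, NA, 1, 4),
  d = c(3, 0, 6, 1, 5),
  e = c("red", "blu", "yellow", "green", "white")
)

res <- df_na %>%
  mutate(across(
    .cols  = where(numeric_all_positive),
    .fns   = list(log = log),
    .names = "{.col}_{.fn}"
  ))

names(res)
```

Here only b qualifies, and the NA in b simply stays NA in b_log. Whether missing values should be ignored or should disqualify a column is a design choice that depends on the data.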

徒留西风 2025-01-20 14:19:54

You can use where() with an anonymous function to specify more complex conditions like yours:

library(tidyverse)

df.log <- df %>% 
  as_tibble() %>%
  mutate(across(
    .cols = where(~ is.numeric(.x) && all(.x > 0)),
    .fns = list(log = log),
    .names = "{.col}_{.fn}"))

Output:

# A tibble: 5 x 9
      a     b     c     d e      f         g b_log g_log
  <dbl> <dbl> <dbl> <dbl> <chr>  <fct> <dbl> <dbl> <dbl>
1     3     1     9     3 red    0         3 0     1.10 
2    -1     3    -2     0 blu    1         1 1.10  0    
3     0     2     3     6 yellow 1         1 0.693 0    
4     5     1    -5     1 green  0         4 0     1.39 
5     2     4     1     5 white  0         2 1.39  0.693
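If the original columns are not needed afterwards, the same where() condition can also be used to transform them in place: passing a single function and omitting .names makes across() overwrite the selected columns. A minimal sketch, using a reduced version of the example data and the made-up name df_inplace:

```r
library(dplyr)

# Reduced version of the example data: one column with values <= 0,
# one strictly positive column, one character column
df <- data.frame(
  a = c(3, -1, 0, 5, 2),
  b = c(1, 3, 2, 1, 4),
  e = c("red", "blu", "yellow", "green", "white")
)

# No .names and a bare function: selected columns are replaced by their log
df_inplace <- df %>%
  mutate(across(where(~ is.numeric(.x) && all(.x > 0)), log))

names(df_inplace)
```

The column set is unchanged; only b is replaced by its log, while a and e are left as they were.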