Using rlang double curly braces {{ }} in data.table

Posted on 2025-02-08 19:23:04

Problem

The {{ }} operator from the rlang package makes it very easy to pass column names as function arguments (quasiquotation). I understand that rlang is intended to work with the tidyverse, but is there a way to use {{ }} in data.table?

Intended use of {{}} with dplyr

library(dplyr)

test_dplyr <- function(dt, col1, col2) {
  # {{ }} forwards the bare column names into the dplyr verbs
  dt %>%
    group_by({{ col2 }}) %>%
    summarise(test = mean({{ col1 }}))
}

test_dplyr(dt=iris, col1=Sepal.Length, col2=Species)

> # A tibble: 3 x 2
>   Species     test
>   <fct>      <dbl>
> 1 setosa      5.01
> 2 versicolor  5.94
> 3 virginica   6.59

Failed attempt of using {{}} with data.table

Ideally, this is what I would like to write, but it returns an error.

test_dt2 <- function(dt, col1, col2){
  
  data.table::setDT(dt)
  temp <- dt[, .( test = mean({{col1}})), by = {{col2}} ]
  return(temp)
}

# error
test_dt2(dt=iris, col1= Sepal.Length, col2= Species)

# also an error
test_dt2(dt=iris, col1= 'Sepal.Length', col2= 'Species')
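
A likely reason for the failure: {{ }} only has a special meaning inside functions built on rlang's tidy evaluation (such as the dplyr verbs), and data.table's `[` is not one of them. Everywhere else R parses it as two nested `{` blocks that return the value of the inner expression. A minimal base R sketch of that behaviour (the variable `x` is only for illustration):

```r
x <- 10

# Outside a tidy-eval function, `{{ x }}` is a `{` call wrapping another `{` call.
expr <- quote({{ x }})
outer_is_brace <- identical(expr[[1]], as.name("{"))
inner_is_brace <- identical(expr[[2]][[1]], as.name("{"))

# Evaluating it just returns the value of x; no column lookup is involved.
same_value <- identical({{ x }}, x)

print(c(outer_is_brace, inner_is_brace, same_value))
```

Inside `dt[...]`, `{{col1}}` is therefore evaluated as ordinary R code rather than treated as a reference to a column, which would explain the errors above.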

Alternative use of rlang with data.table

Here is an alternative way to use rlang with data.table. It has two inconveniences: every column name variable has to go through rlang::ensym(), and the data.table operation has to be wrapped in rlang::inject().

test_dt <- function(dt, col1, col2){
  
  # capture the column names as symbols (ensym accepts bare names or strings)
  col1 <- rlang::ensym(col1)
  col2 <- rlang::ensym(col2)
  
  data.table::setDT(dt)
  temp <- rlang::inject( dt[, .( test = mean(!!col1)), by = !!col2] )
  return(temp)
}

test_dt(dt=iris, col1='Sepal.Length', col2='Species')

>       Species  test
> 1:     setosa 5.006
> 2: versicolor 5.936
> 3:  virginica 6.588
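
If the main goal is to accept either bare or quoted column names, one option is to keep rlang::ensym() for capturing the argument but convert the symbol to a string, so that the data.table call itself no longer needs rlang::inject(). This is only a sketch under assumptions: `test_dt3` is a hypothetical name, not part of the original question, and as.data.table() is used instead of setDT() so that the input is copied.

```r
library(data.table)

# Hypothetical helper: accepts col1/col2 as bare names or as strings.
test_dt3 <- function(dt, col1, col2) {
  # ensym() captures the argument as a symbol; as_string() turns it into text
  col1 <- rlang::as_string(rlang::ensym(col1))
  col2 <- rlang::as_string(rlang::ensym(col2))

  # plain data.table from here on: column lookup by name, grouping by name
  as.data.table(dt)[, .(test = mean(.SD[[col1]])), by = c(col2)]
}

res_bare   <- test_dt3(iris, Sepal.Length, Species)
res_quoted <- test_dt3(iris, "Sepal.Length", "Species")
print(res_bare)
```

Both calls should give the same per-species means as the rlang::inject() version above.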

美男兮 2025-02-15 19:23:04

I don't think you want to use rlang with data.table. data.table already has more convenient facilities of its own. I also suggest not using setDT here, because it has the side effect of changing dt in place.

library(data.table)

test_dt <- function(dt, col1, col2) {
  as.data.table(dt)[, .( test = mean(.SD[[col1]])), by = c(col2)]
}

test_dt(dt = iris, col1 = 'Sepal.Length', col2 = 'Species')
##       Species  test
## 1:     setosa 5.006
## 2: versicolor 5.936
## 3:  virginica 6.588

This also works:

test_dt <- function(dt, col1, col2) {
  as.data.table(dt)[, .( test = mean(get(col1))), by = c(col2)]
}

test_dt(dt=iris, col1='Sepal.Length', col2='Species')
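
Recent data.table releases also offer an `env` argument to `[` for this kind of programmatic use. The sketch below assumes data.table 1.15.0 or later; `test_dt_env`, `val` and `grp` are hypothetical names chosen for illustration. Character values supplied in `env` are substituted as column names into `j` and `by` before the call is evaluated.

```r
library(data.table)

# Hypothetical helper; requires the `env` argument (assumed data.table >= 1.15.0).
test_dt_env <- function(dt, col1, col2) {
  # as.data.table() copies, so the caller's object is left untouched
  as.data.table(dt)[, .(test = mean(val)), by = grp,
                    env = list(val = col1, grp = col2)]
}

df <- iris
res <- test_dt_env(df, "Sepal.Length", "Species")
print(res)
```

Because the function works on a copy, `df` remains a plain data.frame after the call.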