How to call or access data frame variables inside a function in R

Posted on 2025-02-12 20:51:10

I'm struggling to create a function in R that takes the names of a data.frame's variables as arguments.

Say, for example, that I have this data:

test.df <-
  data.frame(
    variable_1 = sample(letters[1:4], 10, replace = TRUE),
    variable_2 = rnorm(10, 10, 3),
    variable_3 = rnorm(10, 40, 15))
    
test.df
    
   variable_1 variable_2 variable_3
1           c   5.514034   59.23525
2           a  10.515690   31.94552
3           d  11.845118   47.39481
4           c   8.481335   22.32198
5           d   7.945798   29.02631
6           c   9.631182   41.90519
7           c   9.348816   53.79478
8           a   4.559642   58.47290
9           d   9.876674   53.53151
10          c  12.955443   49.84759

I need to create a function that accesses any given variable by its name and, for example, extracts and reports its mean in the form 'The mean is: X' (where 'X' is the mean value). So far I've tried this:

my.function <- function(df, variable) {
  paste0("The mean is: ",
         round(mean(df$variable),2))
}

But when I evaluate my.function on my test.df, it is clearly not doing the job:

> my.function(test.df, variable_2)
[1] "The mean is: NA"

So my questions are:

  • How do I refer to variable names in a function's arguments? I know there are various ways to do this, since other libraries accept either variable_2 or "variable_2". When more than one variable is needed, some take unquoted names separated by commas (variable_2, variable_3, as in dplyr::select()), while others require the target variables as a character vector (c("variable_2", "variable_3"), as in reshape2::melt()).

  • BONUS: When using functions that take more than one variable, I really like that you can press Tab and the list of available variables shows up (as in dplyr::select(), for example). How do I get this feature when building my own functions?

Thanks in advance! :)


Comments (2)

紫瑟鸿黎 2025-02-19 20:51:10

If the column name is passed as an unquoted argument, capture it with substitute, convert it to a string with deparse, and index with [[ instead of $. Deparse only when the value returned by substitute is a symbol, so that the function accepts both quoted and unquoted names:

my.function <- function(df, variable) {
  variable <- substitute(variable)
  if (is.symbol(variable)) variable <- deparse(variable)
  paste0("The mean is: ",
         round(mean(df[[variable]], na.rm = TRUE), 2))
}

-testing

> my.function(test.df, variable_2)
[1] "The mean is: 9.86"
> my.function(test.df, "variable_2")
[1] "The mean is: 9.86"
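
A minimal sketch of the same function on fixed toy data, so the result is reproducible (demo.df and col are hypothetical names, not part of the question). It also illustrates one caveat that appears to follow from deparsing the symbol: a variable that merely holds the column name is treated as a column called "col".

```r
# Fixed toy data (an assumption for illustration, not the asker's random data)
demo.df <- data.frame(variable_2 = c(1, 2, 3, 4))

my.function <- function(df, variable) {
  # Capture the argument as written by the caller, without evaluating it
  variable <- substitute(variable)
  # An unquoted name arrives as a symbol; turn it into a string
  if (is.symbol(variable)) variable <- deparse(variable)
  paste0("The mean is: ",
         round(mean(df[[variable]], na.rm = TRUE), 2))
}

unquoted <- my.function(demo.df, variable_2)    # "The mean is: 2.5"
quoted   <- my.function(demo.df, "variable_2")  # "The mean is: 2.5"

# Caveat: `col` is a symbol, so it is deparsed to "col" rather than
# evaluated to "variable_2"; demo.df[["col"]] is NULL and the mean is NA
col <- "variable_2"
indirect <- suppressWarnings(my.function(demo.df, col))  # "The mean is: NA"
```

If the column name is usually stored in another variable, the plain df[[variable]] form with a quoted name is probably the safer choice.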

If we want the mean of several columns, use colMeans and pass the column names as a character vector:

my.function <- function(df, variable) {
  v1 <- colMeans(df[variable], na.rm = TRUE)
  sprintf("The mean of %s: %f", names(v1), v1)
}

-testing

> my.function(test.df, c("variable_2", "variable_3"))
[1] "The mean of variable_2: 9.860057"  "The mean of variable_3: 42.317997"
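
For the comma-separated, unquoted style the question mentions, one possible base-R sketch captures the names passed through `...`. The function name my.function.multi and the data frame demo.df are hypothetical, and this is only an approximation of that calling style, not how dplyr::select() is implemented.

```r
# Fixed toy data (an assumption for illustration)
demo.df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

my.function.multi <- function(df, ...) {
  # substitute(list(...)) yields the unevaluated call list(a, b);
  # dropping the first element (the function name `list`) leaves the arguments
  args <- as.list(substitute(list(...)))[-1]
  # as.character() maps both the symbol a and the string "a" to "a"
  vars <- vapply(args, as.character, character(1))
  v1 <- colMeans(df[vars], na.rm = TRUE)
  sprintf("The mean of %s: %.2f", names(v1), v1)
}

out.unquoted <- my.function.multi(demo.df, a, b)
out.quoted   <- my.function.multi(demo.df, "a", "b")
# Both give: "The mean of a: 2.00" "The mean of b: 20.00"
```

This sketch only handles bare names and single strings; an argument such as c("a", "b") would fail the vapply length check.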

红衣飘飘貌似仙 2025-02-19 20:51:10

Instead of df$nameOfColumn, you can use:

column <- "nameOfColumn"
df[[column]]

Example:

my.function <- function(df, variable) {
  paste0("The mean is: ",
         round(mean(df[[variable]]),2))
}

> my.function(test.df, "variable_2")
[1] "The mean is: 11.88"

This is described in the R Language Definition, under Indexing.
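
A small sketch of the difference, using hypothetical toy data (demo.df, with a column named height): `$` takes the name after it literally, while `[[` evaluates its argument first.

```r
# Fixed toy data (an assumption for illustration)
demo.df <- data.frame(height = c(1.5, 2.5, 3.5))
column <- "height"

# `$` does not evaluate `column`; it looks for a column named "column"
by.dollar <- demo.df$column        # NULL

# `[[` evaluates `column` to "height" and extracts that column
by.bracket <- demo.df[[column]]    # 1.5 2.5 3.5

mean(by.bracket)                   # 2.5
```

This is why df$variable inside the original function never reaches variable_2: the lookup uses the literal name "variable" instead of the argument's value.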
