R question about sapply/plyr syntax: how to pass variable values to a function

Posted on 2024-10-14 13:54:28

Is there a way to pass a variable value in ddply/sapply directly to a function without the function (x) notation?

E.g., instead of:
ddply(bu,.(trial), function (x) print(x$tangle) )

Is there a way to do:
ddply(bu,.(trial), print(tangle) )

I am asking because with many variables this notation becomes very cumbersome.

Thanks!

Comments (2)

辞旧 2024-10-21 13:54:28

You can use fn$ from the gsubfn package. Prefix the function call with fn$, and you can then pass a formula in place of the anonymous function, as shown here:

> library(gsubfn)
>
> # instead of specifying function(x) mean(x) / sd(x)
>
> fn$sapply(iris[-5], ~ mean(x) / sd(x))
Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
    7.056602     7.014384     2.128819     1.573438 

> library(plyr)
> # instead of specifying function(x) colMeans(x[-5]) / sd(x[-5])
> 
> fn$ddply(iris, .(Species), ~ colMeans(x[-5]) / sd(x[-5]))
     Species Sepal.Length Sepal.Width Petal.Length Petal.Width
1     setosa     14.20183    9.043319     8.418556    2.334285
2 versicolor     11.50006    8.827326     9.065547    6.705345
3  virginica     10.36045    9.221802    10.059890    7.376660
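
For comparison, here is a minimal base-R sketch of what the two formula calls above stand for, written with ordinary functions and no extra packages. Two things here are assumptions rather than part of the answer: the helper name `ratio` is made up for illustration, and the standard deviation is computed column by column with sapply(), because current versions of R no longer accept a data frame in sd().

```r
# Hypothetical helper (not from the answer): mean-to-sd ratio of a numeric vector
ratio <- function(x) mean(x) / sd(x)

# Same computation as fn$sapply(iris[-5], ~ mean(x) / sd(x))
col_ratios <- sapply(iris[-5], ratio)

# Per-species version without plyr: split the numeric columns by Species,
# then apply the same ratio to every column of each piece
by_species <- t(sapply(split(iris[-5], iris$Species),
                       function(piece) sapply(piece, ratio)))

print(col_ratios)
print(by_species)
```

Naming the helper once and reusing it is another way to avoid repeating function (x) in every call.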
一紙繁鸢 2024-10-21 13:54:28

Just add your function's parameters to the **ply call, after the function name; they are forwarded to your function. For example:

ddply(my_data, c("var1","var2"), my_function, param1=something, param2=something)

where my_function usually looks like

my_function(x, param1, param2)

Here's a working example of this:

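library(plyr)

n = 1000
# simulated data; sample() is unseeded, so the exact numbers vary between runs
my_data = data.frame(
    subject = 1:n,
    city = sample(1:4, n, replace = TRUE),
    gender = sample(1:2, n, replace = TRUE),
    income = sample(50:200, n, replace = TRUE)
)

# data_in: the piece of the data frame that ddply passes in for each group
# dv: name of the column to summarise; extra: also report n and the standard error
my_function = function(data_in, dv, extra = FALSE){
    values = data_in[, dv]
    output = data.frame(mean = mean(values), sd = sd(values))
    if(extra) output = cbind(output, data.frame(n = length(values), se = sd(values)/sqrt(length(values))))
    return(output)
}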

# with extra = TRUE: adds the n and se columns
ddply(my_data, c("city", "gender"), my_function, dv="income", extra=T)

  city gender     mean       sd   n       se
1    1      1 127.1158 44.64347  95 4.580324
2    1      2 125.0154 44.83492 130 3.932283
3    2      1 130.3178 41.00359 107 3.963967
4    2      2 128.1608 43.33454 143 3.623816
5    3      1 121.1419 45.02290 148 3.700859
6    3      2 120.1220 45.01031 123 4.058443
7    4      1 126.6769 38.33233 130 3.361968
8    4      2 125.6129 44.46168 124 3.992777

# with extra = FALSE (the default): mean and sd only
ddply(my_data, c("city", "gender"), my_function, dv="income", extra=F)

  city gender     mean       sd
1    1      1 127.1158 44.64347
2    1      2 125.0154 44.83492
3    2      1 130.3178 41.00359
4    2      2 128.1608 43.33454
5    3      1 121.1419 45.02290
6    3      2 120.1220 45.01031
7    4      1 126.6769 38.33233
8    4      2 125.6129 44.46168
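
The same forwarding works with base sapply(): any arguments given after the function are handed on to it through `...`. A minimal sketch, assuming a made-up helper `scaled_mean` and its `scale` parameter (neither comes from the answer above):

```r
# Hypothetical helper: mean of a vector multiplied by a scale factor
scaled_mean <- function(x, scale = 1, na.rm = FALSE) {
  mean(x, na.rm = na.rm) * scale
}

# Arguments after the function name are forwarded to scaled_mean for each column
plain  <- sapply(iris[-5], scaled_mean)
scaled <- sapply(iris[-5], scaled_mean, scale = 10, na.rm = TRUE)

print(plain)
print(scaled)
```

Because the extra arguments are matched by name, they can be given in any order after the function.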