Can lapply not modify variables in a higher scope?

Posted on 2024-08-29 09:02:07

I often want to do essentially the following:

mat <- matrix(0,nrow=10,ncol=1)
lapply(1:10, function(i) { mat[i,] <- rnorm(1,mean=i)})

But I would expect mat to end up with 10 random numbers in it; instead it still contains only zeros. (I am not worried about the rnorm part. Clearly there is a right way to do that. I am worried about affecting mat from within an anonymous function passed to lapply.) Can I not affect the matrix mat from inside lapply? Why not? Is there a scoping rule in R that is blocking this?

Comments (3)

不如归去 2024-09-05 09:02:08

I discussed this issue in this related question: "Is R’s apply family more than syntactic sugar". You will notice that if you look at the function signature for for and apply, they have one critical difference: a for loop evaluates an expression, while an apply loop evaluates a function.

If you want to alter things outside the scope of an apply function, then you need to use <<- or assign. Or more to the point, use something like a for loop instead. But you really need to be careful when working with things outside of a function because it can result in unexpected behavior.
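To make the difference concrete, here is a minimal sketch (using a 3-row matrix and deterministic values purely for illustration; the names outer_after_lapply and outer_after_for are made up for this example). It shows that a plain `<-` inside the function only changes a local copy, while the same assignment in a for loop changes mat itself:

```r
mat <- matrix(0, nrow = 3, ncol = 1)

# Inside the function, `mat[i, ] <- i` reads the outer mat, modifies a copy,
# and binds that copy to a *local* variable called mat. The outer mat is untouched.
invisible(lapply(1:3, function(i) { mat[i, ] <- i }))
outer_after_lapply <- mat   # still all zeros

# A for loop body is an expression evaluated in the calling environment,
# so the same assignment modifies mat directly.
for (i in 1:3) mat[i, ] <- i
outer_after_for <- mat      # rows now hold 1, 2, 3
```

Replacing `<-` with `<<-` in the lapply version would reach the outer mat, at the cost of introducing a side effect.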

In my opinion, one of the primary reasons to use an apply function is explicitly because it doesn't alter things outside of it. This is a core concept in functional programming, wherein functions avoid having side effects. This is also a reason why the apply family of functions can be used in parallel processing (and similar functions exist in the various parallel packages such as snow).
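As a rough sketch of that side-effect-free style: the function only returns a value, and the caller decides what to do with the results. vapply is swapped in here instead of lapply so the result is already a numeric vector; set.seed and the name draws are assumptions added only to keep the example reproducible and readable.

```r
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible

# The anonymous function touches nothing outside itself; it just returns one draw.
draws <- vapply(1:10, function(i) rnorm(1, mean = i), numeric(1))

# The caller assembles the container from the returned values.
mat <- matrix(draws, nrow = 10, ncol = 1)
```

Because each call is independent of the others, this shape is also the one that carries over most naturally to parallel variants of the apply functions.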

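Lastly, the right way to run your code example is to also pass the parameters in to your function, like so, and assign the output back:

mat <- matrix(0,nrow=10,ncol=1)
# each call changes only its local copy of mat and returns the assigned value;
# unlist() flattens the resulting list so that matrix() yields a numeric matrix
mat <- matrix(unlist(lapply(1:10, function(i, mat) { mat[i,] <- rnorm(1,mean=i)}, mat=mat)))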

It is always best to be explicit about a parameter when possible (hence the mat=mat) rather than inferring it.

迷爱 2024-09-05 09:02:08

One of the main advantages of higher-order functions like lapply() or sapply() is that you don't have to initialize your "container" (matrix in this case).

As Fojtasek suggests:

as.matrix(lapply(1:10,function(i) rnorm(1,mean=i)))

Alternatively:

do.call(rbind,lapply(1:10,function(i) rnorm(1,mean=i)))

Or, simply as a numeric vector:

sapply(1:10,function(i) rnorm(1,mean=i))

If you really want to modify a variable above the scope of your anonymous function (the random number generator in this instance), use <<-:

> mat <- matrix(0,nrow=10,ncol=1)
> invisible(lapply(1:10, function(i) { mat[i,] <<- rnorm(1,mean=i)}))
> mat
           [,1]
 [1,] 1.6780866
 [2,] 0.8591515
 [3,] 2.2693493
 [4,] 2.6093988
 [5,] 6.6216346
 [6,] 5.3469690
 [7,] 7.3558518
 [8,] 8.3354715
 [9,] 9.5993111
[10,] 7.7545249

See this post about <<-. But in this particular example, a for-loop would just make more sense:

mat <- matrix(0,nrow=10,ncol=1)
for( i in 1:10 ) mat[i,] <- rnorm(1,mean=i)

with the minor cost of creating an indexing variable, i, in the global workspace.
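That cost can be seen in a short sketch (the names i_left_behind and last_i are hypothetical, introduced only to capture the observation):

```r
mat <- matrix(0, nrow = 10, ncol = 1)
for (i in 1:10) mat[i, ] <- rnorm(1, mean = i)

# Unlike a function argument, the loop variable persists after the loop ends
# and keeps the last value it took.
i_left_behind <- exists("i")
last_i <- i
```

If the leftover variable matters, rm(i) after the loop removes it.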

↘紸啶 2024-09-05 09:02:08

Instead of actually altering mat, lapply just returns the values produced by the function (as a list). You just need to assign that result to mat and turn it back into a matrix using as.matrix().
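One detail worth checking when assigning the result back, sketched here with hypothetical names res, list_mat and num_mat: as.matrix() on a list keeps the list type, so the cells are list elements rather than plain numbers. Flattening with unlist() first gives an ordinary numeric matrix.

```r
res <- lapply(1:10, function(i) rnorm(1, mean = i))

# 10 x 1 matrix, but of mode "list": each cell holds a length-1 list element
list_mat <- as.matrix(res)

# 10 x 1 numeric matrix, usually what is wanted for arithmetic
num_mat <- matrix(unlist(res), ncol = 1)
```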
