For loop alternative for progressive operations

Posted on 2024-10-19 11:07:36 · 428 characters · 4 views · 0 comments

I have to apply a regression function progressively to time series data (vectors "time" and "tm"), and I'm using a for loop as follows:

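top <- length(time)
slope <- rep(NA_real_, top)   # preallocate; slope[1] stays NA
for (k in 2:top) {
    # fit on the first k observations only
    lin.regr <- lm(tm[1:k] ~ log(time[1:k]))
    slope[k] <- coef(lin.regr)[2]
}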

But for vectors of length about 10k it becomes very slow.
Is there a faster alternative (maybe using an apply function)?

As a simpler version of the problem: if I have a vector like x<-c(1:10), how can I build a vector y containing (for example) the progressive sum of the x values?
Like:

x
1 2 3 4 5 6 7 8 9 10
y
1  3  6 10 15 21 28 36 45 55

陪我终i 2024-10-26 11:07:36

Well, there is no faster alternative to the loop unless you can vectorize. In some circumstances functions like ave, aggregate, ddply, tapply, ... can give you a substantial win, but often the trick lies in using faster functions, like cumsum (cf. the answer of user615147).

To illustrate:

top <- 1000
tm <- rnorm(top,10)   
time <- rnorm(top,10)

> system.time(
+ results <- sapply(2:top,function (k) coef(lm(tm[1:k] ~ log(time[1:k])))[2])
+ )
   user  system elapsed 
   4.26    0.00    4.27 

> system.time(
+ results <- lapply(2:top,function (k) coef(lm(tm[1:k] ~ log(time[1:k])))[2])
+ )
   user  system elapsed 
   4.25    0.00    4.25 

> system.time(
+ results <- for(k in 2:top) coef(lm(tm[1:k] ~ log(time[1:k])))[2]
+ )
   user  system elapsed 
   4.25    0.00    4.25 

> system.time(
+ results <- for(k in 2:top) lm.fit(matrix(log(time[1:k]),ncol=1),
+                                 tm[1:k])$coefficients[2]
+ )
   user  system elapsed 
   0.43    0.00    0.42

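The only faster solution is lm.fit(). Don't be misled by the small differences: timings vary a bit on every run, so a difference of 0.02 seconds is not meaningful. sapply, lapply and for are all equally fast here; the gain comes from lm.fit(), which skips the formula and model-frame overhead of lm(). Two caveats about the timing code above: a for loop returns NULL, so results is empty in the last two runs, and lm.fit() does not add an intercept, so with a one-column matrix coefficients[2] is NA. To get the actual slope, pass cbind(1, log(time[1:k])) as the design matrix.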
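As a minimal sketch of the lm.fit() approach with an explicit intercept column, the loop below stores each slope in a preallocated vector and checks it against lm(). The seed, the smaller value of top, and the names x, slope_fit and slope_lm are assumptions made for illustration; they are not part of the original timings.

```r
set.seed(1)                       # assumed seed, only for reproducibility
top  <- 200                       # smaller than above to keep the check quick
tm   <- rnorm(top, 10)
time <- rnorm(top, 10)            # mean 10, sd 1: positive in practice, so log() is safe

x <- log(time)
slope_fit <- rep(NA_real_, top)   # slopes from lm.fit()
slope_lm  <- rep(NA_real_, top)   # slopes from lm(), for comparison

for (k in 2:top) {
  X <- cbind(1, x[1:k])           # design matrix: intercept column + predictor
  slope_fit[k] <- lm.fit(X, tm[1:k])$coefficients[2]
  slope_lm[k]  <- coef(lm(tm[1:k] ~ x[1:k]))[2]
}

# Both approaches should agree on every window
print(all.equal(slope_fit[-1], slope_lm[-1]))
```

The first element of each vector stays NA because a slope needs at least two observations.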

If you have a data frame called Data, you could use something like:

Data <- data.frame(Y=rnorm(top),X1=rnorm(top),X2=rnorm(top))

mf <- model.matrix(Y~X1+X2,data=Data)
results <- sapply(2:top, function(k)
  lm.fit(mf[1:k,],Data$Y[1:k])$coefficients[2]
)

as a more general solution.
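For the single-predictor case in the question, the vectorization mentioned at the start of this answer can be made concrete with cumsum. The sketch below relies on the closed-form least-squares slope, (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2), evaluated on running sums, so no model is fitted inside a loop. The function name expanding_slope and the seed are hypothetical, and centering x and y first is an added precaution for numerical stability (the slope does not change when either variable is shifted by a constant).

```r
# Hypothetical helper: slope of y ~ x on the first k observations, for every k
expanding_slope <- function(x, y) {
  n <- seq_along(x)
  x <- x - mean(x)                 # centering leaves the slope unchanged
  y <- y - mean(y)
  sx  <- cumsum(x)                 # running sum of x
  sy  <- cumsum(y)                 # running sum of y
  sxy <- cumsum(x * y)             # running sum of x*y
  sxx <- cumsum(x * x)             # running sum of x^2
  slope <- (n * sxy - sx * sy) / (n * sxx - sx^2)
  slope[1] <- NA_real_             # one point does not define a slope
  slope
}

set.seed(1)                        # assumed seed, only for reproducibility
top  <- 200
tm   <- rnorm(top, 10)
time <- rnorm(top, 10)
x    <- log(time)

slopes <- expanding_slope(x, tm)

# Spot check against lm() on one window
k <- 50
print(c(vectorized = slopes[k], lm = unname(coef(lm(tm[1:k] ~ x[1:k]))[2])))
```

This only covers simple regression with one predictor; for several predictors the lm.fit() loop above remains the more general route.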

￡冰雨忧蓝° 2024-10-26 11:07:36

results <- sapply(2:top,function (k) coef(lm(tm[1:k] ~ log(time[1:k])))[2])

The apply family of functions is a compact way to iterate in R, although it is not inherently faster than a for loop.

You can also look at using lm.fit() to speed up your regression a bit.

cumsum(1:10)

is how to answer the second question.
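A short sketch of the cumulative sum on the vector from the question; the variable names y and y2 are chosen here for illustration. The Reduce() call is included only to show what cumsum computes, as cumsum itself is the faster and more idiomatic choice.

```r
x <- 1:10
y <- cumsum(x)                            # running total of x
print(y)
# 1  3  6 10 15 21 28 36 45 55

# Equivalent accumulating reduction, shown for comparison
y2 <- Reduce(`+`, x, accumulate = TRUE)
print(all(y == y2))
```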
