Can this be sped up without Rcpp?
I'm looking to speed up the following algorithm. I give the function an xts time series and then want to perform a principal components analysis for each time point on the previous X points (I'm using 500 at the moment) and then use the results of that PCA (5 principal components in the following code) to compute some value. Something like this:
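lookback <- 500
ans <- rep(NA_real_, nrow(x))  # preallocate; assumes one numeric value per time point
for (i in (lookback + 1):nrow(x)) {
  # note: (i - lookback):i spans lookback + 1 rows, including row i itself
  x.now <- x[(i - lookback):i]
  x.prcomp <- prcomp(x.now)
  ans[i] <- (some R code on x.prcomp)
}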
I assume this would require me to replicate the lookback rows as columns, so that x would be something like cbind(x,lag(x),lag(x,k=2),lag(x,k=3)...lag(x,k=lookback)), and then run prcomp on each line? This seems expensive though. Perhaps some variant of apply? I'm willing to look into Rcpp but wanted to run this by you guys before that.
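A minimal sketch of the apply-style variant, assuming the windows can be taken by row index so that no lagged copy of x is needed. The simulated matrix stands in for coredata(x.xts), and both rolling_pca and pc_share are hypothetical names; pc_share is only a placeholder for the unspecified "some R code on x.prcomp".

```r
# Simulated stand-in for coredata(x.xts): 600 rows x 24 columns (assumption)
set.seed(1)
x <- matrix(rnorm(600 * 24), nrow = 600, ncol = 24)

# Hypothetical placeholder for "some R code on x.prcomp":
# share of total variance captured by the first n_pc components
pc_share <- function(p, n_pc = 5) {
  v <- p$sdev^2
  sum(v[seq_len(n_pc)]) / sum(v)
}

# Hypothetical helper: one PCA per window, same indexing as the loop in the question
rolling_pca <- function(x, lookback, stat = pc_share) {
  idx <- (lookback + 1):nrow(x)
  out <- rep(NA_real_, nrow(x))
  out[idx] <- vapply(idx, function(i) {
    window <- x[(i - lookback):i, , drop = FALSE]  # row slice, no lagged columns
    stat(prcomp(window))
  }, numeric(1))
  out
}

ans <- rolling_pca(x, lookback = 500)
```

vapply mostly tidies the loop rather than speeding it up. If most of the time is spent inside prcomp or in subsetting the xts object, working on a plain matrix is probably where any gain would come from.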
Edit: Wow, thanks for all the responses. Info on my dataset/algorithm:
- dim(x.xts) is currently 2000x24. But eventually, if this shows promise, it will have to run fast (I'll give it multiple datasets).
- func(x.xts) takes ~70 seconds. That's 2000 - 500 = 1500 prcomp calls, plus 1500 creations of a 500x24 data frame.
I attempted to use Rprof to see what was the most expensive part of the algo, but it's my first time using Rprof, so I need some more experience with this tool to get intelligible results (thanks for the suggestion).
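For reference, a minimal sketch of an Rprof session around a loop of this shape. The data are simulated (an assumption, sized like the 2000x24 series described above), and the profile is written to a temporary file.

```r
# Simulated data standing in for the real series (assumption)
set.seed(1)
x <- matrix(rnorm(2000 * 24), nrow = 2000, ncol = 24)
lookback <- 500

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)    # start sampling the call stack
for (i in (lookback + 1):nrow(x)) {
  p <- prcomp(x[(i - lookback):i, , drop = FALSE])
}
Rprof(NULL)                          # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.self)    # time spent inside each function itself
head(prof$by.total)   # time including everything each function calls
```

by.self is usually the easier table to start with: the functions at the top are where the sampler most often found the interpreter, which hints at whether the PCA itself or the surrounding subsetting dominates.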
I think I will first attempt to roll this into an _apply type loop, and then look at parallelizing.
Comments (1)
On my 4-core desktop, if this wouldn't complete in a reasonable time frame, I would run the chunk using something along the lines of (not tested):
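The snippet this answer refers to is not reproduced here. Purely as an illustration, and not the answerer's code, a minimal sketch of spreading the windows over several cores with R's bundled parallel package might look like the following. The data are simulated, window_stat is a hypothetical per-window computation, and the worker count of 2 is an arbitrary choice.

```r
library(parallel)

# Simulated stand-in for coredata(x.xts) (assumption)
set.seed(1)
x <- matrix(rnorm(600 * 24), nrow = 600, ncol = 24)
lookback <- 500
idx <- (lookback + 1):nrow(x)

# Hypothetical per-window computation: variance share of the first 5 components.
# The matrix argument is called `mat` because parLapply already uses `x` internally.
window_stat <- function(i, mat, lookback) {
  p <- prcomp(mat[(i - lookback):i, , drop = FALSE])
  v <- p$sdev^2
  sum(v[1:5]) / sum(v)
}

cl <- makeCluster(2)   # 2 workers here; a 4-core desktop could use 4
res <- tryCatch(
  parSapply(cl, idx, window_stat, mat = x, lookback = lookback),
  finally = stopCluster(cl)   # always shut the workers down
)

ans <- rep(NA_real_, nrow(x))
ans[idx] <- res
```

Each window is independent of the others, which is what makes this loop easy to split. Whether it pays off depends on how the per-window cost compares with the overhead of shipping the matrix to each worker.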