Can this be sped up without Rcpp?

Posted on 2024-12-15 09:56:26


I'm looking to speed up the following algorithm. I give the function an xts time series and then want to perform a principal components analysis for each time point on the previous X points (I'm using 500 at the moment) and then use the results of that PCA (5 principal components in the following code) to compute some value. Something like this:

lookback <- 500
ans <- numeric(nrow(x))              # preallocate the result vector
for (i in (lookback + 1):nrow(x)) {
    x.now <- x[(i - lookback):i]     # window of the previous lookback points
    x.prcomp <- prcomp(x.now)
    ans[i] <- (some R code on x.prcomp)
}

I assume this would require me to replicate the lookback rows as columns so that x would be something like cbind(x,lag(x),lag(x,k=2),lag(x,k=3)...lag(x,k=lookback)), and then run prcomp on each line? This seems expensive though. Perhaps some variant of apply? I'm willing to look into Rcpp but wanted to run this by you guys before that.
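There is no need to replicate the lookback rows as lagged columns: the loop can be written directly as an `sapply` over the window end-points, calling `prcomp` on each row-slice. A minimal sketch on simulated data (smaller than the real 2000x24 set), where the per-window statistic is an assumed placeholder, the share of variance captured by the first 5 principal components, standing in for "some R code on x.prcomp":

```r
set.seed(1)
x <- matrix(rnorm(300 * 24), nrow = 300, ncol = 24)  # toy stand-in for the xts data
lookback <- 100

# Placeholder statistic (an assumption): variance share of the first 5 PCs
pc_stat <- function(p) sum(p$sdev[1:5]^2) / sum(p$sdev^2)

ans <- sapply((lookback + 1):nrow(x), function(i) {
  pc_stat(prcomp(x[(i - lookback):i, ]))
})
```

This does no more work than the explicit for loop (the prcomp calls dominate either way), but it returns the result vector directly and is the shape that parallel apply functions expect.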

Edit: Wow thanks for all the responses. Info on my dataset/algorithm:

  1. dim(x.xts) currently = 2000x24. But eventually, if this shows promise, it will have to run fast (I'll give it multiple datasets).
  2. func(x.xts) takes ~70 seconds. That's 2000 − 500 = 1500 prcomp calls, each on a freshly created 500x24 data frame.

I attempted to use Rprof to see what was the most expensive part of the algo but it's my first time using Rprof so I need some more experience with this tool to get intelligible results (thanks for the suggestion).
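For reference, the basic Rprof workflow is short; a sketch that profiles just the prcomp part of the loop on simulated data with the question's dimensions (the rest of the loop body is placeholder code, so it is omitted here):

```r
set.seed(1)
x <- matrix(rnorm(2000 * 24), nrow = 2000, ncol = 24)  # simulated 2000x24 data
lookback <- 500

Rprof(prof_file <- tempfile())            # start sampling into a temp file
for (i in (lookback + 1):nrow(x)) {
  p <- prcomp(x[(i - lookback):i, ])
}
Rprof(NULL)                               # stop profiling

# Ranked by self-time: the rows at the top are the hot spots
head(summaryRprof(prof_file)$by.self)
```

`$by.self` attributes time to the function actually executing, while `$by.total` includes time spent in callees; for a loop like this, `$by.self` usually points straight at the expensive primitive (here, the SVD inside prcomp).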

I think I will first attempt to roll this into an _apply type loop, and then look at parallelizing.
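That plan, an _apply loop handed to a parallel backend, can be sketched with base R's parallel package; the per-window statistic is again an assumed placeholder (variance share of the first 5 PCs) and the data is simulated:

```r
library(parallel)

set.seed(1)
x <- matrix(rnorm(300 * 24), nrow = 300, ncol = 24)  # toy stand-in for the xts data
lookback <- 100

# Placeholder statistic (an assumption): variance share of the first 5 PCs
pc_stat <- function(i) {
  p <- prcomp(x[(i - lookback):i, ])
  sum(p$sdev[1:5]^2) / sum(p$sdev^2)
}

cl <- makeCluster(2)                      # PSOCK cluster, portable across OSes
clusterExport(cl, c("x", "lookback"))     # ship the data to the workers once
ans <- parSapply(cl, (lookback + 1):nrow(x), pc_stat)
stopCluster(cl)
```

Because each window overlaps the next in 499 of its 500 rows, exporting `x` once and passing only the index `i` keeps the per-task communication negligible; the speedup is then roughly the core count, since the prcomp calls are independent.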


Comments (1)

荒人说梦 2024-12-22 09:56:26


On my 4 core desktop, if this wouldn't complete in a reasonable time-frame, I would run the chunk using something along the lines of (not tested):

library(snowfall)
sfInit(parallel = TRUE, cpus = 4, type = "SOCK")
lookback <- 500
sfExport(list = c("lookback", "x"))   # make the data visible on the workers
sfLibrary(xts)

output.object <- sfSapply(x = (lookback + 1):nrow(x),
    fun = function(i, my.object = x, lb = lookback) {
        x.now <- my.object[(i - lb):i]
        x.prcomp <- prcomp(x.now)
        ans <- ("some R code on x.prcomp")
        return(ans)
    }, simplify = FALSE) # or maybe it's TRUE? depends on what ans is

sfStop() # shut the worker processes down when done