Error when using "loess.smooth" but not "loess" or "lowess"

Posted 2024-10-11 19:50:36

I need to smooth some simulated data, but occasionally run into problems when the simulated ordinates to be smoothed are mostly the same value. Here is a small reproducible example of the simplest case.

> x <- 0:50
> y <- rep(0,51)
> loess.smooth(x,y)
Error in simpleLoess(y, x, w, span, degree, FALSE, FALSE, normalize = FALSE,  : 
   NA/NaN/Inf in foreign function call (arg 1)

loess(y~x), lowess(x,y), and their analogue in MATLAB produce the expected results without error on this example. I am using loess.smooth here because I need the estimates evaluated at a set number of points. According to the documentation, I believe loess.smooth and loess are using the same estimation functions, but the former is an "auxiliary function" to handle the evaluation points. The error seems to come from a C function:

> traceback()
3: .C(R_loess_raw, as.double(pseudovalues), as.double(x), as.double(weights), 
   as.double(weights), as.integer(D), as.integer(N), as.double(span), 
   as.integer(degree), as.integer(nonparametric), as.integer(order.drop.sqr), 
   as.integer(sum.drop.sqr), as.double(span * cell), as.character(surf.stat), 
   temp = double(N), parameter = integer(7), a = integer(max.kd), 
   xi = double(max.kd), vert = double(2 * D), vval = double((D + 
       1) * max.kd), diagonal = double(N), trL = double(1), 
   delta1 = double(1), delta2 = double(1), as.integer(0L))
2: simpleLoess(y, x, w, span, degree, FALSE, FALSE, normalize = FALSE, 
   "none", "interpolate", control$cell, iterations, control$trace.hat)
1: loess.smooth(x, y)

loess also calls simpleLoess, but with what appear to be different arguments. Of course, if enough of the y values are nonzero, loess.smooth runs without error, but I need the program to run in even the most extreme case.

Hopefully, someone can help me with one and/or all of the following:

  1. Understand why only loess.smooth, and not the other functions, produces this error and find a solution for this problem.
  2. Find a work-around using loess but still evaluating the estimate at a specified number of points that can differ from the vector x. For example, I might want to use only x <- seq(0,50,10) in the smoothing, but evaluate the estimate at x <- 0:50. As far as I know, using predict with a new data frame will not properly handle this situation, but please let me know if I am missing something there.
  3. Handle the error in a way that doesn't stop the program from moving on to the next simulated data set.

Thanks in advance for any help on this problem.

Comments (2)

呆橘 2024-10-18 19:50:36

For part 1:
This took a bit of tracking down, but if you do:

loess.smooth(x, y, family = "gaussian")

the model will fit. The difference arises from the different defaults of loess.smooth and loess: the former has family = c("symmetric", "gaussian"), whilst the latter has the order reversed, so loess defaults to "gaussian" and loess.smooth defaults to the robust "symmetric" fit. If you trawl through the code for loess and loess.smooth, you'll see that when family = "gaussian", iterations is set to 1; otherwise it takes the value loess.control()$iterations. When simpleLoess runs more than one iteration, the following function call returns a vector of NaN for your data:

pseudovalues <- .Fortran(R_lowesp, as.integer(N), as.double(y), 
            as.double(z$fitted.values), as.double(weights), as.double(robust), 
            integer(N), pseudovalues = double(N))$pseudovalues

That in turn causes the next function call to throw the error you saw:

zz <- .C(R_loess_raw, as.double(pseudovalues), as.double(x), 
            as.double(weights), as.double(weights), as.integer(D), 
            as.integer(N), as.double(span), as.integer(degree), 
            as.integer(nonparametric), as.integer(order.drop.sqr), 
            as.integer(sum.drop.sqr), as.double(span * cell), 
            as.character(surf.stat), temp = double(N), parameter = integer(7), 
            a = integer(max.kd), xi = double(max.kd), vert = double(2 * 
                D), vval = double((D + 1) * max.kd), diagonal = double(N), 
            trL = double(1), delta1 = double(1), delta2 = double(1), 
            as.integer(0L))

This all relates to robust fitting in loess (the method). If you don't want or need a robust fit, use family = "gaussian" in your loess.smooth call.

Also, note that the defaults for loess.smooth differ from those of loess, e.g. for span (2/3 versus 0.75) and degree (1 versus 2). So check carefully which model you want to fit and adjust the relevant function's arguments accordingly.
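
As a minimal sketch of the fix described above, the call below forces the non-robust fit and uses the evaluation argument of loess.smooth to choose the number of evaluation points. The value 51 is only an illustration, not something the question specified.

```r
# Constant response: the case from the question.
x <- 0:50
y <- rep(0, 51)

# family = "gaussian" makes loess.smooth run a single, non-robust
# iteration, so the robustness pseudovalues are never computed.
fit <- loess.smooth(x, y, family = "gaussian", evaluation = 51)

# fit is a list with components x (the evaluation grid) and
# y (the smoothed estimates on that grid).
str(fit)
```

If you want the same model that loess would fit by default, you would presumably also need to pass span and degree explicitly, since the two functions differ there as well.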

For part 2:

DF <- data.frame(x = 0:50, y = rep(0,51))
mod <- loess(y ~ x, data = DF)
pred <- predict(mod, newdata = data.frame(x = c(-1, 10, 15, 55)))
mod2 <- loess(y ~ x, data = DF, control = loess.control(surface = "direct"))
pred2 <- predict(mod2, newdata = data.frame(x = c(-1, 10, 15, 55)))

Which gives:

> pred
 1  2  3  4 
NA  0  0 NA 
> pred2
1 2 3 4 
0 0 0 0

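The default fit won't extrapolate, if that is what you meant: points outside the range of x come back as NA unless the model is fitted with surface = "direct". Other than that, I don't see what the problem with using predict here is.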
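
To make the case from the question concrete (fit on a coarse grid, evaluate on a finer one), here is a rough sketch. The spacing of 5 and the sine-shaped response are made up for illustration; only loess and predict come from the answer above.

```r
# Fit on a coarse grid of x values (illustrative data, not from the question).
x_fit <- seq(0, 50, by = 5)
y_fit <- sin(x_fit / 10)
mod <- loess(y ~ x, data = data.frame(x = x_fit, y = y_fit))

# Evaluate the fitted curve on a finer grid inside the fitted range.
x_eval <- 0:50
pred <- predict(mod, newdata = data.frame(x = x_eval))

# One estimate per evaluation point; none are NA because every
# evaluation point lies within the range of x_fit.
length(pred)
```

The column name in newdata has to match the predictor name used in the formula (x here); that is the only link between the fitting grid and the evaluation grid.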

For part 3:
Look at ?try and ?tryCatch. Either can be wrapped around the loess fitting function (loess.smooth, say), which allows computations to continue if loess.smooth throws an error.

You will need to handle the output of try or tryCatch, for example with something like this (if you are doing this in a loop):

mod <- try(loess.smooth(x, y))
if(inherits(mod, "try-error"))
    next
## if we get here, the model fitted; do something with `mod`

I would probably combine try or tryCatch with fitting via loess and using predict for such a problem.
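
A sketch of that combination, under stated assumptions: the two simulated data sets, the evaluation grid, and the helper name fit_one are all invented for illustration. Whether the flat data set actually fails may depend on your R version, so the code handles both outcomes.

```r
x <- 0:50
# Two hypothetical simulated data sets: a flat one and a step.
sims <- list(flat = rep(0, 51), step = c(rep(0, 25), rep(1, 26)))
eval_grid <- data.frame(x = seq(0, 50, length.out = 101))

# Hypothetical helper: fit one data set and predict on eval_grid.
# Returns NULL instead of stopping if the fit throws an error.
fit_one <- function(y) {
  tryCatch({
    # family = "symmetric" requests the robust fit, the variant
    # that the question found fragile for constant data.
    mod <- loess(y ~ x, data = data.frame(x = x, y = y),
                 family = "symmetric")
    predict(mod, newdata = eval_grid)
  }, error = function(e) {
    message("skipping data set: ", conditionMessage(e))
    NULL
  })
}

# lapply keeps going even if an individual fit fails.
results <- lapply(sims, fit_one)
```

Failed fits show up as NULL entries in results, so they can be dropped afterwards with something like Filter(Negate(is.null), results).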

输什么也不输骨气 2024-10-18 19:50:36

This is the first time I have encountered these functions, so I can't help you that much, but could this have something to do with the y values having a variance of 0? You are trying to estimate a smooth line from data that are already as smooth as they can get. This, by contrast, does work:

x <- 0:50
y <- c(rep(0,25),rep(1,26))
loess.smooth(x,y)
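
If zero variance is indeed the trigger, one possible workaround is to short-circuit that case before calling loess.smooth. This is only a sketch: smooth_or_constant is a hypothetical helper, and it only covers an exactly constant response, not the "mostly the same value" case from the question.

```r
# Hypothetical wrapper: return the constant directly when y has no
# variation, otherwise defer to loess.smooth.
smooth_or_constant <- function(x, y, evaluation = 50, ...) {
  ok <- !(is.na(x) | is.na(y))
  if (length(unique(y[ok])) == 1L) {
    # Same output shape as loess.smooth: a list with x and y.
    new_x <- seq(min(x[ok]), max(x[ok]), length.out = evaluation)
    return(list(x = new_x, y = rep(y[ok][1], evaluation)))
  }
  loess.smooth(x, y, evaluation = evaluation, ...)
}

res <- smooth_or_constant(0:50, rep(0, 51))
```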