glm starting values not accepted with log link

Posted 2024-12-17 03:43:21 · 568 words · 3 views · 0 comments

I want to run a Gaussian GLM with a log link and an offset.
The following problems arise:

y <- c(1,1,0,0)
t <- c(5,3,2,4)

No problem:

exp(coef(glm(y~1 +  offset(log(t)), family=poisson)))

With family=gaussian, starting values need to be specified. That works here:

exp(coef(glm(y~1, family=gaussian(link=log), start=0)))

but does not work here:

exp(coef(glm(y~1 +  offset(log(t)), family=gaussian(link=log), start=0)))

Error in eval(expr, envir, enclos) : cannot find valid starting values: please specify some

Does anyone see what's wrong (hopefully just in my coding)?

Comments (2)

宁愿没拥抱 2024-12-24 03:43:21

Here are the results of some archaeology that explains what's going on, deep within the glm function:

Debugging (with debug("glm")) and stepping through the function shows that it fails at the following call:

if (length(offset) && attr(mt, "intercept") > 0L) {
  fit$null.deviance <- eval(call(if (is.function(method)) "method" else method, 
    x = X[, "(Intercept)", drop = FALSE], y = Y, weights = weights, 
    offset = offset, family = family, control = control, 
    intercept = TRUE))$deviance
}

This is an attempt to calculate the null deviance for the model. It's only evaluated if there's an intercept term and an offset term (I'm not sure why; it may be that the default null deviance calculated by the previous call to glm is wrong in that case and must be recalculated?). It calls glm.fit (the default value of method), but without starting values because these are usually unnecessary for the intercept-only model.
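The refit described above can be imitated outside glm by calling glm.fit directly on an intercept-only design matrix, with the offset but without any starting values. This is only a minimal sketch using the data from the question; the names X and null_fit are illustrative, and the behaviour of glm itself may differ between R versions.

```r
y <- c(1, 1, 0, 0)
t <- c(5, 3, 2, 4)

# Intercept-only design matrix, playing the role of
# X[, "(Intercept)", drop = FALSE] in the call shown above.
X <- cbind("(Intercept)" = rep(1, length(y)))

# Same kind of call as the null-deviance refit: offset and family are
# passed, but no start, etastart or mustart. The error is captured
# rather than raised so the script keeps running.
null_fit <- tryCatch(
  glm.fit(x = X, y = y, offset = log(t),
          family = gaussian(link = "log"), intercept = TRUE),
  error = function(e) e
)

# Report whether the refit failed and, if so, with which message.
inherits(null_fit, "error")
if (inherits(null_fit, "error")) conditionMessage(null_fit)
```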

Debugging inside glm.fit shows where it stops: while evaluating the initialization code supplied by the family (here gaussian()), we reach:

  if (is.null(etastart) && is.null(start) && is.null(mustart) && 
    ((family$link == "inverse" && any(y == 0)) || (family$link == 
        "log" && any(y <= 0))))
    stop("cannot find valid starting values: please specify some")

Here the starting values were not passed through to the null fit, a log link is used, and some y values are equal to zero, so the check triggers and the fit fails. This should therefore happen if (and only if?) an offset and an intercept are both specified, a log link is used, and the response contains zero (or negative) values.
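To make the combination of conditions concrete, the guard quoted above can be copied into a small predicate and tried on a few cases. needs_start is a hypothetical helper written for this illustration; it is not part of R and only mirrors the quoted condition.

```r
# Hypothetical helper that mirrors the guard quoted above.
needs_start <- function(y, link, start = NULL, etastart = NULL, mustart = NULL) {
  is.null(etastart) && is.null(start) && is.null(mustart) &&
    ((link == "inverse" && any(y == 0)) || (link == "log" && any(y <= 0)))
}

y <- c(1, 1, 0, 0)
link <- gaussian(link = "log")$link   # the character string "log"

needs_start(y, link)              # TRUE: zeros in y and no starting values
needs_start(y, link, start = 0)   # FALSE: a starting value was supplied
needs_start(y + 1, link)          # FALSE: all responses are positive
needs_start(y, "identity")        # FALSE: the guard only covers log and inverse
```

The second case is the situation of the main fit in the question, where start = 0 is supplied; the first is the situation of the null-deviance refit, which receives no starting values.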

If you dump("glm",file="glmtemp.R"); add the line

    start = start[1], etastart = etastart[1], mustart = mustart[1],

to the call that fits the null deviance (i.e. the one shown above); and source("glmtemp.R"), it seems to work OK ... I think this should be a reasonable general solution. If anyone wants to bring this issue up on the R development list, feel free.
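If patching glm is not an option, one possible workaround is to call glm.fit directly, since the failing refit is issued by glm and not by glm.fit. The sketch below assumes that only the coefficient is of interest; X and fit are illustrative names.

```r
y <- c(1, 1, 0, 0)
t <- c(5, 3, 2, 4)

# Intercept-only design matrix with a named column.
X <- cbind("(Intercept)" = rep(1, length(y)))

# Fit the Gaussian log-link model with offset log(t), passing the
# starting value straight to glm.fit.
fit <- glm.fit(x = X, y = y, offset = log(t),
               family = gaussian(link = "log"), start = 0)

exp(fit$coefficients)   # back-transformed intercept
fit$converged           # whether the IRLS iterations converged
```

The result is a plain list, not a "glm" object, so methods such as summary() do not apply to it directly. As far as I can tell, the null.deviance component returned by glm.fit does not account for the offset, so it should not be relied on here.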

眉黛浅 2024-12-24 03:43:21

It looks like start isn't being recognised when offset is present. You are trying to take the log of 0 in the y values, which is -Inf. glm obviously cannot deal with this when looking for a solution unless start gives it some help. A small perturbation in your y values will permit a solution.

exp(coef(glm(I(y+.Machine$double.eps)~1 + offset(log(t)), family=gaussian(link=log))))
(Intercept) 
  0.1481481
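The reported value can be cross-checked without glm. For an intercept-only Gaussian model with a log link and offset log(t), the fitted mean is t * exp(b), so minimising the residual sum of squares gives exp(b) = sum(t * y) / sum(t^2). A short sketch, assuming the perturbation by .Machine$double.eps is negligible; rate_hat is an illustrative name.

```r
y <- c(1, 1, 0, 0)
t <- c(5, 3, 2, 4)

# Least-squares estimate of exp(intercept) when the mean is t * exp(b).
rate_hat <- sum(t * y) / sum(t^2)
rate_hat   # equals 8/54
```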