Warning: "intersecting linear/additive predictors" when fitting a non-proportional odds ordinal regression model with vglm()

Posted on 2025-01-23 13:15:14

I am fitting a partial proportional odds cumulative logit ordinal regression model. The response is an ordinal diagnosis and the predictors are two urinary biomarkers. I fit the model using the following commands:

library(VGAM)
fit <- vglm(diagnosis ~ creatinine + LYVE1, data = urine.dat,
    family = cumulative(parallel = FALSE))
summary(fit)

Afterwards, I often get about 20 of the following warnings:

In slot(family, "validparams")(eta, y, extra = extra) :
    It seems that the nonparallelism assumption has resulted in    
    intersecting linear/additive predictors.  
    Try propodds() or fitting a partial nonproportional odds model or  
    choosing some other link function, etc.

Does anyone understand what is meant by "intersecting linear/additive predictors"? From what I have seen, this warning is issued very often with non-proportional odds VGLM models. I am just trying to understand what the issue with the model is.

Any insight would be helpful.

Comments (1)

栀子花开つ 2025-01-30 13:15:14

If you look through the VGAM source code, you'll find the following piece that throws the warning:

    probs <-
      if ( .reverse ) {
        ccump <- cbind(1, eta2theta(eta, .link , earg = .earg ))
        cbind(-tapplymat1(ccump, "diff"), ccump[, ncol(ccump)])
      } else {
        cump <- cbind(eta2theta(eta, .link , earg = .earg ), 1)
        cbind(cump[, 1], tapplymat1(cump, "diff"))
      }
    okay1 <- all(is.finite(probs)) && all(0 < probs & probs < 1)
    if (!okay1)
      warning("It seems that the nonparallelism assumption has ",
              "resulted in intersecting linear/additive ",
              "predictors.  Try propodds() or fitting a partial ",
              "nonproportional odds model or choosing ",
              "some other link function, etc.")

We can boil the guts of this down to a couple of different pieces:

library(VGAM)
#> Loading required package: stats4
#> Loading required package: splines
fit <- vglm(cyl ~ wt, data = mtcars,
            family = cumulative(parallel = FALSE))

eta <- predict(fit, type="link")
cump <- cbind(VGAM:::eta2theta(eta, logitlink ), 1)
probs <- cbind(cump[, 1], tapplymat1(cump, "diff"))

Since cyl has three values, eta is an N x 2 matrix of predicted values on the link scale: one column for logit P(Y<=1) and one for logit P(Y<=2). cump is the matrix of cumulative probabilities calculated in the usual way for ordered logit, with a final column of 1s for the last category. probs is the matrix of category probabilities calculated in the usual way for ordered logit, by subtracting the previous cumulative probability from the current one. Once these are calculated, a flag is generated to identify whether all probabilities are finite and strictly between 0 and 1:
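The same calculation can be reproduced by hand in base R, without VGAM. This is only a sketch: the eta values below are made up for illustration (they are not the fitted values from mtcars), and plogis() stands in for the inverse logit link.

```r
# Hypothetical link-scale predictions for 3 observations, 3-level response.
# Column 1 is logit P(Y <= 1), column 2 is logit P(Y <= 2).
# In row 3 the first predictor exceeds the second (3.0 > 2.0).
eta <- cbind(c(-2.0, 0.5, 3.0),
             c(-1.0, 1.5, 2.0))

# Cumulative probabilities, with P(Y <= 3) = 1 appended as the last column
cump <- cbind(plogis(eta), 1)

# Category probabilities: first cumulative probability, then the
# differences between neighbouring cumulative probabilities
probs <- cbind(cump[, 1], cump[, -1] - cump[, -ncol(cump)])

# The same validity check that VGAM applies
okay <- all(is.finite(probs)) && all(0 < probs & probs < 1)
probs
okay
```

Each row of probs still sums to 1, but the middle entry of row 3 is negative because P(Y<=1) is larger than P(Y<=2) there, so okay comes out FALSE.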

okay1 <- all(is.finite(probs)) && all(0 < probs & probs < 1)
okay1
#> [1] FALSE

In this case okay1 is FALSE. We can see why below:

all(is.finite(probs))
#> [1] TRUE
all(0 < probs & probs < 1)
#> [1] FALSE

It's because some of the predicted probabilities are negative. We can see which ones below:

ind <- which(probs < 0 | probs > 1, arr.ind=TRUE)[,1]
ind
#> row 
#>  16

probs[ind, ]
#>                    logitlink(P[Y<=2])                    
#>       1.253506e-04      -2.167395e-12       9.998746e-01

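Notice that in row 16 the predicted probability for the second category is negative (though not much different from zero). This is what "intersecting linear/additive predictors" refers to: with parallel=FALSE each cumulative logit gets its own slope, so the fitted lines for logit P(Y<=1) and logit P(Y<=2) are not parallel and can cross within the range of the data. Wherever the first is larger than the second, P(Y<=1) exceeds P(Y<=2), and the category probability computed as their difference is negative. The takeaway is that even though you have specified parallel=FALSE, the resulting model is incompatible with the underlying requirement that cumulative probabilities be non-decreasing. The warning itself suggests propodds(), a partial non-proportional odds model, or another link function. Another option is a different model that doesn't calculate probabilities this way, like multinomial logit. For example: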

fit2 <- vglm(cyl ~ wt, data = mtcars,
             family = multinomial())

This fit doesn't throw a warning, because multinomial probabilities are calculated in a way that won't allow them to be outside [0,1], so long as the exponentiated predicted values on the link scale are finite.
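A rough base-R sketch of why that holds: multinomial logit probabilities are exponentiated linear predictors divided by their row sum, so they are positive and sum to 1 by construction. The eta values here are invented, and treating the last category as the reference (linear predictor fixed at 0) is an assumption made for the illustration.

```r
# Hypothetical link-scale values for a 3-category multinomial logit.
# Assumption: the last category is the reference, with predictor 0.
eta <- cbind(c(-5.0, 0.3, 8.0),
             c( 4.0, -2.0, 7.5))

# Exponentiate and normalise each row
expo  <- exp(cbind(eta, 0))
probs <- expo / rowSums(expo)

all(0 < probs & probs < 1)   # TRUE for any finite eta of moderate size
rowSums(probs)               # each row sums to 1
```

No ordering constraint links the columns of eta here, which is why there is nothing for the predictors to "intersect".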

Created on 2022-04-26 by the reprex package (v2.0.1)
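To see where the "intersecting" wording in the warning comes from, here is a small base-R sketch with made-up coefficients (a and b are hypothetical, not estimates from any fitted model). Two cumulative logits with different slopes are two non-parallel lines in x, and past their crossing point the implied probability of the middle category turns negative.

```r
# Hypothetical non-parallel cumulative logit model:
# eta_j(x) = a[j] + b[j] * x for j = 1, 2
a <- c(-1, 1)
b <- c(0.5, -0.5)

# The two lines meet where a[1] + b[1] * x equals a[2] + b[2] * x
x_cross <- (a[2] - a[1]) / (b[1] - b[2])   # 2 for these coefficients

x    <- c(0, 1, 2, 3, 4)
eta1 <- a[1] + b[1] * x   # logit P(Y <= 1)
eta2 <- a[2] + b[2] * x   # logit P(Y <= 2)

# P(Y = 2) = P(Y <= 2) - P(Y <= 1); negative wherever eta1 > eta2
p2 <- plogis(eta2) - plogis(eta1)
data.frame(x, eta1, eta2, p2)
```

With equal slopes (the proportional odds case) the lines never cross, so an ordering of the intercepts is enough to keep every category probability positive.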
