Quantile regression and p-values

Posted 2024-11-13 09:47:27

I am applying quantile regression to my data set (using R). It is easy to produce a nice scatterplot with regression lines for several quantiles
(taus <- c(0.05,0.25,0.75,0.95)).

The problem occurs when I want to produce p-values for each of these quantiles (in order to see the statistical significance of each regression line). For the median (tau=0.5) this is not a problem, but for tau=0.25, for example, I get the following error message:

> QRmodel <- rq(y ~ x, tau = 0.25, model = TRUE)
> summary(QRmodel, se = "nid")
Error in summary.rq(QRmodel, se = "nid") : tau - h < 0:  error in summary.rq

What could be the reason for this?

Also: is it advisable to report p-values and coefficients for a quantile regression model, or is it enough to show just the plot and discuss the results based on it?

Best regards, frustrated person

灼疼热情 2024-11-20 09:47:27

A good way to learn what's going on in this sort of debugging situation is to find the portion of code that throws the error. If you type 'summary.rq' at the console, you'll see the code for the function summary.rq. Scanning through it, you'll find the section where it calculates standard errors (se) using the "nid" method, starting with this code:

else if (se == "nid") {
    h <- bandwidth.rq(tau, n, hs = hs)
    if (tau + h > 1) 
        stop("tau + h > 1:  error in summary.rq")
    if (tau - h < 0) 
        stop("tau - h < 0:  error in summary.rq")
    bhi <- rq.fit.fnb(x, y, tau = tau + h)$coef
    blo <- rq.fit.fnb(x, y, tau = tau - h)$coef

So what's happening here is that, in order to calculate the standard errors, the function first needs to calculate a bandwidth, h, and the quantreg model is then refit at tau + h and tau - h. If adding or subtracting h pushes the quantile above 1 or below 0, the refit is impossible, so the function stops. Because the bandwidth shrinks as the sample size n grows, this is most likely to happen with small samples or with values of tau close to 0 or 1.
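To see how the bandwidth compares with tau for a given sample size, here is a minimal sketch that calls bandwidth.rq the same way the excerpt above does. The sample sizes are hypothetical, and hs = TRUE is assumed to match the default in summary.rq. The last column only evaluates the condition checked in the quoted code; other versions of quantreg may handle this case differently.

```r
library(quantreg)  # provides bandwidth.rq()

tau <- 0.25
n <- c(10, 20, 50, 100, 1000)  # hypothetical sample sizes

# Same call as in the quoted summary.rq code (hs = TRUE assumed)
h <- sapply(n, function(k) bandwidth.rq(tau, k, hs = TRUE))

# tau - h < 0 is the condition that triggers the stop() in the excerpt
check <- data.frame(n = n, h = h, tau_minus_h = tau - h,
                    tau_minus_h_negative = (tau - h) < 0)
print(check)
```

If the row for your own n shows a negative tau - h, the "nid" refit at tau - h is what fails, not the fit at tau itself.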

You have a few options:

1.) Try a different se method, for example bootstrapping (se = "boot").

2.) Modify the summary.rq code yourself to force it to use max(tau - h, 0) or min(tau + h, 1) in the instances where the bandwidth pushes tau out of bounds. (There could be serious theoretical reasons why this is a bad idea; not advised unless you know what you're doing.)

3.) Read up on the theory behind the calculation of these standard errors, so you have a better idea of when they work well and when they don't. This might shed some light on why you're running into errors with small samples or with values of tau near 0 or 1.
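As a rough illustration of option 1, the sketch below fits a quantile regression on simulated data and requests bootstrap standard errors. The data, the seed, and the number of replications R = 500 are assumptions made for the example, not values from the question. The "nid" call is wrapped in tryCatch because whether it errors depends on the sample size and on the quantreg version installed.

```r
library(quantreg)

set.seed(1)                  # hypothetical simulated data
n <- 15                      # deliberately small sample
x <- runif(n)
y <- 1 + 2 * x + rnorm(n)

fit <- rq(y ~ x, tau = 0.25)

# "nid" may stop with the error from the question; capture it instead of failing
nid_result <- tryCatch(summary(fit, se = "nid"),
                       error = function(e) conditionMessage(e))

# Bootstrap standard errors do not need the refits at tau + h and tau - h
s_boot <- summary(fit, se = "boot", R = 500)
print(s_boot$coefficients)   # Value, Std. Error, t value, Pr(>|t|)
```

The p-values are in the last column of the coefficient table. With a sample this small they will vary noticeably from one bootstrap run to the next, so it is worth fixing the seed and using a reasonably large R.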
