Recreating a Minitab normal probability plot
I am trying to recreate the following plot with R. Minitab describes this as a normal probability plot.
The probplot function gets you most of the way there. Unfortunately, I cannot figure out how to add the confidence interval bands around this plot.
Similarly, ggplot's stat_qq() seems to present similar information with a transformed x axis. It seems that geom_smooth() would be the likely candidate to add the bands, but I haven't figured that out.
Finally, the Getting Genetics Done guys describe something similar here.
Sample data to recreate the plot above:
x <- c(40.2, 43.1, 45.5, 44.5, 39.5, 38.5, 40.2, 41.0, 41.6, 43.1, 44.9, 42.8)
If anyone has a solution with base graphics or ggplot, I'd appreciate it!
EDIT
After looking at the details of probplot, I've determined this is how it generates the fit line on the graph:
> xl <- quantile(x, c(0.25, 0.75))
> yl <- qnorm(c(0.25, 0.75))
> slope <- diff(yl)/diff(xl)
> int <- yl[1] - slope * xl[1]
> slope
75%
0.4151
> int
75%
-17.36
Indeed, these results agree very well with what is stored in the probplot object:
> check <- probplot(x)
> str(check)
List of 3
$ qdist:function (p)
$ int : Named num -17.4
..- attr(*, "names")= chr "75%"
$ slope: Named num 0.415
..- attr(*, "names")= chr "75%"
- attr(*, "class")= chr "probplot"
>
However, incorporating this information into ggplot2 or base graphics does not yield the same results.
probplot(x)
Versus:
ggplot(data = df, aes(x = x, y = y)) + geom_point() + geom_abline(intercept = int, slope = slope)
I get similar results using R's base graphics:
plot(df$x, df$y)
abline(int, slope, col = "red")
Lastly, I've learned that the last two rows of the legend refer to the Anderson-Darling test for normality and can be reproduced with the nortest package.
> ad.test(x)
Anderson-Darling normality test
data: x
A = 0.2303, p-value = 0.7502
Comments (5)
Try the qqPlot function in the QTLRel package.
Perhaps this will be something you can build on. By default, stat_smooth() uses level=0.95.
You are using the incorrect "y" values: they should be normal quantiles (labeled with probabilities). The following shows the line in the right spot:
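The answer's own code did not survive on this page. As a minimal base-R sketch of the idea (not the answer's original code), the sorted data go on the x axis and the theoretical normal quantiles from qnorm(ppoints(n)) on the y axis, with the y axis relabeled in percent. The name draw_probplot and the choice of probabilities for the axis labels are assumptions made for illustration.

```r
# Sample data from the question.
x <- c(40.2, 43.1, 45.5, 44.5, 39.5, 38.5, 40.2, 41.0, 41.6, 43.1, 44.9, 42.8)

# Plot positions: sorted data on x, theoretical normal quantiles on y.
df <- data.frame(x = sort(x), y = qnorm(ppoints(length(x))))

# Line through the first and third quartiles, as in the question's edit.
xl <- quantile(x, c(0.25, 0.75))
yl <- qnorm(c(0.25, 0.75))
slope <- unname(diff(yl) / diff(xl))
int <- unname(yl[1] - slope * xl[1])

# Probabilities used only to label the y axis (an illustrative choice).
probs <- c(0.01, 0.05, seq(0.1, 0.9, by = 0.1), 0.95, 0.99)

# Hypothetical helper: draws the points, the percent axis, and the fit line.
draw_probplot <- function() {
  plot(df$x, df$y, yaxt = "n", ylim = range(qnorm(probs)),
       xlab = "x", ylab = "Percent")
  axis(2, at = qnorm(probs), labels = 100 * probs, las = 1)
  abline(int, slope, col = "red")
}

# Only draw when run interactively, so the script also works in batch mode.
if (interactive()) draw_probplot()
```

With y defined this way, the slope and intercept from the question land on the points, because both the points and the line now live on the normal-quantile scale.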
To add the confidence bounds as in Minitab, you can do the following
and add the following two lines to your ggplot from above (in addition, replace the slope-and-intercept line with the estimated percentiles):
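The code for this step is also missing from the page. One possible way to get such bounds, sketched here in base R under stated assumptions, is to fit a normal distribution by maximum likelihood and put a Wald-type 95% interval around each estimated percentile. The standard errors below are the usual asymptotic ones for a normal fit; whether this reproduces Minitab's bands exactly is not verified here, and the names bounds and draw_bounds are hypothetical.

```r
# Sample data from the question.
x <- c(40.2, 43.1, 45.5, 44.5, 39.5, 38.5, 40.2, 41.0, 41.6, 43.1, 44.9, 42.8)
n <- length(x)

# Maximum-likelihood estimates for a normal fit (sd uses divisor n, not n - 1).
mu_hat <- mean(x)
sigma_hat <- sqrt(mean((x - mu_hat)^2))

# Asymptotic standard errors of the two MLEs; they are uncorrelated.
se_mu <- sigma_hat / sqrt(n)
se_sigma <- sigma_hat / sqrt(2 * n)

# Probabilities at which percentiles are estimated (an illustrative choice).
probs <- c(0.01, 0.05, seq(0.1, 0.9, by = 0.1), 0.95, 0.99)
z <- qnorm(probs)

# Estimated percentiles of the fitted normal and their approximate std. errors.
xp_hat <- mu_hat + z * sigma_hat
se_xp <- sqrt(se_mu^2 + z^2 * se_sigma^2)

bounds <- data.frame(
  prob = probs,
  nquant = z,
  xp = xp_hat,
  lower = xp_hat + qnorm(0.025) * se_xp,
  upper = xp_hat + qnorm(0.975) * se_xp
)

# Hypothetical helper: points, fitted percentile line, and dashed 95% bounds.
draw_bounds <- function() {
  plot(sort(x), qnorm(ppoints(n)), yaxt = "n",
       xlim = range(bounds$lower, bounds$upper), ylim = range(z),
       xlab = "x", ylab = "Percent")
  axis(2, at = z, labels = 100 * probs, las = 1)
  lines(bounds$xp, bounds$nquant, col = "red")
  lines(bounds$lower, bounds$nquant, lty = 2)
  lines(bounds$upper, bounds$nquant, lty = 2)
}

if (interactive()) draw_bounds()
```

The bands are narrowest at the median and widen toward the tails, because the uncertainty in the estimated standard deviation is scaled by the squared normal quantile.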
I know it's an old question, but for others who are still looking for a solution, have a look at ggqqplot from the ggpubr package.
[It's related to the answer from Julie B: above]
https://stackoverflow.com/a/9215532/5885615
This is an old topic, but someone may still want to do this (I did recently).
I found one issue that produces slightly different results between R and Minitab: the QQ plots are similar, but the end points are shifted further outward. After digging into the code, I found the difference:
The function "ppoints" is used to assign plotting positions (probabilities) to the sorted sample:
In R it has the following source code:
where the parameter "a" depends on "n" and can be 3/8 or 1/2.
Minitab uses a = 0.3 for all "n".
The most visible effect is on the end points of the sample.
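The source listing did not survive on this page, so here is a small sketch of the formula that R's ppoints documents, (i - a) / (n + 1 - 2a) for i = 1, ..., n, with the default a = 3/8 when n <= 10 and a = 1/2 otherwise. The helper name plot_positions is hypothetical, and a = 0.3 for Minitab is taken from this answer rather than checked against Minitab itself.

```r
# Plotting positions as documented for stats::ppoints.
plot_positions <- function(n, a) (seq_len(n) - a) / (n + 1 - 2 * a)

n <- 12  # size of the sample in the question

# R's default for n = 12 is a = 1/2.
p_r <- ppoints(n)

# a = 0.3, the value this answer reports for Minitab.
p_minitab <- ppoints(n, a = 0.3)

# Compare the two sets of positions and the implied normal quantiles.
comparison <- data.frame(
  i = seq_len(n),
  p_r = p_r,
  p_minitab = p_minitab,
  z_r = qnorm(p_r),
  z_minitab = qnorm(p_minitab)
)
print(comparison[c(1, n), ])
```

For n = 12 the first plotting position is 0.5/12 (about 0.0417) with a = 1/2 and 0.7/12.4 (about 0.0565) with a = 0.3, so the largest differences between the two conventions are at the first and last points, while the middle of the sample barely moves.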