R cdplot() - does the right axis show probability or density?

Posted on 2025-01-27 22:44:53

Reproducible data:

## NASA space shuttle o-ring failures
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1,
                 1, 2, 1, 1, 1, 1, 1),
               levels = 1:2, labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

## CD plot
cdplot(fail ~ temperature)

The documentation for cdplot says:

cdplot computes the conditional densities of x given the levels of y weighted by the marginal distribution of y. The densities are derived cumulatively over the levels of y. The conditional probabilities are not derived by discretization (as in the spinogram), but using a smoothing approach via density. The conditional density functions (cumulative over the levels of y) are returned invisibly.

So on the plot where x = 63, y = 0.4 (approximately). Is this probability, or probability density? I am confused by the documentation as to what is calculated, what is returned and what is plotted.

Comments (1)

ま昔日黯然 2025-02-03 22:44:53

The plot shows the probability of an outcome for a given temperature.

What the docs are saying is that a kernel density estimate is computed for all of the temperature measurements, and a second density is estimated separately for the temperatures where fail is "no". If we divide the density of the "no" temperatures by the density of all temperatures, then weight the result by the overall proportion of "no" outcomes, we get an estimate of the probability of drawing a "no" at a given temperature.
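In symbols, that ratio is Bayes' rule, P(no | x) = f(x | no) P(no) / f(x), which is a probability between 0 and 1 rather than a density. A minimal sketch of checking this against what cdplot returns, assuming the data from the question; `cdens` and `p_no` are names introduced here for illustration:

```r
## Data from the question
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1,
                 1, 2, 1, 1, 1, 1, 1),
               levels = 1:2, labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

## cdplot() returns a list of functions invisibly; plot = FALSE skips drawing
cdens <- cdplot(fail ~ temperature, plot = FALSE)

## First element: the estimated P(fail == "no" | temperature)
p_no <- cdens[[1]](53:81)

## A probability stays within [0, 1]; a density has no such upper bound
range(p_no)

## Estimated probability of failure ("yes") at 63 degrees
1 - cdens[[1]](63)
```

With only two levels there is a single returned function, and the probability of the second level is its complement.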

To show this is the case, let's see the cdplot:

cdplot(fail ~ temperature)

Now let's calculate the probabilities from the marginal densities manually and plot them. We should get a near-identical shape to the curve in the cdplot:

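# density of all temperatures, evaluated over the observed range
all <- density(temperature, from = min(temperature), to = max(temperature))

# density of the "no" temperatures on the same grid
no  <- density(temperature[fail == "no"], from = min(temperature),
               to = max(temperature))

# estimated P(no | temperature): density ratio weighted by the share of "no"
probs <- no$y/all$y * proportions(table(fail))[1]

# plot the complement, the estimated P(yes | temperature)
plot(all$x, 1 - probs, type = "l", ylim = c(0, 1))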
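The manual curve above lets density() pick a bandwidth separately for each call, which may be one reason the match is only approximate. The sketch below reuses one bandwidth and one evaluation grid for both estimates and overlays the function returned by cdplot. That cdplot shares a single bandwidth internally is an assumption here, not something the quoted documentation states, and `all_d`, `no_d`, `p_no_manual` and `cdens` are illustrative names:

```r
## Data from the question
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1,
                 1, 2, 1, 1, 1, 1, 1),
               levels = 1:2, labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

## Density of all temperatures with the default bandwidth and grid
all_d <- density(temperature)

## Density of the "no" temperatures, forced onto the same bandwidth and grid
no_d <- density(temperature[fail == "no"], bw = all_d$bw,
                from = min(all_d$x), to = max(all_d$x))

## Estimated P(no | temperature): density ratio times the share of "no"
p_no_manual <- no_d$y / all_d$y * mean(fail == "no")

## Function returned by cdplot for the first level ("no")
cdens <- cdplot(fail ~ temperature, plot = FALSE)

## Overlay both curves on the observed temperature range
plot(all_d$x, p_no_manual, type = "l", xlim = range(temperature),
     ylim = c(0, 1), xlab = "temperature", ylab = "P(no | temperature)")
lines(all_d$x, cdens[[1]](all_d$x), lty = 2, col = 2)

## Largest gap between the manual curve and the cdplot curve
max(abs(p_no_manual - cdens[[1]](all_d$x)))
```

If the two curves coincide, the right axis of the cdplot can be read directly as the estimated conditional probability of the first level.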
