How do I draw a normal distribution curve over a relative frequency histogram?

Posted on 2025-02-07 04:48:35 · 1280 characters · 4 views · 0 comments

Basically, I have to plot a relative frequency histogram (which I've done), but I also have to plot a normal distribution curve over it. No matter how I do it, the curve always comes out scaled for absolute frequency rather than relative frequency.

This is what I have so far:

set.seed(1099)

N <- 1520
n_1 <- 4
n_2 <- 30
n_3 <- 76
Valor_esperado = (8 + 12)/2
Variancia = (12-8)^2/12

Amostra_1 <- matrix(runif(N*n_1, min = 8, max = 12),
                    nrow = n_1)

Amostra_2 <- matrix(runif(N*n_2, min = 8, max = 12),
                    nrow = n_2)

Amostra_3 <- matrix(runif(N*n_3, min = 8, max = 12),
                    nrow = n_3)


media_1 <- colMeans(Amostra_1)
media_2 <- colMeans(Amostra_2)
media_3 <- colMeans(Amostra_3)


Amostra_1 <- as.numeric(unlist(media_1))
Amostra_2 <- as.numeric(unlist(media_2))
Amostra_3 <- as.numeric(unlist(media_3))

#par(mfrow=c(2,2))

h <- hist(Amostra_1, plot=FALSE)
h$density = h$counts/sum(h$counts) * 100
plot(h, main="n = 4",
     xlab = NULL,
     ylab="Frequência Relativa",
     col="blue",
     freq=FALSE)


h <- hist(Amostra_2, plot=FALSE)
h$density = h$counts/sum(h$counts) * 100
plot(h, main="n = 30",
     xlab = NULL,
     ylab="Frequência Relativa",
     col="red",
     freq=FALSE)

h <- hist(Amostra_3, plot=FALSE)
h$density = h$counts/sum(h$counts) * 100
plot(h, main="n = 76",
     xlab = NULL,
     ylab="Frequência Relativa",
     col="yellow",
     freq=FALSE)


Comments (1)

追星践月 2025-02-14 04:48:35

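Given the histogram you've defined, where the bar heights are percentages (h$counts/sum(h$counts) * 100), the bars cover a total area of 100*binwidth. You therefore need a Gaussian curve that integrates to 100*binwidth rather than 1. This should do it (for example):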

binwidth <- diff(h$breaks)[1]
curve(dnorm(x, mean = mean(Amostra_1), 
            sd = sd(Amostra_1)) * binwidth*100, 
      add = TRUE)
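The scaling factor can be checked numerically. The following is a minimal sketch, assuming equal-width bins (the default for hist()) and regenerating Amostra_1 from the setup in the question; the names pct, bar_area, and curve_area are illustrative and not part of the original code.

```r
set.seed(1099)
# Means of 1520 samples of size 4 from Uniform(8, 12), as in the question
Amostra_1 <- colMeans(matrix(runif(1520 * 4, min = 8, max = 12), nrow = 4))

h <- hist(Amostra_1, plot = FALSE)
pct <- h$counts / sum(h$counts) * 100   # bar heights in percent
binwidth <- diff(h$breaks)[1]           # assumes all bins have the same width

# Total area covered by the percentage bars
bar_area <- sum(pct * diff(h$breaks))

# Area under the scaled normal curve, integrated over mean +/- 10 sd
m <- mean(Amostra_1)
s <- sd(Amostra_1)
scaled_normal <- function(x) dnorm(x, mean = m, sd = s) * binwidth * 100
curve_area <- integrate(scaled_normal, lower = m - 10 * s, upper = m + 10 * s)$value

# Both areas should agree with 100 * binwidth (up to numerical tolerance)
print(c(bar_area = bar_area, curve_area = curve_area, target = 100 * binwidth))
```

Both areas matching 100*binwidth is what makes the curve and the bars comparable on the same y-axis.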

In this particular case the top of the curve gets clipped, because the y-axis of the histogram is based only on the bar heights (bin densities) and does not consider the peak of the theoretical curve. The simple/crude way to fix this is to add ylim = c(0, max(h$density)*1.1) when plotting your histogram, which extends the maximum a bit. A more "correct", slightly more annoying way is to compute max(h$density), compute the maximum of the theoretical curve, and use the larger of these two values when setting ylim. The curve peaks at the mean, so its maximum is dnorm(0, sd = sd(Amostra_1))*binwidth*100, which equals dnorm(mean(Amostra_1), mean = mean(Amostra_1), sd = sd(Amostra_1))*binwidth*100.
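The "correct" ylim approach could look like the sketch below for the n = 4 case. It assumes equal-width bins and regenerates Amostra_1 from the question's setup; curve_peak and y_max are illustrative names, and the extra 5% headroom is an arbitrary choice.

```r
set.seed(1099)
# Means of 1520 samples of size 4 from Uniform(8, 12), as in the question
Amostra_1 <- colMeans(matrix(runif(1520 * 4, min = 8, max = 12), nrow = 4))

h <- hist(Amostra_1, plot = FALSE)
h$density <- h$counts / sum(h$counts) * 100   # percentages, as in the question
binwidth <- diff(h$breaks)[1]                 # assumes equal-width bins

m <- mean(Amostra_1)
s <- sd(Amostra_1)

# Peak of the scaled normal curve (the density is highest at the mean)
curve_peak <- dnorm(m, mean = m, sd = s) * binwidth * 100

# Use whichever is taller, the highest bar or the curve peak
y_max <- max(max(h$density), curve_peak)

plot(h, main = "n = 4",
     xlab = NULL,
     ylab = "Frequência Relativa",
     col = "blue",
     freq = FALSE,
     ylim = c(0, y_max * 1.05))   # 5% headroom above the tallest element
curve(dnorm(x, mean = m, sd = s) * binwidth * 100, add = TRUE)
```

The same steps apply to Amostra_2 and Amostra_3; binwidth has to be recomputed from each histogram, since hist() may choose different breaks for each sample.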
