The normal distribution is a continuous distribution described by a probability density function, so the probability of any single exact value is 0. That is because the total probability (= 1) is spread over infinitely many values.
What the graph of the normal density shows is how the probability is distributed (y-axis) over the values (x-axis). What you can get from it is the probability of an interval: between two points, from -infinity to any point, or from any point to +infinity. You obtain that probability by integrating the density from point1 to point2.
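As a rough illustration of the integration described above, the sketch below approximates the area under a normal density between two points with a simple trapezoidal rule and compares it with the difference of two CDF values. It uses Python's standard `statistics.NormalDist` rather than MATLAB; the mean, standard deviation, interval endpoints, and the helper name `interval_probability` are made up for the example and are not from the question.

```python
from statistics import NormalDist

def interval_probability(dist, a, b, steps=10_000):
    """Approximate P(a <= X <= b) by trapezoidal integration of the density."""
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for i in range(1, steps):
        total += dist.pdf(a + i * width)
    return total * width

# Assumed example values: mean 0, standard deviation 1, interval [-1, 1].
dist = NormalDist(mu=0.0, sigma=1.0)
by_integration = interval_probability(dist, -1.0, 1.0)
by_cdf = dist.cdf(1.0) - dist.cdf(-1.0)

print(f"integral of the density: {by_integration:.6f}")
print(f"CDF difference:          {by_cdf:.6f}")
```

Both numbers should agree to several decimal places (roughly 0.6827 for one standard deviation either side of the mean), which is why a table of CDF values can replace the integral.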
But you don't have to do this integral yourself, since you have the z table. The z table gives you the probability that the variable lies between -infinity and x, after converting x to z with z = (x - mean) / (standard deviation).
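A minimal sketch of that lookup, assuming made-up values for the mean and standard deviation: x is first standardized to z, and the z-table entry is then computed from the error function in Python's `math` module. The function names `standard_normal_cdf` and `prob_below` are hypothetical helpers for this example.

```python
import math

def standard_normal_cdf(z):
    """Phi(z): the value a z table lists, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_below(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma): standardize, then look up Phi(z)."""
    z = (x - mu) / sigma
    return standard_normal_cdf(z)

# Assumed example: mu = 10, sigma = 2, so x = 12 is one standard deviation above the mean.
p = prob_below(12.0, mu=10.0, sigma=2.0)
print(f"P(X <= 12) = {p:.4f}")
```

For an interval, subtract two such values: P(a < X <= b) = prob_below(b, ...) - prob_below(a, ...).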
I don't have MATLAB here, but I guess the straight line you mention is the cumulative distribution function, which gives the probability that the variable falls in (-infinity, x]. It is determined by the sum (an integral in this case) of the density from -infinity to x, which is also the value you would look up in the z table.
Sorry if my English was bad. I hope I was helpful.
My question is, how do I interpret this? Does this mean that my data is normally distributed but has a non-zero mean (i.e. not standard normal) or does this probability only reflect something else?
You are correct. If you run normplot and the data points lie very close to the fitted line, your data has a cumulative distribution function that is very close to that of a normal distribution. The 0.5 CDF point corresponds to the mean of the fitted normal distribution (it looks like about 0.002 in your case).
The reason you get a straight line is that the y-axis is nonlinear: it is "warped" in such a way that a perfect Gaussian cumulative distribution maps onto a straight line. The y-axis marks are spaced linearly in the inverse of the normal CDF, which is a scaled inverse error function.
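The warping can be sketched in Python (used here in place of MATLAB, with only the standard library). Each cumulative probability p is mapped through the inverse standard normal CDF, which equals sqrt(2) * erfinv(2p - 1). The mean of 0.002 echoes the value read off the plot above; the standard deviation of 0.01 is an assumption made purely for illustration.

```python
from statistics import NormalDist

# Assumed parameters: 0.002 echoes the mean read off the plot; sigma is made up.
mu, sigma = 0.002, 0.01
data_dist = NormalDist(mu, sigma)
std_normal = NormalDist()  # mean 0, standard deviation 1

xs = [mu + sigma * k for k in (-2, -1, 0, 1, 2)]
for x in xs:
    p = data_dist.cdf(x)        # cumulative probability: S-shaped in x
    z = std_normal.inv_cdf(p)   # warped y-coordinate: linear in x
    print(f"x = {x:+.3f}   CDF = {p:.4f}   warped y = {z:+.3f}")

# On the warped axis every point satisfies z = (x - mu) / sigma, a straight line
# that crosses z = 0 (CDF = 0.5) at x = mu.
```

Equal steps in x give equal steps in the warped coordinate, even though the CDF values themselves are not equally spaced.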
If the points at the ends follow a steeper slope than the fitted line (with the data on the x-axis and probability on the y-axis, as normplot draws it), your distribution has shorter tails than a normal distribution, i.e. there are fewer outliers, perhaps because some physical constraint prevents excessive variation from the mean.
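To see the short-tail effect in numbers, the sketch below builds an idealized short-tailed sample (exact quantiles of a uniform distribution on [-1, 1], chosen here only as an example of a bounded distribution) and compares the slope of the probability-plot curve in the middle with the slope at the ends. The plotting positions (i - 0.5) / n are one common convention and are an assumption; normplot may use a different one.

```python
from statistics import NormalDist

std_normal = NormalDist()
n = 200
# Plotting positions (i - 0.5) / n are one common convention, assumed here.
probs = [(i - 0.5) / n for i in range(1, n + 1)]

# Idealized short-tailed sample: exact quantiles of a uniform distribution on [-1, 1].
xs = [2.0 * p - 1.0 for p in probs]
# Warped y-coordinates, as on a normal probability plot.
zs = [std_normal.inv_cdf(p) for p in probs]

def slope(i, j):
    """Slope of the plotted curve between points i and j (warped y over data x)."""
    return (zs[j] - zs[i]) / (xs[j] - xs[i])

middle = slope(n // 2 - 1, n // 2)
left_end = slope(0, 1)
right_end = slope(n - 2, n - 1)
print(f"slope in the middle: {middle:.2f}")
print(f"slope at the ends:   {left_end:.2f}, {right_end:.2f}")
```

The end slopes come out much larger than the middle slope: the data run out of room before the normal quantiles do, which is the shape described above.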