The expected probability of randomly selecting a given element from a set of n elements is P = 1.0/n.
Suppose I estimate P using an unbiased method sufficiently many times. What type of distribution does the estimate follow? Clearly P is not normally distributed, since it cannot be negative. May I therefore assume that P is gamma distributed? If so, what are the parameters of this distribution?
A histogram of the observed probabilities of selecting an element from a 100-element set over 1000 selections is shown here.
Is there any way to convert this to a standard distribution?
Now suppose that the observed probability of selecting the given element was P* (P* != P). How can I estimate whether the bias is statistically significant?
EDIT: This is not homework. I'm doing a hobby project and I need this piece of statistics for it. I did my last homework ~10 years ago :-)
With repetitions, your distribution will be binomial. Let X be the number of times you select some fixed object out of N, with M total selections:
P{ X = x } = ( M choose x ) * (1/N)^x * ((N-1)/N)^(M-x)
You may find this difficult to compute when M is large. For a sufficiently large number of selections M, the distribution is well approximated by a normal distribution (Central Limit Theorem).
In that case P{X=x} is approximately given by a normal distribution with mean M/N and variance M * (1/N) * ((N-1)/N).
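To make the comparison concrete, here is a minimal sketch that evaluates the exact binomial probability next to the normal density with the mean and variance given above. The function names and the values M = 1000, N = 100 (taken from the question's example) are only illustrative choices, not part of the original answer.

```python
import math

def binomial_pmf(x, m, n):
    """P{X = x}: a fixed element is picked x times in m uniform draws from n elements."""
    p = 1.0 / n
    return math.comb(m, x) * p**x * (1.0 - p)**(m - x)

def normal_approx(x, m, n):
    """Normal density with mean m/n and variance m*(1/n)*((n-1)/n), evaluated at x."""
    mean = m / n
    var = m * (1.0 / n) * ((n - 1) / n)
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Illustrative values from the question: 1000 selections from 100 elements.
M, N = 1000, 100
for x in (5, 10, 15):
    print(x, binomial_pmf(x, M, N), normal_approx(x, M, N))
```

The two columns should agree best near the mean M/N. The approximation tends to be poorer in the tails and when M/N is small, because the binomial is then noticeably skewed.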
This is a clear binomial distribution with p=1/(number of elements) and n=(number of trials).
To test whether the observed result differs significantly from the expected result, you can do the binomial test.
The dice examples on the two Wikipedia pages should give you some good guidance on how to formulate your problem. In your 100-element, 1000-trial example, that would be like rolling a 100-sided die 1000 times.
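As a rough sketch of what an exact two-sided binomial test could look like using only the Python standard library: the p-value below is the total probability of all outcomes that are no more likely than the observed count. That is one common definition of a two-sided p-value, not the only one. The function names and the observed count of 20 are hypothetical, chosen just for illustration. If SciPy is available, scipy.stats.binomtest is probably the more convenient route.

```python
import math

def binomial_pmf(k, trials, p):
    """P{X = k} for X ~ Binomial(trials, p), computed in log space (needs 0 < p < 1)."""
    log_pmf = (math.lgamma(trials + 1) - math.lgamma(k + 1) - math.lgamma(trials - k + 1)
               + k * math.log(p) + (trials - k) * math.log1p(-p))
    return math.exp(log_pmf)

def binomial_test_two_sided(k_observed, trials, p):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    observed = binomial_pmf(k_observed, trials, p)
    # Small relative tolerance so floating-point noise does not drop tied outcomes.
    threshold = observed * (1.0 + 1e-7)
    total = sum(q for q in (binomial_pmf(k, trials, p) for k in range(trials + 1))
                if q <= threshold)
    return min(1.0, total)

# Hypothetical data: one element was selected 20 times in 1000 draws from 100 elements.
p_value = binomial_test_two_sided(20, 1000, 1.0 / 100)
print(f"p-value = {p_value:.4g}")
```

A small p-value (for example below a chosen 0.05 threshold) would suggest the observed frequency P* differs from 1/n by more than chance alone readily explains.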
As others have noted, you want the Binomial distribution. Your question seems to imply an interest in a continuous approximation to it, though. It can actually be approximated by the normal distribution, and also by the Poisson distribution.
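For the Poisson case, a small sketch comparing the binomial probabilities with a Poisson distribution whose rate is lambda = M/N. The names and the example values (1000 selections, 100 elements) are assumptions made for illustration. Note that the Poisson distribution is itself discrete, so only the normal approximation is a continuous one.

```python
import math

def binomial_pmf(x, m, p):
    """P{X = x} for X ~ Binomial(m, p), computed in log space (needs 0 < p < 1)."""
    log_pmf = (math.lgamma(m + 1) - math.lgamma(x + 1) - math.lgamma(m - x + 1)
               + x * math.log(p) + (m - x) * math.log1p(-p))
    return math.exp(log_pmf)

def poisson_pmf(x, lam):
    """P{X = x} for X ~ Poisson(lam), computed in log space."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

# Illustrative values: 1000 selections from 100 elements, so lambda = M/N = 10.
M, N = 1000, 100
lam = M / N
max_diff = max(abs(binomial_pmf(x, M, 1.0 / N) - poisson_pmf(x, lam)) for x in range(41))
print("largest pointwise difference for x in 0..40:", max_diff)
```

The Poisson approximation tends to work well when N is large (each selection has a small success probability 1/N) and M/N stays moderate.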
Is your distribution a discrete uniform distribution?