How to find a confidence interval for the total number of events
I have a program which records events that occur with some probability p. After I run it I get k events recorded. How can I calculate how many events there were, recorded or not, with some confidence, say 95%?
So for example, after getting 13 events recorded I would like to be able to calculate that there were between 13 and 19 events total with 95% confidence.
Comments (4)
I'm pretty sure your process is the same as a binomial process: each event being recorded, which happens with probability p, can be considered a success. I don't think there's a need to elaborate further on the underlying process.
The twist in your problem is that you don't know the value of n, only k and p. Confidence interval calculations typically assume you know n & p and you want a confidence interval around k, the number of successes. See here.
Given k and p, you should be able to determine the probability distribution of n, q(n), then create a distribution of k given known p and q(n). This distribution of k will yield a confidence interval, right?
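One way to make the "distribution of n given k and p" idea concrete is sketched below. It is only a sketch under assumptions that are not in the thread: it puts a flat prior on n, in which case the posterior is P(n | k, p) = C(n, k) p^(k+1) (1-p)^(n-k) for n >= k (so n - k is negative binomial), and it reports an equal-tailed interval. Strictly speaking that is a Bayesian credible interval rather than a frequentist confidence interval. The function name `credible_interval` and the value p = 0.8 in the usage line are made up for illustration.

```python
def credible_interval(k, p, level=0.95):
    """Equal-tailed interval for the total count n, given k recorded events.

    Assumes each event is recorded independently with probability p and
    a flat prior on n (both are assumptions of this sketch).
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    alpha = (1 - level) / 2
    lo = hi = None
    cdf = 0.0
    n = k
    mass = p ** (k + 1)  # posterior mass at n = k
    while hi is None:
        cdf += mass
        if lo is None and cdf >= alpha:
            lo = n  # first n where the lower tail is covered
        if cdf >= 1 - alpha:
            hi = n  # first n where the upper tail is covered
        # Recurrence: P(n + 1) = P(n) * (n + 1) / (n + 1 - k) * (1 - p)
        mass *= (n + 1) / (n + 1 - k) * (1 - p)
        n += 1
    return lo, hi


# Hypothetical usage: 13 recorded events, assumed recording probability 0.8
low, high = credible_interval(13, 0.8)
print(low, high)
```

The lower bound can never fall below k, since at least k events must have happened. A different prior on n would give a different interval.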
If p is between 0 and 1:
(1/p) * k = typical number of actual events
If your random() is PERFECT, it will ALWAYS be true. However, this is not usually the case.
For a LARGE k (the larger k is, the more accurate the result in percentage terms) it will be CLOSE to the actual number, though it is doubtful that it will hit it exactly.
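The k / p estimate above can be checked with a small simulation. This is only an illustrative sketch: the true count of 1000, the probability 0.6, and the helper names are all made up for the example.

```python
import random


def estimate_total(k, p):
    """Point estimate of the total number of events: k / p."""
    return k / p


def simulate_recorded(n, p, rng):
    """Count how many of n events get recorded, each with probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)


rng = random.Random(0)   # fixed seed so the run is repeatable
n_true, p = 1000, 0.6    # assumed values, for illustration only
estimates = [
    estimate_total(simulate_recorded(n_true, p, rng), p)
    for _ in range(200)
]
mean_estimate = sum(estimates) / len(estimates)
print(min(estimates), mean_estimate, max(estimates))
```

Individual estimates scatter around the true count, and only their average settles close to it, which is why an interval is more informative than the point estimate alone.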
The problem with your statement is that you are saying there is a known probability of the event. If that is known and you know how many events you saw, there is no error in how many events there were. Do you know how many recordings there were?
I think you need to reframe the way you are asking the question or try to estimate something different.
Or are you saying your recording only happens 60% of the time when a true event happens? What is it you are measuring, and what constitutes an event? An analogy would be OK, but the way it is formulated now there is no way to construct a confidence interval on the true number of events.
Here is the answer that Andrew Walker gave on the stats site. I am going to accept this as the answer to this question. Thanks to everyone.