Re-using random numbers in reservoir sampling
This came up in relation to another recent question: "Given an unknown length list, return a random item in it by scanning it only 1 time".
I know you shouldn't re-use the same random number for every item; I just can't put my finger on a canonical explanation of why not.
Look at the example code:
import random

def rnd():  # a function that returns a new random number on each call
    return random.getrandbits(32)

class fixed:  # a functor that returns the same random number on every call
    def __init__(self):
        self._ret = rnd()
    def __call__(self):
        return self._ret

def sample(rnd, seq_size):
    choice = 0
    for item in range(1, seq_size):
        # item replaces the current choice with (intended) probability 1/(item+1)
        if (rnd() % (item + 1)) == 0:
            choice = item
    return choice

dist = [0 for i in range(500)]
for i in range(1000):
    dist[sample(rnd, len(dist))] += 1
print("real", dist)
print()

dist = [0 for i in range(500)]
for i in range(1000):
    dist[sample(fixed(), len(dist))] += 1
print("reuse", dist)
The choices made by proper reservoir sampling, which generates a new random number per item, are nicely evenly distributed, as they should be:
real [1, 3, 0, 1, 2, 3, 2, 3, 1, 2, 2, 2, 2, 0, 0, 1, 3, 3, 4, 0, 2, 1, 2, 1, 1, 4, 0, 3, 1, 1, 2, 0, 0, 0, 1, 4, 6, 2, 3, 1, 1, 3, 2, 1, 3, 3, 1, 4, 1, 1, 2, 2, 5, 1, 2, 1, 0, 3, 1, 0, 2, 6, 1, 2, 2, 1, 1, 1, 1, 3, 2, 1, 5, 4, 0, 3, 3, 4, 0, 0, 2, 1, 3, 2, 3, 0, 2, 4, 6, 3, 0, 1, 3, 0, 2, 2, 4, 3, 2, 1, 2, 1, 2, 2, 1, 4, 2, 0, 0, 1, 1, 0, 1, 4, 2, 2, 2, 1, 0, 3, 1, 2, 1, 0, 2, 2, 1, 5, 1, 5, 3, 3, 1, 0, 2, 2, 0, 3, 2, 3, 0, 1, 1, 3, 0, 1, 2, 2, 0, 1, 2, 2, 3, 2, 3, 1, 1, 0, 1, 2, 2, 2, 2, 2, 3, 2, 1, 2, 2, 2, 1, 3, 3, 1, 0, 1, 1, 0, 1, 3, 2, 1, 4, 3, 4, 1, 1, 1, 2, 1, 2, 0, 0, 0, 1, 1, 2, 6, 0, 1, 1, 0, 1, 0, 1, 2, 2, 3, 0, 1, 2, 2, 1, 0, 4, 2, 1, 2, 2, 0, 4, 4, 0, 3, 2, 2, 1, 2, 4, 1, 2, 1, 0, 2, 1, 1, 5, 1, 2, 2, 3, 2, 3, 0, 1, 2, 3, 2, 5, 2, 3, 0, 1, 1, 1, 1, 3, 4, 2, 4, 1, 2, 3, 2, 5, 2, 1, 0, 1, 1, 2, 2, 3, 1, 1, 1, 2, 1, 2, 0, 4, 1, 1, 2, 3, 4, 3, 1, 2, 3, 3, 3, 2, 1, 2, 0, 0, 4, 3, 2, 2, 5, 5, 3, 3, 3, 1, 0, 1, 3, 1, 1, 2, 4, 3, 1, 4, 4, 2, 5, 0, 5, 4, 2, 1, 0, 4, 1, 3, 3, 2, 4, 2, 3, 3, 1, 3, 3, 4, 2, 2, 1, 1, 1, 1, 3, 3, 5, 3, 2, 4, 0, 1, 3, 2, 2, 4, 2, 2, 3, 4, 5, 3, 2, 1, 2, 3, 2, 2, 2, 4, 4, 0, 1, 3, 3, 3, 4, 1, 2, 4, 0, 4, 0, 3, 2, 1, 1, 4, 2, 1, 0, 0, 0, 4, 2, 2, 1, 4, 3, 1, 1, 3, 2, 4, 3, 4, 2, 1, 1, 2, 2, 3, 3, 1, 2, 2, 1, 1, 2, 3, 1, 9, 1, 3, 4, 2, 4, 4, 0, 1, 0, 1, 0, 2, 1, 0, 1, 2, 3, 3, 6, 2, 2, 1, 2, 4, 3, 3, 3, 2, 1, 2, 1, 2, 8, 2, 3, 1, 5, 3, 0, 2, 1, 1, 4, 2, 2, 1, 2, 3, 2, 1, 0, 4, 3, 4, 3, 1, 3, 2, 3, 2, 2, 1, 0, 1, 2, 5, 3, 0, 3, 1, 2, 2, 2, 1, 0, 1, 4]
Whereas when you re-use the same random number for all items, you get a distribution skewed to the very low numbers:
reuse [92, 50, 34, 19, 23, 16, 13, 9, 9, 9, 11, 10, 6, 7, 8, 5, 5, 6, 4, 2, 2, 3, 2, 3, 3, 6, 6, 1, 4, 3, 5, 2, 2, 1, 1, 2, 3, 4, 3, 4, 1, 3, 1, 0, 0, 1, 5, 3, 1, 2, 0, 2, 0, 1, 1, 6, 2, 0, 2, 2, 4, 2, 2, 0, 2, 2, 2, 0, 3, 0, 4, 1, 2, 1, 4, 2, 2, 0, 1, 0, 1, 1, 0, 0, 0, 2, 0, 0, 2, 0, 0, 1, 0, 0, 1, 0, 2, 0, 0, 1, 2, 1, 3, 1, 0, 1, 2, 0, 4, 3, 0, 0, 2, 0, 0, 1, 0, 0, 2, 0, 2, 1, 0, 1, 0, 0, 1, 1, 3, 0, 1, 1, 0, 2, 0, 1, 2, 0, 1, 1, 4, 1, 1, 1, 2, 1, 0, 1, 2, 0, 2, 1, 1, 2, 0, 1, 1, 0, 2, 0, 2, 0, 0, 2, 0, 1, 0, 2, 1, 1, 0, 0, 1, 2, 4, 1, 0, 2, 0, 1, 2, 1, 3, 0, 1, 0, 0, 1, 0, 0, 2, 1, 0, 0, 0, 3, 2, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 4, 1, 0, 2, 1, 0, 0, 2, 1, 1, 3, 3, 2, 0, 1, 0, 2, 0, 1, 1, 0, 0, 3, 1, 0, 0, 1, 0, 3, 2, 2, 0, 0, 0, 0, 0, 2, 0, 1, 0, 2, 0, 4, 1, 0, 0, 2, 0, 1, 1, 0, 0, 3, 1, 3, 2, 2, 1, 3, 1, 2, 0, 1, 1, 3, 0, 3, 1, 2, 0, 2, 0, 2, 0, 3, 0, 3, 0, 3, 1, 0, 2, 3, 1, 1, 0, 1, 3, 3, 1, 1, 1, 0, 2, 1, 1, 4, 1, 1, 1, 2, 0, 3, 1, 1, 0, 4, 1, 1, 0, 1, 3, 1, 0, 1, 1, 0, 3, 3, 0, 2, 4, 0, 1, 2, 1, 6, 1, 0, 0, 0, 0, 1, 2, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 4, 2, 0, 1, 2, 0, 1, 4, 1, 2, 0, 5, 2, 2, 0, 6, 2, 2, 1, 3, 0, 3, 1, 1, 0, 3, 1, 4, 2, 0, 1, 0, 1, 2, 3, 1, 1, 3, 0, 0, 0, 1, 1, 4, 3, 3, 0, 0, 1, 0, 1, 1, 2, 1, 0, 2, 1, 4, 5, 1, 1, 3, 0, 1, 1, 1, 3, 1, 1, 0, 3, 3, 1, 3, 0, 1, 0, 0, 1, 1, 3, 2, 1, 0, 3, 1, 1, 3, 1, 3, 1, 2, 2, 2, 0, 0, 5, 1, 3, 0, 1, 4, 1, 1, 1, 3, 2, 1, 3, 2, 1, 3, 1, 2, 2, 3, 2, 2, 1, 0, 3, 3, 1, 3, 3, 3, 2, 1, 2, 3, 3, 3, 1, 2, 2, 2, 4, 2, 1, 5, 2, 2, 0]
What's the maths here? Why can't you re-use the same random number?
Comments (3)
Edit, in response to comment:
The way reservoir sampling should work is: you want to select exactly the right proportion of samples from each of the existing bins in order to make up an additional bin with the same probability. In your sample() loop, given that you have randomly sampled one of item bins, you need to select samples from each bin with probability 1/(item + 1).
However, with fixed(), both the selection decision and the previous bin number depend on the same fixed 32-bit number. This means that the likelihood that a sample is removed from each of the bins will not be uniform.
Consider what happens during the third iteration of the sample() loop. You have three existing bins (0, 1, and 2), and you want to pick 1/4 of the samples in each and add them to a newly created bin 3.
Note that all the 32-bit fixed() numbers in bin 1 will be even (because the first pass selected all numbers divisible by 2), and all the numbers in bin 0 will be odd (because the even ones were moved to bin 1). The second pass moves all numbers divisible by three to bin 2 (which should be OK so far, and does not change the even/odd division in bins 0 and 1).
The third pass then moves all fixed() numbers divisible by 4 into bin 3. But this will select half the numbers from bin 1 (because half of all even numbers are divisible by 4), and none of the numbers from bin 0 (because they are all odd). So, even though the expected size of the new bin should be correct, the expected sizes of the old bins are no longer the same.
That is how fixed() generates an uneven distribution: the implicit assumption, that you can select an exact fraction of each bin by choosing a random number, is violated if that number depends in a predictable way on the numbers used to choose the original bin.
The basic property of random numbers is that each sample must be independently distributed from the preceding samples, in a statistical sense. Algorithms based on random numbers depend on this property.
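The bin argument can be checked by brute force. The sketch below is an illustration rather than part of the original code: bin_after and moved_fractions are made-up names, and it enumerates the fixed values 0..11999 (a multiple of 12) as a stand-in for the 32-bit numbers. It replays the first two passes of sample() for each value, then reports what fraction of each bin the third pass (divisible by 4) would move to bin 3.

```python
from collections import Counter
from fractions import Fraction


def bin_after(r, passes):
    """Bin (chosen index) holding fixed value r after `passes` iterations of sample()."""
    choice = 0
    for item in range(1, passes + 1):
        if r % (item + 1) == 0:
            choice = item
    return choice


def moved_fractions(limit=12000):
    """Fraction of each bin that the third pass moves to bin 3.

    Assumption: the values 0..limit-1 stand in for the 32-bit fixed() numbers.
    """
    before = Counter()
    moved = Counter()
    for r in range(limit):
        b = bin_after(r, 2)       # bin after the passes for divisors 2 and 3
        before[b] += 1
        if r % 4 == 0:            # third pass: item = 3, divisor 4
            moved[b] += 1
    return {b: Fraction(moved[b], before[b]) for b in sorted(before)}


print(moved_fractions())
```

Under these assumptions the third pass takes nothing from bin 0, half of bin 1 and a quarter of bin 2, instead of a quarter of each.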
Pseudo-random number generators (PRNGs) are not actually random; as you know, their results are actually a fixed sequence. The PRNG results are deliberately scrambled so that they act enough like actual random numbers for most purposes. However, if the PRNG is "weak" for a particular application, the inner workings of the PRNG can interact with the details of the application in odd ways, producing very non-random results.
What you're trying out here, by re-using the same random number, is building a bad PRNG. The actual results depend on the details of how the application uses the random numbers...
Even though fixed() is an intentionally broken PRNG, many commercial library PRNGs are "weak", and can end up with similar weird interactions with some applications. As a practical matter, "weakness" is relative to the application, and while there are statistical tests that are widely used to try to expose weak PRNGs, there is no guarantee that your application won't stumble on some odd correlation in even a "strong" PRNG.
If you pick a new random number each time, the next item from the stream has a 1/CURRENTSIZE chance of beating the previously picked item.
So what is the problem with one random number per stream? Why does it skew the distribution?
I haven't found a complete answer yet, but I have an idea.
For example, let's take a stream of 100 items and pick a random number N in 0...999. Now we look at it from the viewpoint of the second item.
When does it win? Well, first of all, it needs N%2==0, so N has to be an even number. It is also beaten whenever a larger multiple of 2 in the stream (4, 6, 8, 10, etc.) divides N. But it wins, for example, on 106.
Counting all the numbers in 0..999 that it wins with, we get 81!
Now let's take 4: it needs N%4==0, and it is beaten whenever a larger multiple of 4 in the stream (8, 12, 16, ...) divides N. If we count how many times 4 can win, we get 45! So the distribution isn't fair. (Strictly, odd positions that divide N beat an earlier item too, for instance 53 for N = 106, so the exact counts are lower, but the imbalance remains.)
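The counting can be reproduced by enumeration. The following is a minimal sketch, assuming 1-based positions and the same test as sample() in the question; winner and win_counts are names made up for illustration. It counts every later position as a competitor, odd ones included, so its numbers need not match the figures quoted above.

```python
from collections import Counter


def winner(n, stream_size):
    """1-based position left selected when every draw returns the same n."""
    choice = 1
    for position in range(2, stream_size + 1):
        if n % position == 0:     # same test as sample(), with position = item + 1
            choice = position
    return choice


def win_counts(stream_size=100, n_values=range(1000)):
    """How many values of n make each position the final choice."""
    return Counter(winner(n, stream_size) for n in n_values)


counts = win_counts()
print(counts[2], counts[4])       # wins for positions 2 and 4
```

Whichever counting rule is used, position 2 comes out well ahead of position 4, which is the skew seen in the "reuse" histogram.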
This can be fixed if you cap the random number at the size of the stream; then every item has exactly one chance to win, making it an even distribution again.
For example, suppose we have a stream size of 100 and we pick a random number in 0..199. We know the first 100 random numbers all have exactly 1 match, so they are distributed evenly. But what happens with random numbers 100...199? The distribution isn't even! For example, 101 will only give 101%X==0 for X = 1. This is true for all the prime numbers (101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199). So item one has a much larger chance of winning than the others.
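Both halves of that example can be checked with a short sketch, under the same model (every draw returns the same n, and the last position that divides n wins). The helper name last_divisor is an assumption made for illustration.

```python
from collections import Counter


def last_divisor(n, stream_size):
    """Largest 1-based position in 1..stream_size that divides n."""
    return max(d for d in range(1, stream_size + 1) if n % d == 0)


size = 100
# n drawn from 0..99: one value per position
capped = Counter(last_divisor(n, size) for n in range(size))
# n drawn from 100..199: the extra half of the 0..199 range
upper = Counter(last_divisor(n, size) for n in range(size, 2 * size))

print(len(capped), set(capped.values()))   # positions reached, and how often each
print(upper[1], upper[2])                  # extra wins for position 1 versus position 2
```

In the capped range every position is reached equally often, while the upper half hands position 1 one extra win per prime and gives position 2 none. Note that capping the range this way requires knowing the stream length up front.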
This isn't the case if you pick a new random number for each item: the draws are then independent, so the chances multiply. For example, in a stream of n items, item one is still the winner at the end only if every later item fails to replace it, which has probability:
(1 - 1/2) * (1 - 1/3) * (1 - 1/4) * ... * (1 - 1/n) = 1/n
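With an independent draw per item, position k is picked with probability 1/k and then has to survive each later position j, which replaces it with probability 1/j. A minimal sketch with exact fractions (win_probability is a made-up name) checks that these factors multiply out to 1/n for every position:

```python
from fractions import Fraction


def win_probability(k, n):
    """Exact chance that 1-based position k is the final pick among n items."""
    p = Fraction(1, k)                 # position k replaces the current pick
    for j in range(k + 1, n + 1):
        p *= 1 - Fraction(1, j)        # position j fails to replace it
    return p


n = 100
print(all(win_probability(k, n) == Fraction(1, n) for k in range(1, n + 1)))
```

The product telescopes: (1/k) * (k/(k+1)) * ((k+1)/(k+2)) * ... * ((n-1)/n) = 1/n. That cancellation relies on the draws being independent, which is exactly what re-using one number breaks.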
Think about this: What happens when your fixed number is really small?
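One way to explore that hint is the sketch below, where sample_fixed is a hypothetical variant of sample() from the question that takes the fixed value directly instead of a callable:

```python
def sample_fixed(r, seq_size):
    """sample() from the question with every draw equal to r."""
    choice = 0
    for item in range(1, seq_size):
        if r % (item + 1) == 0:
            choice = item
    return choice


for r in (1, 6, 7, 12):
    print(r, sample_fixed(r, 500))
```

A nonzero r no larger than the stream length always selects index r - 1, because r is its own largest divisor, so a small fixed number can only ever land on one of the first few items.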