How to avoid roundoff errors in numpy.random.choice?
Say x_1, x_2, ..., x_n are n objects and one wants to pick one of them so that the probability of choosing x_i is proportional to some number u_i. Numpy provides a function for that:
x, u = np.array([x_1, x_2, ..., x_n]), np.array([u_1, ..., u_n])
np.random.choice(x, p = u/np.sum(u))
However, I have observed that this code sometimes throws a ValueError saying "probabilities do not sum to 1.". This is probably due to the round-off errors of finite precision arithmetic. What should one do to make this function work properly?
Comments (2)
After reading the answer https://stackoverflow.com/a/60386427/6087087 to the question pointed out by @Pychopath, I have found the following solution, inspired by the documentation of numpy.random.multinomial https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.multinomial.html
Say p is the array of probabilities, whose sum may not be exactly 1 due to roundoff errors, even if we normalized it with p = p/np.sum(p). This is not rare, see the comment by @pd shah at the answer https://stackoverflow.com/a/46539921/6087087. Just recompute one entry of p by subtraction, so that the entries sum to 1, and the problem is solved! The roundoff errors due to subtraction will be much smaller than the roundoff errors due to normalization. Moreover, one need not worry about the changes in p; they are of the order of the roundoff errors.
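A minimal sketch of this fix. It assumes that the last entry is the one recomputed, going by the reference to numpy.random.multinomial, whose documentation says the last element of pvals is taken to account for the remaining probability. The arrays x and u hold illustrative values, and the clamp to zero is an extra precaution that is not part of the answer above.

```python
import numpy as np

# Illustrative data (assumptions, not from the question).
x = np.array(["a", "b", "c", "d"])   # objects x_1, ..., x_n
u = np.array([0.1, 0.7, 1.3, 2.9])   # weights u_1, ..., u_n

# Normalize in double precision first.
p = np.asarray(u, dtype=np.float64)
p = p / np.sum(p)

# Recompute the last entry by subtraction so that the entries sum to 1.
# The clamp guards against a tiny negative value when the last weight is 0.
p[-1] = max(0.0, 1.0 - np.sum(p[:-1]))

sample = np.random.choice(x, p=p)
print(sample)
```

If the ValueError persists after this, it may be worth checking u for NaN, infinite, or negative entries, since no renormalization can repair those.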
According to the NumPy documentation, p must be a 1-D array-like. So I think that if the u array is already an array of probabilities, then you can try:
or
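As an illustration of the documented requirement only, and not necessarily the calls this answer originally showed, here is a small sketch that passes an existing probability vector as p. The values of x and u are assumptions chosen for the example.

```python
import numpy as np

# Illustrative data (assumptions): u already holds probabilities.
x = np.array([10, 20, 30])
u = [0.2, 0.5, 0.3]   # a plain list also counts as array-like

# p must be 1-D and have one entry per element of x.
p = np.asarray(u, dtype=np.float64)
assert p.ndim == 1 and p.shape[0] == x.shape[0]

one = np.random.choice(x, p=p)            # a single draw
many = np.random.choice(x, size=5, p=p)   # five independent draws
print(one, many)
```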