How should I handle floating point numbers that can get so small they become zero?
So I just fixed an interesting bug in the following code, but I'm not sure the approach I took is the best:
import math

p = 1
probabilities = [ ... ]  # a (possibly) long list of numbers between 0 and 1
for wp in probabilities:
    if wp > 0:
        p *= wp
# Take the natural log; this crashes when 'probabilities' is long enough that p ends up
# being zero (math.log(0) raises ValueError)
try:
    result = math.log(p)
except ValueError:
    ...  # handler not shown in the original post
Because the result doesn't need to be exact, I solved this by simply keeping the smallest non-zero value, and using that if p ever becomes 0.
import math

p = 1
probabilities = [ ... ]  # a long list of numbers between 0 and 1
for wp in probabilities:
    if wp > 0:
        old_p = p
        p *= wp
        if p == 0:
            # we've gotten so small, it's just 0, so go back to the smallest
            # non-zero we had
            p = old_p
            break
# Take the natural log; this crashes when 'probabilities' is long enough that p ends up
# being zero (math.log(0) raises ValueError)
try:
    result = math.log(p)
except ValueError:
    ...  # handler not shown in the original post
This works, but it seems a bit kludgy to me. I don't do a ton of this kind of numerical programming, and I'm not sure if this is the kind of fix people use, or if there is something better I can go for.
Comments (1)
Since math.log(a * b) is equal to math.log(a) + math.log(b), why not take a sum of the logs of all members of the probabilities array? This will avoid the problem of p getting so small that it underflows.

Edit: this is the numpy version, which is cleaner and a lot faster for large data sets:
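The numpy snippet that the edit refers to does not appear above. What follows is only a rough sketch of how the sum-of-logs idea might look, first in plain Python and then with numpy. The function names log_product and log_product_np and the sample data are made up for illustration, zeros are skipped the same way the loop in the question skips them, and the sketch assumes numpy is installed and Python 3.8 or later (for math.prod).

```python
import math

import numpy as np


def log_product(probabilities):
    """Return log(product of the positive entries), computed as a sum of logs."""
    # math.fsum keeps the running sum accurate over many terms.
    return math.fsum(math.log(wp) for wp in probabilities if wp > 0)


def log_product_np(probabilities):
    """Vectorised variant of log_product using numpy (hypothetical helper)."""
    arr = np.asarray(probabilities, dtype=float)
    # Boolean mask drops zeros, mirroring the `if wp > 0` check in the question.
    return float(np.log(arr[arr > 0]).sum())


# Hypothetical data: the direct product of 1000 copies of 0.1 underflows to 0.0,
# while the sum of logs stays an ordinary finite float.
probabilities = [0.1] * 1000
print(math.prod(probabilities))        # direct product: underflows
print(log_product(probabilities))      # sum of logs, plain Python
print(log_product_np(probabilities))   # sum of logs, numpy
```

Because the product is never formed, the result no longer depends on where the underflow happened to occur, which is the main difference from keeping the last non-zero value of p.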