
Finite representation of numbers

Published 2025-02-25 23:43:40

For integers, every language has a maximum and a minimum representable number. Python integers are actually objects, so they automatically switch to arbitrary precision numbers when you go beyond these limits, but this is not true for most other languages, including C and R. With a 64-bit representation, the maximum is 2^63 - 1 and the minimum is -2^63.

import sys
sys.maxint
9223372036854775807
2**63-1 == sys.maxint
True
# Python handles "overflow" of integers gracefully by
# switching from integers to "long" arbitrary precision numbers
sys.maxint + 1
9223372036854775808L
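The same limits can be checked without sys.maxint, which exists only in Python 2. The following is a minimal sketch using only the standard library; the helper name fits_in_int64 is made up for illustration. It asks the struct module to pack a value into a signed 64-bit field, which fails once the value leaves the representable range.

```python
import struct

INT64_MAX = 2**63 - 1
INT64_MIN = -2**63

def fits_in_int64(n):
    """Return True if n can be stored in a signed 64-bit integer field."""
    try:
        struct.pack('<q', n)  # '<q' is a little-endian signed 64-bit integer
        return True
    except (struct.error, OverflowError):
        return False

print(fits_in_int64(INT64_MAX))      # largest value that fits
print(fits_in_int64(INT64_MAX + 1))  # one past the maximum
print(fits_in_int64(INT64_MIN))      # smallest value that fits
print(fits_in_int64(INT64_MIN - 1))  # one past the minimum
```

The Python integer itself is never the problem here; the limit only appears when the value has to fit into a fixed-width field, as it would in C or R.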

Integer division

This has been illustrated more than once because it is such a common source of bugs. Be very careful when dividing one integer by another. Here are some common workarounds.

# Explicit float conversion

print float(1)/3
0.333333333333
# Implicit float conversion

print (0.0 + 1)/3
print (1.0 * 1)/3
0.333333333333
0.333333333333
# Telling Python to ALWAYS do floating point division with '/'
# Integer division can still be done with '//'
# The __future__ module enables language features that become
# the default only in later Python releases.

from __future__ import division

print (1/3)
print (1//3)
0.333333333333
0
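One detail is worth knowing when relying on '//': it rounds toward negative infinity, whereas integer division in C truncates toward zero. The sketch below uses a hypothetical helper, true_divide, that gives a floating point quotient on both Python 2 and Python 3, and contrasts the two rounding behaviors for a negative operand.

```python
def true_divide(numerator, denominator):
    """Divide as floats regardless of Python version or operand types."""
    return float(numerator) / denominator

quotient = true_divide(7, 2)          # floating point quotient
floored = -7 // 2                     # floor division rounds down
truncated = int(true_divide(-7, 2))   # int() truncates toward zero, like C
pair = divmod(7, 2)                   # (7 // 2, 7 % 2) in one call

print(quotient)
print(floored)
print(truncated)
print(pair)
```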

Documentation about the __future__ package

Overflow in languages such as C “wraps around” and gives negative numbers

This will not work out of the box because the VM is missing some packages. If you really want to run it, you can issue the following commands from the command line; have your sudo password ready. It is not necessary to run this. It is just an example to show integer overflow in C, which does not happen in Python. Strictly speaking, signed integer overflow is undefined behavior in the C standard; the wrap-around shown here is what typical two's complement machines produce.

sudo apt-get update
sudo apt-get install build-essential
sudo apt-get install llvm
pip install git+https://github.com/dabeaz/bitey.git
%%file demo.c

#include <limits.h>

long limit() {
    return LONG_MAX;
}

long overflow() {
    long x = LONG_MAX;
    return x+1;
}
Writing demo.c
! clang -emit-llvm -c demo.c -o demo.o
import bitey
import demo

demo.limit(), demo.overflow()
(9223372036854775807, -9223372036854775808)
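If installing a compiler is not an option, the wrap-around can be imitated from Python with the standard ctypes module, which stores values in fixed-width C types without overflow checking. This is only a sketch of the behavior, not a replacement for the C demo; wrap_int64 is an illustrative name.

```python
import ctypes

INT64_MAX = 2**63 - 1

def wrap_int64(n):
    """Reduce a Python integer to what a signed 64-bit C variable would hold."""
    return ctypes.c_int64(n).value

at_limit = wrap_int64(INT64_MAX)
past_limit = wrap_int64(INT64_MAX + 1)

print(at_limit)
print(past_limit)
```

The second value is the most negative 64-bit integer, matching the pair returned by demo.limit() and demo.overflow().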

Floating point numbers

A floating point number is stored in 3 pieces (sign bit, exponent, mantissa) so that every float is represented as +/- mantissa × 2^exponent. Because of this, the interval between consecutive numbers is smallest (high precision) for numbers close to 0 and largest for numbers close to the lower and upper bounds.

Because exponents have to be signed to represent both small and large numbers, but it is more convenient to store unsigned numbers, the exponent has an offset (also known as the exponent bias). For example, if the exponent is an unsigned 8-bit number, it can represent the range (0, 255). By subtracting an offset of 127, as IEEE 754 single precision does, it represents the range (-127, 128).

from IPython.display import Image

Binary representation of a floating point number

Image(url='../Images/ca76e1b9890dbf60404525dc1a556ac0.jpg')
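The three pieces can be pulled out of an actual number with the standard struct module. The sketch below assumes the IEEE 754 single precision layout (1 sign bit, 8 exponent bits stored with a bias of 127, 23 fraction bits) and handles only normal numbers; the function names are illustrative.

```python
import struct

def decode_float32(x):
    """Split x, rounded to single precision, into its three bit fields."""
    (bits,) = struct.unpack('>I', struct.pack('>f', x))
    sign = bits >> 31                      # 1 bit
    biased_exponent = (bits >> 23) & 0xFF  # 8 bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF             # 23 bits after the implicit leading 1
    return sign, biased_exponent, fraction

def rebuild_normal(sign, biased_exponent, fraction):
    """Recompute the value of a normal number (not zero, subnormal, inf or nan)."""
    mantissa = 1.0 + fraction / float(2**23)
    return (-1.0) ** sign * mantissa * 2.0 ** (biased_exponent - 127)

fields = decode_float32(-6.5)   # -6.5 = -1.625 * 2**2
rebuilt = rebuild_normal(*fields)

print(fields)
print(rebuilt)
```

The stored exponent field for -6.5 is 129, which becomes the true exponent 2 once the bias is subtracted.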

Intervals between consecutive floating point numbers are not constant

Because of this, if you are adding many numbers, it is more accurate to first add the small numbers before the large numbers.

Image(url='http:///fig1.jpg')
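The effect of summation order can be seen with one large value and many small ones. Near 1e16 the gap between consecutive doubles is 2, so adding 1.0 to it has no effect, while a thousand 1.0s added together first are large enough to register. This is a small sketch; it uses an explicit loop rather than the built-in sum so that each step is plain floating point addition.

```python
def running_sum(values):
    """Add values left to right with ordinary floating point addition."""
    total = 0.0
    for v in values:
        total += v
    return total

small = [1.0] * 1000
large = [1e16]

large_first = running_sum(large + small)  # each 1.0 is lost against 1e16
small_first = running_sum(small + large)  # the 1.0s accumulate to 1000.0 first

print(large_first)
print(small_first)
print(small_first - large_first)
```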

Floating point numbers on your system

Information about the floating point representation on your system can be obtained from sys.float_info. Definitions of the stored values are given at https://docs.python.org/2/library/sys.html#sys.float_info

import sys

print sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)
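A few of these fields can be checked directly. The following sketch assumes the usual IEEE 754 double precision with round-to-nearest, which is what the values printed above describe: epsilon is the gap between 1.0 and the next float, max is the largest finite value, and min is the smallest normal value.

```python
import sys

eps = sys.float_info.epsilon

eps_is_visible = (1.0 + eps) > 1.0        # eps is the gap just above 1.0
half_eps_is_lost = (1.0 + eps / 2) == 1.0 # anything smaller rounds away
past_max = sys.float_info.max * 2         # overflows to infinity
below_min = sys.float_info.min / 2        # subnormal: tiny, but not yet zero

print(eps_is_visible)
print(half_eps_is_lost)
print(past_max)
print(below_min > 0.0)
```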

Floating point numbers may not be precise

'%.20f' % (0.1 * 0.1 * 100)
'1.00000000000000022204'
# Because of this, don't check for equality of floating point numbers!

# Bad
s = 0.0

for i in range(1000):
    s += 1.0/10.0
    if s == 1.0:
        break
print i

# OK

TOL = 1e-9
s = 0.0

for i in range(1000):
    s += 1.0/10.0
    if abs(s - 1.0) < TOL:
        break
print i
999
9
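A fixed absolute tolerance such as TOL works when the numbers are near 1, but it is too strict for very large values and too loose for very small ones. One common alternative is a relative tolerance. The helper below is an illustrative sketch (approx_equal is not a library function); Python 3.5 and later provide math.isclose, which uses the same style of test.

```python
def approx_equal(a, b, rel_tol=1e-9, abs_tol=0.0):
    """Compare two floats using a relative and an optional absolute tolerance."""
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

total = 0.1 + 0.2

print(total == 0.3)                          # exact comparison
print(approx_equal(total, 0.3))              # relative comparison
print(approx_equal(1e-12, 0.0))              # relative tolerance fails near zero
print(approx_equal(1e-12, 0.0, abs_tol=1e-9))  # absolute tolerance covers it
```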
# Loss of precision
1 + 6.022e23 - 6.022e23
0.0000

Lesson: Avoid algorithms that subtract two numbers that are very close to one another. The loss of significance is greater when both numbers are very large due to the limited number of precision bits available.
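Often the subtraction can be removed by rearranging the formula. As an illustration chosen for this note (it is not from the original text), sqrt(x + 1) - sqrt(x) subtracts two nearly equal numbers when x is large, but multiplying by the conjugate gives the algebraically identical 1 / (sqrt(x + 1) + sqrt(x)), which involves no subtraction.

```python
import math

def diff_sqrt_naive(x):
    """sqrt(x + 1) - sqrt(x), computed directly."""
    return math.sqrt(x + 1.0) - math.sqrt(x)

def diff_sqrt_stable(x):
    """The same quantity, rewritten so that no close numbers are subtracted."""
    return 1.0 / (math.sqrt(x + 1.0) + math.sqrt(x))

x = 1e16
naive = diff_sqrt_naive(x)    # x + 1.0 rounds back to x, so all digits cancel
stable = diff_sqrt_stable(x)  # close to the true value of about 5e-9

print(naive)
print(stable)
```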

Associative law does not necessarily hold

6.022e23 - 6.022e23 + 1
1.0000
1 + 6.022e23 - 6.022e23
0.0000
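When the order of operations cannot be controlled, the standard library function math.fsum adds a sequence while tracking the rounding error, so the result does not depend on how the terms are grouped. A short sketch using the same three numbers:

```python
import math

terms = [1.0, 6.022e23, -6.022e23]

left_to_right = (terms[0] + terms[1]) + terms[2]  # the 1.0 is absorbed and lost
regrouped = terms[0] + (terms[1] + terms[2])      # the large terms cancel first
compensated = math.fsum(terms)                    # tracks the lost low-order bits

print(left_to_right)
print(regrouped)
print(compensated)
```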

Distributive law does not hold

import numpy as np

a = np.exp(1)
b = np.pi
c = np.sin(1)
a*(b+c) == a*b+a*c
False
# loss of precision can be a problem when calculating likelihoods
probs = np.random.random(1000)
np.prod(probs)
0.0000
# when multiplying lots of small numbers, work in log space
np.sum(np.log(probs))
-980.0558

Lesson: Work in log space for very small or very big numbers to reduce underflow/overflow
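Working in log space raises one practical question: how to add probabilities when only their logarithms are available. A common approach is the log-sum-exp trick, which factors out the largest term before exponentiating. The sketch below uses only the standard library; log_sum_exp and the sample log likelihoods are made up for illustration.

```python
import math

def log_sum_exp(log_values):
    """Return log(sum(exp(v))) without exponentiating very negative numbers."""
    m = max(log_values)
    return m + math.log(sum(math.exp(v - m) for v in log_values))

log_likelihoods = [-1000.0, -1001.0, -1002.0]

naive_total = sum(math.exp(v) for v in log_likelihoods)  # every term underflows
log_total = log_sum_exp(log_likelihoods)                 # stays finite

# Normalized weights, computed entirely from the logs
weights = [math.exp(v - log_total) for v in log_likelihoods]

print(naive_total)
print(log_total)
print(weights)
```

Subtracting the maximum guarantees that the largest exponent is exp(0) = 1, so at least one term survives and the logarithm is always defined.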
