- Introduction to Python
- Getting started with Python and the IPython notebook
- Functions are first class objects
- Data science is OSEMN
- Working with text
- Preprocessing text data
- Working with structured data
- Using SQLite3
- Using HDF5
- Using numpy
- Using Pandas
- Computational problems in statistics
- Computer numbers and mathematics
- Algorithmic complexity
- Linear Algebra and Linear Systems
- Linear Algebra and Matrix Decompositions
- Change of Basis
- Optimization and Non-linear Methods
- Practical Optimization Routines
- Finding roots
- Optimization Primer
- Using scipy.optimize
- Gradient descent
- Newton’s method and variants
- Constrained optimization
- Curve fitting
- Finding parameters for ODE models
- Optimization of graph node placement
- Optimization of standard statistical models
- Fitting ODEs with the Levenberg–Marquardt algorithm
- 1D example
- 2D example
- Algorithms for Optimization and Root Finding for Multivariate Problems
- Expectation Maximization (EM) Algorithm
- Monte Carlo Methods
- Resampling methods
- Resampling
- Simulations
- Setting the random seed
- Sampling with and without replacement
- Calculation of Cook’s distance
- Permutation resampling
- Design of simulation experiments
- Example: Simulations to estimate power
- Check with R
- Estimating the CDF
- Estimating the PDF
- Kernel density estimation
- Multivariate kernel density estimation
- Markov Chain Monte Carlo (MCMC)
- Using PyMC2
- Using PyMC3
- Using PyStan
- C Crash Course
- Code Optimization
- Using C code in Python
- Using functions from various compiled languages in Python
- Julia and Python
- Converting Python Code to C for speed
- Optimization bake-off
- Writing Parallel Code
- Massively parallel programming with GPUs
- Writing CUDA in C
- Distributed computing for Big Data
- Hadoop MapReduce on AWS EMR with mrjob
- Spark on a local machine using 4 nodes
- Modules and Packaging
- Tour of the Jupyter (IPython3) notebook
- Polyglot programming
- What you should know and learn more about
- Wrapping R libraries with Rpy
Measuring algorithmic complexity
However, profiling doesn’t tell us much about how the algorithm will perform on a different computer, since performance is partly determined by the available hardware. To compare performance in a device-independent fashion, we use what is known as Big O notation (you may or may not have encountered this before in your calculus courses). The Big O formalism characterizes functions in terms of their rates of growth.
Suppose you wanted to search for an item in an unsorted list of length \(n\). One way to do this is to scan from the first position sequentially until you find it (or not). If the item is in the list, you will need to scan \(n/2\) items on average to find it; if it is not in the list, you will need to scan all \(n\) items. In either case, the cost of the search grows linearly with the length of the list \(n\). We say that the algorithmic complexity of the search using a linear scan is \(\mathcal{O}(n)\).
Strictly, we should say the average complexity is \(\mathcal{O}(n)\). We can also calculate the worst-case performance (when the item is not in the list), which for this searching example is in the same class \(\mathcal{O}(n)\) as the average complexity. Since worst-case performance may require a perverse organization of the input (e.g. asking a sort function to sort an already sorted list), randomization of the inputs will sometimes suffice to convert it to the average case.
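To make the scan concrete, here is a minimal sketch of a linear search (the function linear_search is our own illustration, not a library routine):
# Scan the list from the front, counting each equality check as one
# elementary operation
def linear_search(xs, target):
    for i, x in enumerate(xs):
        if x == target:
            return i   # found: i + 1 comparisons performed so far
    return -1          # not found: all n comparisons performed

linear_search([3, 1, 4, 1, 5], 4)  # returns 2 after 3 comparisons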
A little more formally, we have a comparison function \(g(n)\) and another function \(f(n)\) that returns the number of “elementary operations” we need to perform in our algorithm given an input of size \(n\). In the search example, the elementary operation is the comparison of two items. In statistical algorithms, it is most commonly a floating point operation (FLOP), such as the addition or multiplication of two floats. Now if the ratio \(|f(n)/g(n)|\) can be bounded by a finite number \(M\) as \(n\) grows to infinity, we say that \(f(n)\) has complexity of order \(g(n)\). For example, if \(f(n) = 10n^2\) and \(g(n) = n\), then there is no such number \(M\) and \(f(n)\) is not \(\mathcal{O}(n)\); but if \(g(n) = n^2\), then \(M = 10\) will do, and we say that \(f(n)\) is \(\mathcal{O}(n^2)\). So our search function is \(\mathcal{O}(n)\). Formally, it is also \(\mathcal{O}(n^2)\) and so on, but we always choose the “smallest” function \(g\). We also drop all terms other than the largest, so we don’t say \(\mathcal{O}(n^3 + n^2 + n)\) but simply \(\mathcal{O}(n^3)\).
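As a quick numerical check of this definition (our own illustration), we can watch the ratio \(|f(n)/g(n)|\) for \(f(n) = 10n^2\): against \(g(n) = n\) it grows without bound, while against \(g(n) = n^2\) it stays at the constant \(M = 10\):
def f(n):
    return 10 * n**2

# The ratio against g(n) = n grows without bound, so f is not O(n);
# against g(n) = n^2 it is constant at 10, so f is O(n^2)
for n in [10, 100, 1000, 10000]:
    print(n, f(n) / n, f(n) / n**2)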
Note that since the constant is not used in Big O notation, two algorithms can have the same Big O complexity and yet have very different performance! However, the O notation is very helpful for understanding the scalability of an algorithm. Below we show a comparison of an \(\mathcal{O}(n^2)\) algorithm (e.g. bubble sort) with an \(\mathcal{O}(n \log{n})\) algorithm (e.g. merge sort). Regardless of the difference in constant factors, there is no competition as \(n\) gets large.
Question: What is the algorithmic complexity of textbook matrix multiplication? Why?
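As a hint, here is a sketch of the textbook triple-loop algorithm (our own illustration, written for clarity rather than speed); counting how many times the innermost multiply-add executes for two \(n \times n\) matrices answers the question:
import numpy as np

def textbook_matmul(A, B):
    # A is n x p, B is p x m
    n, p = A.shape
    m = B.shape[1]
    C = np.zeros((n, m))
    for i in range(n):             # n iterations
        for j in range(m):         # m iterations
            for k in range(p):     # p iterations
                C[i, j] += A[i, k] * B[k, j]  # 1 multiply + 1 add per step
    return C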
Comparing complexity of \(\mathcal{O}(n^2)\) (e.g. bubble sort) and \(\mathcal{O} (n \log n)\) (e.g. merge sort).
import numpy as np
import matplotlib.pyplot as plt

def f1(n, k):
    return k*n*n

def f2(n, k):
    return k*n*np.log(n)

n = np.arange(1, 20001)  # start at 1 to avoid log(0)

plt.plot(n, f1(n, 1), c='blue')
plt.plot(n, f2(n, 1000), c='red')
plt.xlabel('Size of input (n)', fontsize=16)
plt.ylabel('Number of operations', fontsize=16)
plt.legend([r'$\mathcal{O}(n^2)$', r'$\mathcal{O}(n \log n)$'], loc='best', fontsize=20);
Ranking of common Big O complexity classes, from cheapest to most expensive (a small numerical illustration follows the list)
- constant = \(\mathcal{O}(1)\)
- logarithmic = \(\mathcal{O}(\log n)\)
- linear = \(\mathcal{O}(n)\)
- n log n = \(\mathcal{O}(n \log n)\)
- quadratic = \(\mathcal{O}(n^2)\)
- cubic = \(\mathcal{O}(n^3)\)
- polynomial = \(\mathcal{O}(n^k)\)
- exponential = \(\mathcal{O}(k^n)\)
- factorial = \(\mathcal{O}(n!)\)
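To get a feel for how quickly these classes separate, here is a small sketch (our own illustration) tabulating several of the classes at a few input sizes:
import math

# Number of operations implied by each complexity class for a given n
print('n, log n, n log n, n^2, n^3, 2^n, n!')
for n in [1, 5, 10, 20]:
    print(n, round(math.log(n), 1), round(n * math.log(n), 1),
          n**2, n**3, 2**n, math.factorial(n))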
from IPython.display import Image
Image(url='http://bigocheatsheet.com/img/big-o-complexity.png')
Complexity of common operations on Python data structures
See the Python wiki’s TimeComplexity page for the complexity of operations on the standard Python data structures. Note, for instance, that searching a list is much more expensive than searching a dictionary, since a dictionary is backed by a hash table and membership tests cost \(\mathcal{O}(1)\) on average.
# Searching a list is O(n)
alist = list(range(1000000))
r = np.random.randint(100000)
%timeit -n3 r in alist
3 loops, best of 3: 1.28 ms per loop
# Searching a dictionary is O(1) on average
adict = dict.fromkeys(alist)
%timeit -n3 r in adict
3 loops, best of 3: 318 ns per loop
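We can also check the list’s linear scaling empirically (a sketch using the standard timeit module; absolute timings will differ across machines):
import timeit

# Worst case: the item is absent, so the whole list must be scanned;
# the time should grow roughly tenfold with each tenfold increase in size
for size in [10**4, 10**5, 10**6]:
    xs = list(range(size))
    print(size, timeit.timeit(lambda: -1 in xs, number=10))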