Amortized analysis
Often we are not so interested in the time complexity of individual operations, but rather in the average running time of a sequence of operations. This is called amortized analysis. It differs from average case analysis, which we will discuss shortly, in that it makes no assumptions regarding the distribution of input values. It does, however, take into account the state changes of data structures. For example, once a list is sorted, subsequent find operations on it should be quicker. Amortized analysis can take the state change of data structures into account because it analyzes sequences of operations, rather than simply aggregating single operations.
Amortized analysis finds an upper bound on runtime by imposing an artificial cost on each operation in a sequence of operations, and then summing these costs. The artificial cost assigned to a sequence takes into account that an initial expensive operation can make subsequent operations cheaper.
When we have a small number of expensive operations, such as sorting, and lots of cheaper operations, such as lookups, standard worst case analysis can lead to overly pessimistic results, since it assumes that each lookup must compare every element in the list until a match is found. We should take into account that once we sort the list we can make subsequent find operations cheaper.
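To make this concrete, here is a minimal sketch, not taken from the text, that charges the one-off cost of sorting against many subsequent find operations; the helper names finds_unsorted and finds_sorted are purely illustrative:

import bisect
import random

def finds_unsorted(data, queries):
    # Every lookup may scan the whole list, so k lookups on n items cost O(k * n).
    return [x in data for x in queries]

def finds_sorted(data, queries):
    # Pay for one O(n log n) sort up front; each later lookup is an O(log n) binary search.
    data = sorted(data)
    results = []
    for x in queries:
        i = bisect.bisect_left(data, x)
        results.append(i < len(data) and data[i] == x)
    return results

data = [random.randint(0, 10**6) for _ in range(10000)]
queries = [random.randint(0, 10**6) for _ in range(10000)]
finds_sorted(data, queries)

Amortized over all the lookups, the cost of the single sort is spread so thinly that each find costs roughly O(log n), rather than the O(n) per lookup that worst case analysis of an unsorted list would suggest.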
So far in our runtime analysis we have assumed that the input data was completely random and have only looked at the effect the size of the input has on the runtime. There are two other common approaches to algorithm analysis; they are:
- Average case analysis
- Benchmarking
Average case analysis finds the average running time based on some assumptions regarding the relative frequencies of various input values. Using real-world data, or data that replicates the distribution of real-world data, the algorithm is run many times on a particular data distribution and the average running time is calculated.
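As a rough illustration, and assuming we are happy to measure empirically rather than derive the average analytically, the following sketch times a function over many inputs drawn from the same distribution and averages the results; average_case_time and make_input are names invented here:

import random
import timeit

def average_case_time(func, make_input, trials=100):
    # Run func on many inputs drawn from the same distribution and average the runtimes.
    total = 0.0
    for _ in range(trials):
        data = make_input()
        total += timeit.timeit(lambda: func(data), number=1)
    return total / trials

# For example, the average time to sort 1,000 uniformly random values:
average_case_time(sorted, lambda: [random.random() for _ in range(1000)])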
Benchmarking is simply having an agreed set of typical inputs that are used to measure performance. Both benchmarking and average case analysis rely on having some domain knowledge: we need to know what the typical or expected datasets are. Ultimately, we will try to find ways to improve performance by fine-tuning to a very specific application setting.
Let's look at a straightforward way to benchmark an algorithm's runtime performance. This can be done by simply timing how long the algorithm takes to complete given various input sizes. As we mentioned earlier, this way of measuring runtime performance is dependent on the hardware it is run on. Obviously, faster processors will give better results; however, the relative growth rates as we increase the input size will retain the characteristics of the algorithm itself rather than of the hardware it is run on. The absolute time values will differ between hardware (and software) platforms, but their relative growth will still be bound by the time complexity of the algorithm.
Let's take a simple example of a nested loop. It should be fairly obvious that the time complexity of this algorithm is O(n²), since for each of the n iterations of the outer loop there are also n iterations of the inner loop. Our nested for loop consists of a single statement executed in the inner loop:
def nest(n):
    for i in range(n):
        for j in range(n):
            i + j  # a constant-time statement executed n * n times
The following code is a simple test function that runs the nest function with increasing values of n. With each iteration we calculate the time the nest function takes to complete using the timeit.timeit function. The timeit function, in this example, takes three arguments: a string representation of the function call to be timed, a setup statement that imports the nest function, and an int parameter that indicates the number of times to execute the main statement. Since we are interested in the time the nest function takes to complete relative to the input size, n, it is sufficient, for our purposes, to call the nest function once on each iteration. The following function returns a list of the calculated runtimes for each value of n:
import timeit

def test2(n):
    ls = []
    for n in range(n):  # time nest() for each input size from 0 to n - 1
        t = timeit.timeit("nest(" + str(n) + ")",
                          setup="from __main__ import nest",
                          number=1)
        ls.append(t)
    return ls
In the following code we run the test2 function and graph the results, together with the appropriately scaled n² function for comparison, represented by the dashed line:
import matplotlib.pyplot as plt

n = 1000
plt.plot(test2(n))  # measured runtimes for input sizes 0 to n - 1
plt.plot([x * x / 10000000 for x in range(n)], '--')  # scaled n² curve for comparison
plt.show()
This gives the following results:
As we can see, this gives us pretty much what we expect. It should be remembered that this reflects both the performance of the algorithm itself and the behavior of the underlying software and hardware platforms, as indicated by both the variability in the measured runtime and its relative magnitude. Obviously, a faster processor will result in faster runtimes, and performance will also be affected by other running processes, memory constraints, clock speed, and so on.
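If we want to reduce this measurement noise, one option, not used in the code above, is to repeat each timing several times and keep the minimum, which timeit.repeat supports directly:

import timeit

# Repeat each timing a few times; the minimum is usually the run least disturbed
# by other processes competing for the machine.
times = timeit.repeat("nest(500)",
                      setup="from __main__ import nest",
                      number=1, repeat=5)
best = min(times)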