Runtime analysis
It should be becoming clear that an important aspect of algorithm design is gauging efficiency, both in terms of space (memory) and time (number of operations). This second measure, called runtime performance, is the subject of this section. It should be mentioned that identical techniques can be used to measure an algorithm's memory performance. There are a number of ways we could conceivably measure runtime, and probably the most obvious is simply to measure the time the algorithm takes to complete. The major problem with this approach is that an algorithm's runtime is very much dependent on the hardware it is run on. A platform-independent way to gauge an algorithm's runtime is to count the number of operations involved. However, this is also problematic in that there is no definitive way to quantify an operation; it depends on the programming language, the coding style, and how we decide to count operations. We can still use this idea of counting operations, though, if we combine it with the expectation that, as the size of the input increases, the runtime will increase in a specific way. That is, there is a mathematical relationship between n, the size of the input, and the time it takes for the algorithm to run.
Much of the discussion that follows will be framed by the following three guiding principles. The rationale for, and importance of, these principles should become clearer as we proceed. These principles are as follows:
- Worst case analysis. Make no assumptions about the input data.
- Ignore or suppress constant factors and lower order terms. At large input sizes, the higher order terms dominate.
- Focus on problems with large input sizes.
Worst case analysis is useful because it gives us a tight upper bound that our algorithm is guaranteed not to exceed. Ignoring small constant factors and lower order terms is really just about ignoring the things that, at large values of the input size, n, do not contribute significantly to the overall runtime. Not only does this make our work mathematically easier, it also allows us to focus on the things that have the most impact on performance.
We saw in the discussion of the Karatsuba algorithm that the number of multiplication operations in the classic, grade-school approach increases with the square of the size, n, of the input. If we have a four-digit number, the number of multiplication operations is 16; an eight-digit number requires 64 operations. Typically, though, we are not really interested in the behavior of an algorithm at small values of n, so we most often ignore factors that increase at slower rates, say linearly with n. This is because at high values of n, the operations that increase fastest as n grows will dominate.
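As a quick illustration of this quadratic growth, here is a minimal sketch (our own, not part of the earlier example; the function name count_digit_mults is ours) that counts the single-digit multiplications performed by grade-school long multiplication of two n-digit numbers:
def count_digit_mults(n):
    # each digit of one n-digit number multiplies each digit
    # of the other, giving n * n single-digit multiplications
    count = 0
    for i in range(n):
        for j in range(n):
            count += 1
    return count

print(count_digit_mults(4))  # 16
print(count_digit_mults(8))  # 64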
We will explain this in more detail with an example, the merge sort algorithm. Sorting is the subject of Chapter 10, Sorting; however, as a precursor, and as a useful way to learn about runtime performance, we will introduce merge sort here.
The merge sort algorithm is a classic algorithm developed over 60 years ago. It is still used widely in many of the most popular sorting libraries. It is relatively simple and efficient. It is a recursive algorithm that uses a divide and conquer approach. This involves breaking the problem into smaller sub problems, recursively solving them, and then somehow combining the results. Merge sort is one of the most obvious demonstrations of the divide and conquer paradigm.
The merge sort algorithm consists of three simple steps:
- Recursively sort the left half of the input array.
- Recursively sort the right half of the input array.
- Merge two sorted sub arrays into one.
A typical problem is sorting a list of numbers into numerical order. Merge sort works by splitting the input into two halves and working on each half separately. We can illustrate this process schematically with the following diagram:
Here is the Python code for the merge sort algorithm:
def mergeSort(A):
    # base case: if the input array has one or zero elements it is
    # already sorted, so just return it
    if len(A) > 1:
        # splitting input array
        print('splitting ', A)
        mid = len(A)//2
        left = A[:mid]
        right = A[mid:]
        # recursive calls to mergeSort for left and right sub arrays
        mergeSort(left)
        mergeSort(right)
        # initialize pointers for left (i), right (j), and output array (k)
        # 3 initialization operations
        i = j = k = 0
        # traverse and merge the sorted sub arrays
        while i < len(left) and j < len(right):
            # comparison operation: left < right
            if left[i] < right[j]:
                # assignment operation when left[i] < right[j]
                A[k] = left[i]
                i = i + 1
            else:
                # assignment operation when right[j] <= left[i]
                A[k] = right[j]
                j = j + 1
            k = k + 1
        # copy any elements remaining in the left sub array
        while i < len(left):
            # assignment operation
            A[k] = left[i]
            i = i + 1
            k = k + 1
        # copy any elements remaining in the right sub array
        while j < len(right):
            # assignment operation
            A[k] = right[j]
            j = j + 1
            k = k + 1
        print('merging ', A)
    return A
When we run this program on a small sample list, for example [4, 1, 3, 2], we get the following output:
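splitting  [4, 1, 3, 2]
splitting  [4, 1]
merging  [1, 4]
splitting  [3, 2]
merging  [2, 3]
merging  [1, 2, 3, 4]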
The problem that we are interested in is how we determine the runtime performance, that is, the rate of growth in the time it takes for the algorithm to complete relative to n, the size of the input. To understand this a bit better, we can map each recursive call onto a tree structure. Each node in the tree is a recursive call working on progressively smaller sub problems:
Each invocation of merge sort subsequently creates two recursive calls, so we can represent this with a binary tree. Each of the child nodes receives a subset of the input. Ultimately, we want to know the total time it takes for the algorithm to complete relative to n. To begin with, we can calculate the amount of work, and the number of operations, at each level of the tree.
Focusing on the runtime analysis, at level 1 the problem is split into two n/2 sub problems, at level 2 there are four n/4 sub problems, and so on. The question is when the recursion bottoms out, that is, when it reaches its base case. This is simply when the array has either zero or one elements.
The number of recursive levels is exactly the number of times you need to divide n by 2 until you get a number that is at most 1. This is precisely the definition of log2(n). Since we are counting the initial recursive call as level 0, the total number of levels is log2(n) + 1.
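As a quick sanity check, here is a minimal sketch (our own, assuming n is a power of two) that counts the levels by repeatedly halving n, and compares the count against log2(n) + 1:
import math

def levels(n):
    # count the initial call as level 0, then one more level
    # for each halving until the sub problem size is 1
    count = 1
    while n > 1:
        n = n // 2
        count += 1
    return count

for n in [1, 2, 8, 64, 1024]:
    print(n, levels(n), int(math.log2(n)) + 1)  # the two counts agree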
Let's just pause to refine our definitions. So far we have been describing the number of elements in our input by the letter n. This refers to the number of elements in the first level of the recursion, that is, the length of the initial input. We are going to need to differentiate between the size of the input at subsequent recursive levels. For this we will use the letter m or, specifically, m_j for the length of the input at recursive level j.
Also, there are a few details we have overlooked, and I am sure you are beginning to wonder about them. For example, what happens when m/2 is not an integer, or when we have duplicates in our input array? It turns out that these do not have an important impact on our analysis here; we will revisit some of the finer details of the merge sort algorithm in Chapter 12, Design Techniques and Strategies.
The advantage of using a recursion tree to analyze algorithms is that we can calculate the work done at each level of the recursion. We define this work simply as the total number of operations, and this, of course, is related to the size of the input. It is important to measure and compare the performance of algorithms in a platform-independent way. The actual runtime will, of course, depend on the hardware on which it is run. Counting the number of operations is important because it gives us a metric that is directly related to an algorithm's performance, independent of the platform.
In general, since each invocation of merge sort makes two recursive calls, the number of calls doubles at each level. At the same time, each of these calls works on an input that is half the size of its parent's. We can formalize this and say that:
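At level j, there are 2^j sub problems, each of size m_j = n/2^j.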
To calculate the total number of operations, we need to know the number of operations encompassed by a single merge of two sub arrays. Let's count the number of operations in the previous Python code. What we are interested in is all the code after the two recursive calls have been made. First, we have the three assignment operations. This is followed by three while loops. In the first loop we have an if else statement, and within each branch there are two operations: a comparison followed by an assignment. Since only one branch is executed per iteration, we can count this block of code as two operations carried out at most m times. This is followed by two while loops with an assignment operation each, together executing at most 2m times. This makes a total of, at most, 4m + 3 operations for each invocation of merge sort.
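To make this count concrete, here is a small sketch (our own; the function name count_merge_ops and the exact accounting are illustrative assumptions) that instruments a single merge of two sorted sub arrays and tallies operations the same way, three initializations, then comparisons and assignments:
def count_merge_ops(left, right):
    ops = 3                      # i = j = k = 0
    i = j = k = 0
    m = len(left) + len(right)
    A = [0] * m
    while i < len(left) and j < len(right):
        ops += 2                 # one comparison, one assignment
        if left[i] < right[j]:
            A[k] = left[i]
            i = i + 1
        else:
            A[k] = right[j]
            j = j + 1
        k = k + 1
    while i < len(left):
        ops += 1                 # one assignment
        A[k] = left[i]
        i = i + 1
        k = k + 1
    while j < len(right):
        ops += 1                 # one assignment
        A[k] = right[j]
        j = j + 1
        k = k + 1
    return ops

print(count_merge_ops([1, 3, 5, 7], [2, 4, 6, 8]))  # 18, within the 4m + 3 = 35 bound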
Since m must be at least 1, we have 3 <= 3m, and so 4m + 3 <= 7m; the upper bound for the number of operations is therefore 7m. It has to be said that this makes no pretense at being an exact number. We could, of course, decide to count operations in a different way; we have not counted the increment operations or any of the housekeeping operations. However, this is not so important, as we are more concerned with the rate of growth of the runtime with respect to n, at high values of n.
This may seem a little daunting, since each recursive call itself spins off more recursive calls, seemingly exploding exponentially. The key fact that makes this manageable is that, as the number of recursive calls doubles, the size of each sub problem halves. These two opposing forces cancel out nicely, as we can demonstrate.
To calculate the maximum number of operations at each level of the recursion tree we simply multiply the number of sub problems by the number of operations in each sub problem as follows:
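2^j × 7(n/2^j) = 7n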
Importantly, this shows that, because the 2^j term cancels out, the number of operations at each level is independent of the level. This gives us an upper bound for the number of operations carried out on each level; in this example, 7n. It should be pointed out that this includes the number of operations performed by each recursive call on that level, not the recursive calls made on subsequent levels. It shows that the work done, as the number of recursive calls doubles with each level, is exactly counterbalanced by the fact that the input size for each sub problem is halved.
To find the total number of operations for a complete merge sort we simply multiply the number of operations on each level by the number of levels. This gives us the following:
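7n(log2(n) + 1)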
When we expand this out, we get the following:
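7n log2(n) + 7n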
The key point to take from this is that there is a logarithmic component to the relationship between the size of the input and the total running time. If you remember from school mathematics, the distinguishing characteristic of the logarithm function is that it flattens off very quickly. As the input variable, x, increases in size, the output variable, y, increases by smaller and smaller amounts. For example, compare the log function to a linear function:
In the following example, we plot the 7n log2(n) component and compare it to n^2:
Notice how, for very low values of n, the time to complete, t, is actually lower for the algorithm that runs in n^2 time. However, for values above about 40, the log function begins to dominate, flattening the output, until at the comparatively moderate size of n = 100 the performance is more than twice that of an algorithm running in n^2 time: n^2 = 10,000, whereas 7n log2(n) is approximately 4,651. Notice also that dropping the lower order term, + 7n, is irrelevant at high values of n.
The code used to generate these graphs is as follows:
import matplotlib.pyplot as plt
import math

x = list(range(1, 100))
# plot y = x^2
plt.plot(x, [y * y for y in x])
# plot y = 7x log2(x)
plt.plot(x, [(7 * y) * math.log(y, 2) for y in x])
plt.show()
You will need to install the matplotlib library, if it is not installed already, for this to work; details can be found on the matplotlib website. I encourage you to experiment with the list comprehension expressions used to generate the plots. For example, adding the following plot statement:
plt.plot(x, [(6 * y) * math.log(y, 2) for y in x])
This gives the following output:
The preceding graph shows the difference between counting six operations and counting seven. We can see how the two cases diverge, and this is important when we are talking about the specifics of an application. However, what we are more interested in here is a way to characterize growth rates. We are not so much concerned with the absolute values as with how these values change as we increase n. In this way, we can see that the two lower curves have similar growth rates when compared to the top (x^2) curve. We say that these two lower curves have the same complexity class. This is a way to understand and describe different runtime behaviors. We will formalize this performance metric in the next section.