Recursive Analysis¶
Recursive functions are common in computer science. We often have to determine how many times a recursive function will execute and how much work it will do. This section presents a formal method for answering these questions.
Let us imagine that we looked at the assembly of the factorial function and counted exact operations. Imagine fact(n)
needs 17 operations for every recursive call. It also needs 11 extra operations for the base case. We can write this count as a recursive function.
\( \begin{align} T(n) =& \begin{cases} 11 & n=0 \\ T(n-1)+17 & n > 0 \end{cases} \end{align} \)
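For reference, a minimal Python sketch of the function being counted might look like the one below. The specific costs of 17 and 11 operations are assumptions carried over from the imagined assembly above; nothing in the Python code itself fixes those numbers.
def fact(n):
    # Base case: assumed to cost 11 operations in the analysis.
    if n == 0:
        return 1
    # Recursive case: assumed to cost 17 operations plus the cost of fact(n - 1).
    return n * fact(n - 1)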
We traditionally use the letter \(T\) when counting the number of operations. This is because the count of operations is an approximation of the time needed to run the program. We also generally use \(n\) as a stand-in for whatever value determines the number of repetitions.
The first step is to algebraically expand the formula. In this case, the number of operations needed to compute \(T(n)\) is 17 plus the number of operations needed to compute \(T(n-1)\). We can replace \(T(n-1)\) with its own definition: the number of operations to compute \(T(n-1)\) is \(T(n-2)+17\). We will assume \(n>0\) is true for now.
\( \begin{align} T(n) =& T(n-1) + 17 \\ =& \left[ T(n-2) + 17 \right] + 17 \end{align} \)
We repeat this process until we find a pattern.
\( \begin{align} T(n) =& T(n-1) + 17 \\ =& \left[ T(n-2) + 17 \right] + 17 & \text{Replace Def T(n-1)} \\ =& T(n-2) + 2 * 17 & \text{Simplify} \\ =& \left[ T(n-3) + 17 \right] + 2 * 17 & \text{Replace Def T(n-2)} \\ =& T(n-3) + 3 * 17 & \text{Simplify} \\ =& \left[ T(n-4) + 17 \right] + 3 * 17 & \text{Replace Def T(n-3)} \\ =& T(n-4) + 4 * 17 & \text{Simplify} \end{align} \)
Writing down only the simplified form of each step makes the pattern clear.
\( \begin{align} T(n) =& T(n-1) + 17 & \text{Iteration 1}\\ =& T(n-2) + 2 * 17 & \text{Iteration 2} \\ =& T(n-3) + 3 * 17 & \text{Iteration 3} \\ =& T(n-4) + 4 * 17 & \text{Iteration 4} \\ =& T(n-k) + k * 17 & \text{Iteration k} \end{align} \)
How many iterations do we need? We know that \(T(0)=11\). So, when \(T(n-k)=T(0)\) we can stop. This happens when \(n-k=0\). We can solve this for \(k\).
\( \begin{align} n-k =& 0 \\ n =& k & \text{Add k to both sides} \end{align} \)
The number of iterations we need is \(n\). We can plug this into our formula.
\( \begin{align} T(n) =& T(n-n) + n * 17 &\text{Let k=n} \\ =& T(0) + n * 17 &\text{Simplify} \\ =& 11 + 17n & \text{Use Base Case} \end{align} \)
We find that the number of operations is \(T(n)=17n+11\). This is exactly the answer we came up with earlier by reasoning directly about the code, but this method is more formal and consistent than that informal reasoning.
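As a quick sanity check, a short Python sketch can compare the recursive definition of \(T\) against the closed form \(17n+11\):
def T(n):
    # Recursive definition of the operation count.
    if n == 0:
        return 11
    return T(n - 1) + 17

# The closed form matches the recursive definition for every n we try.
for n in range(20):
    assert T(n) == 17 * n + 11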
Recursive Formula¶
We frequently encounter recursive functions in programs. A simple example is binary search.
Function bSearch(n,h,start,stop)
    If stop < start then
        Return False
    End If
    Let middle = (stop-start)/2+start
    If h[middle]==n then
        Return True
    End If
    If n < h[middle] then
        Return bSearch(n,h,start,middle-1)
    Else
        Return bSearch(n,h,middle+1,stop)
    End If
End Function
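A direct Python rendering of this pseudocode might look like the sketch below; integer division is used so that middle is a valid index.
def b_search(n, h, start, stop):
    # The range is empty, so n is not in the list.
    if stop < start:
        return False
    # Examine the middle element of the current range.
    middle = (stop - start) // 2 + start
    if h[middle] == n:
        return True
    # Recurse into whichever half could still contain n.
    if n < h[middle]:
        return b_search(n, h, start, middle - 1)
    return b_search(n, h, middle + 1, stop)
For example, b_search(6, [1, 2, 3, 4, 5, 6, 7, 8], 0, 7) returns True after examining positions 3 and 5.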
How do we determine the worst case running time for this function?
Let’s assume the list we are searching has 8 elements. We start the search from \(0\) to \(7\). The first element examined is the one in the middle, at position \((7-0)/2+0=3\).
In the worst case, this will not be the answer. There are two possible ranges left to search: \(0 \cdots 2\) or \(4 \cdots 7\). The larger of the two sides is \(4 \cdots 7\). Let us assume this is the side searched.
If the element at position 5 is the one we are looking for, the algorithm ends. We are looking for the worst case, so assume this doesn’t happen. There are two sides to search \(4 \cdots 4\) and \(6 \cdots 7\). The larger side is \(6 \cdots 7\). Let us assume this is the side that is searched since we are looking for the worst case.
If the element at position 6 is the one we are looking for, we can stop. If it is not, we only have one section left: \(7 \cdots 7\). The last comparison is against the element in position 7. Either way, we will know the answer after this comparison. Every other element has been eliminated.
In total, we compared against the elements at positions \(3,5,6,7\). This was a total of \(4\) comparisons to find the target in \(8\) elements.
This manual examination would be very hard to do for 1,000,000 elements. We want to come up with a formula to approximate the recursion.
Binary Search does 1 comparison and then searches approximately half the list. Strictly speaking, the remaining half has about \(\frac{n}{2}-1\) elements, so the recursive term would really be \(T(\frac{n}{2}-1)\), but that would be much harder to analyze for very little gain. This gives the recurrence
\( \begin{align} T(n) =& \begin{cases} 1 & n=1 \\ T(n/2)+1 & n > 1 \end{cases} \end{align} \)
Algebraic Expansion¶
We can do the same repeated expansion algebraically.
\( \begin{align} T(n) =& T(n/2) + 1 \\ =& \left[ T(n/4) + 1 \right] + 1 = T(n/4) + 2 \\ =& \left[ T(n/8) + 1 \right] + 2 = T(n/8) + 3 \\ =& T(n/2^x) + x & \text{Iteration x} \end{align} \)
Expanding one step at a time would still be very tedious for \(T(1,000,000)\). We can try to find a closed form.
We can stop the expansion once we reach \(T(1)\), since we know \(T(1)=1\). If there is only 1 element in the list, that is the only one we need to check.
What is \(x\) when we reach \(T(1)\)? For our list of 8 elements, we reach \(T(1)\) when \(8/2^x=1\), which happens at \(x=3\).
If we plug \(x=3\) into the formula we get
\( \begin{align} T(8) =& T(8/2^3) + 3 = T(1) + 3 = 4 \end{align} \)
which matches the 4 comparisons we counted by hand.
We can simplify this process by finding a general \(x\). We need \(n/2^x = 1\), so \(n = 2^x\) and \(x = \log_2(n)\).
Plugging in the general \(x\) we get
\( \begin{align} T(n) =& T(n/2^{\log_2(n)}) + \log_2(n) = T(1) + \log_2(n) = 1 + \log_2(n) \end{align} \)
This can be used to solve for any number.
For numbers that are not powers of two, this is an approximation. It makes sense to round up; we can't do half a comparison. For example, for \(1,000,000\) elements we get \(1 + \log_2(1,000,000) \approx 20.9\), so we expect about 21 comparisons in the worst case.
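The Python sketch below (the helper name worst_case_probes is our own) compares this rounded-up formula to the exact worst-case count obtained by always recursing into the larger half:
import math

def worst_case_probes(m):
    # Exact worst-case number of elements examined when searching m elements:
    # one probe in the middle, then the larger half (m // 2 elements) remains.
    if m == 0:
        return 0
    return 1 + worst_case_probes(m // 2)

for m in [8, 100, 1000, 1000000]:
    approx = 1 + math.ceil(math.log2(m))
    exact = worst_case_probes(m)
    print(m, exact, approx)  # the rounded-up formula is within one comparison of the exact count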
General Formulas¶
A common expression in algorithm analysis is
\( \begin{align} T(n) =& \begin{cases} 1 & n=1 \\ a T(n/b) + f(n) & \text{otherwise} \end{cases} \end{align} \)
where
\(a\) must be a constant and \(a \ge 1\)
\(b\) must be a constant and \(b > 1\)
\(f(n)\) must be asymptotically positive
When these conditions are all met there is a shortcut called The Master Theorem.
First, let's look at how the expression expands. Let \(k\) be the iteration counter. (A full proof will not be given here.)
\( \begin{align} T(n) =& a T(n/b) + f(n) \\ =& a \left[ a T(n/b^2) + f(n/b) \right] + f(n) = a^2 T(n/b^2) + a f(n/b) + f(n) \\ =& a^3 T(n/b^3) + a^2 f(n/b^2) + a f(n/b) + f(n) \end{align} \)
There is a pattern for the \(k\)-th iteration.
\( \begin{align} T(n) =& a^k T(n/b^k) + \sum_{i=0}^{k-1} a^i f(n/b^i) \end{align} \)
The iterations stop when \(T(n/b^k) = T(1)\). This happens when \(n/b^k=1\), which gives \(k=\log_b(n)\).
If we plug this in we get
\( \begin{align} T(n) =& a^{\log_b(n)} T(n/b^{\log_b(n)}) + \sum_{i=0}^{\log_b(n)-1} a^i f(n/b^i) \end{align} \)
We know \(T(n/b^{\log_b(n)})=T(1)=1\) which simplifies the equation to
\( \begin{align} T(n) =& a^{\log_b(n)} + \sum_{i=0}^{\log_b(n)-1} a^i f(n/b^i) \end{align} \)
Let us return to the formula for Binary Search, where \(a=1\), \(b=2\), and \(f(n)=1\).
\( \begin{align} T(n) =& \begin{cases} 1 & n=1 \\ T(n/2) + 1 & \text{otherwise} \end{cases} \end{align} \)
If we plug this into the formula we have computed, we get
\( \begin{align} T(n) =& 1^{\log_2(n)} + \sum_{i=0}^{\log_2(n)-1} 1^i \cdot 1 = 1 + \log_2(n) \end{align} \)
We could also apply this formula to Merge Sort, which sorts each half of the list recursively and does about \(n\) operations to merge the results: \(a=2\), \(b=2\), and \(f(n)=n\).
\( \begin{align} T(n) =& \begin{cases} 1 & n=1 \\ 2 T(n/2) + n & \text{otherwise} \end{cases} \end{align} \)
When plugged into the formula this simplifies.
\( \begin{align} T(n) =& 2^{\log_2(n)} + \sum_{i=0}^{\log_2(n)-1} 2^i \frac{n}{2^i} = n + \sum_{i=0}^{\log_2(n)-1} n = n + n\log_2(n) \end{align} \)
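A small Python sketch can confirm these two closed forms for powers of two by evaluating the recurrences directly:
import math

def t_search(n):
    # Binary search recurrence: T(n) = T(n/2) + 1 with T(1) = 1.
    return 1 if n == 1 else t_search(n // 2) + 1

def t_merge(n):
    # Merge sort recurrence: T(n) = 2 T(n/2) + n with T(1) = 1.
    return 1 if n == 1 else 2 * t_merge(n // 2) + n

for k in range(1, 11):
    n = 2 ** k
    assert t_search(n) == 1 + math.log2(n)
    assert t_merge(n) == n + n * math.log2(n)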
The Master Theorem¶
If all we care about is the Theta/Big Oh/Big Omega answer to the formula, completing this entire sum is overkill. The Master Theorem gives us three common cases and tells us what the answer is for any expression that follows the below pattern.
\( \begin{align} T(n)=\begin{cases} 1 & n=1 \\ a T(n/b) + f(n) & \text{otherwise} \end{cases} \end{align} \)
The Master Theorem is described as follows.
Compute \(c=\log_b(a)\)
Determine which of the following is true
[CASE 1] \(f(n) = O\left(n^{c-\epsilon}\right)\) for some constant \(\epsilon>0\), then \(T(n)=O\left(n^c\right)\) and \(T(n)=\Omega\left(n^c\right)\)
[CASE 2] \(f(n) = O\left( n^c \log^k n \right)\) and \(f(n)=\Omega \left( n^c \log^k n \right)\) with \(k\ge0\), then \(T(n)=O\left(n^c \log^{k+1} n \right)\) and \(T(n)=\Omega \left(n^c \log^{k+1} n \right)\)
[CASE 3] \(f(n) = \Omega\left(n^{c+\epsilon} \right)\) with \(\epsilon > 0\) and \(a f(n/b)\le x f(n)\) for \(x<1\) and large \(n\), then \(T(n) = O(f(n))\) and \(T(n)=\Omega(f(n))\)
We will now look at each case using both the Master Theorem and the equation below to see why it works.
The cases are all derived from this expression. We will call this expression \(T_{\text{expanded}}\).
\( \begin{align} T(n) =& a^{\log_b(n)} + \sum_{i=0}^{\log_b(n)-1} a^i f(n/b^i) \end{align} \)
Note that \(a^{\log_b(n)} = n^{\log_b(a)} = n^c\), which is why each case compares \(f(n)\) against \(n^c\).
Case 1¶
The following recurrence will fall into Case 1. We have
\( \begin{align} T(n) =& 16 T(n/4) + n \end{align} \)
Using The Master Theorem we compute \(c=\log_{4}(16) =2\).
We have \(f(n)=n\) and \(n^c=n^2\). We know that \(n \le n^2\) for all \(n \ge 1\).
We determine that \(f(n)=O(n^{2-\epsilon})\) for any \(0< \epsilon < 1\).
This Big Oh meets the conditions of Case 1, therefore the answer is
\( \begin{align} T(n) =& \Theta\left(n^2\right) \end{align} \)
If we plug the values into \(T_{\text{expanded}}\) we will get the full expression.
\( \begin{align} T(n) =& 16^{\log_4(n)} + \sum_{i=0}^{\log_4(n)-1} 16^i \frac{n}{4^i} \\ =& n^2 + n \sum_{i=0}^{\log_4(n)-1} 4^i \\ =& n^2 + n \cdot \frac{4^{\log_4(n)}-1}{3} = n^2 + \frac{n(n-1)}{3} \end{align} \)
Notice that in this case the exponential \(a^{\log_b(n)} = n^2\) is a Theta bound for the summation as well. This is what Case 1 of the Master Theorem determines.
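As a quick numerical sketch, evaluating the recurrence \(T(n)=16T(n/4)+n\) for powers of 4 shows the ratio \(T(n)/n^2\) settling toward a constant, as a \(\Theta(n^2)\) bound predicts:
def t_case1(n):
    # Case 1 example recurrence: T(n) = 16 T(n/4) + n with T(1) = 1.
    return 1 if n == 1 else 16 * t_case1(n // 4) + n

for k in range(1, 8):
    n = 4 ** k
    print(n, t_case1(n) / n ** 2)  # approaches a constant (about 4/3)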
Case 3¶
The second case is the most difficult, so we will skip to Case 3 for the second example.
\( \begin{align} T(n) =& 3 T(n/2) + n^2 \end{align} \)
Using the Master Theorem we compute \(c=\log_2(3) \approx 1.58\).
In this case, we have \(n^2 \ge n^{1.58}\) for all \(n \ge 1\).
This means \(f(n) = \Omega\left( n^{1.58+\epsilon}\right)\) for any \(0 < \epsilon < 0.42\).
The Master Theorem tells us that this will be
\( \begin{align} T(n) =& \Theta\left(f(n)\right) = \Theta\left(n^2\right) \end{align} \)
Case 3 has an additional check that is required. We will talk about this shortly.
First, we will plug the values into \(T_{\text{expanded}}\).
\( \begin{align} T(n) =& 3^{\log_2(n)} + \sum_{i=0}^{\log_2(n)-1} 3^i \frac{n^2}{4^i} \\ =& n^{\log_2(3)} + n^2 \sum_{i=0}^{\log_2(n)-1} \left(\frac{3}{4}\right)^i \\ <& n^{\log_2(3)} + 4 n^2 \end{align} \)
Note that \(f(n/b^i) = n^2/b^{2i}\) in this case. The denominator is also squared.
Notice that in this case, the summation overtook the exponential component.
Case 3 has a special condition. It says that \(af(n/b) \le x f(n)\) for any \(0 < x < 1\) and large \(n\).
If we plug the values in for our case, we have
\( \begin{align} 3 f(n/2) =& 3 \frac{n^2}{4} = \frac{3}{4} n^2 \le x n^2 \end{align} \)
This is true for any \(\frac{3}{4} \le x < 1\). More specifically, this tells us that the ratio between successive terms in the sum is less than 1, so the sum is dominated by its first term \(f(n)\). If this were not the case, the summation would not simplify as shown above and the answer may not be clear.
Case 2¶
The final case is Case 2. An expression that falls into Case 2 is given below.
\( \begin{align} T(n) =& 4 T(n/2) + n^2 \end{align} \)
In this case, \(f(n)=n^c\) because \(n^2=n^2\). This leads to Case 2 (with \(k=0\)) where \(f(n) = \Theta(n^c)\).
The Master Theorem tells us the answer is
\( \begin{align} T(n) =& \Theta\left(n^2 \log_2(n)\right) \end{align} \)
Where did the \(\log_2(n)\) come from? This is also from \(T_{\text{expanded}}\).
\( \begin{align} T(n) =& 4^{\log_2(n)} + \sum_{i=0}^{\log_2(n)-1} 4^i \frac{n^2}{4^i} = n^2 + \sum_{i=0}^{\log_2(n)-1} n^2 = n^2 + n^2\log_2(n) \end{align} \)
Notice that in this case, the factor \(a^i\) exactly cancels the shrinking of \(f\), so each term of the summation simplified to \(f(n)\). This caused the value \(f(n)\) to be multiplied by \(\log_2(n)\). This is what happens in Case 2 of the Master Theorem: an additional factor of \(\log_b(n)\) is multiplied by \(f(n)\).
Case 2 (Second Example)¶
Here is another example for Case 2.
\( \begin{align} T(n) =& 2 T(n/2) + n \log_2(n) \end{align} \)
In this case, the exponents on \(n\) are both 1, so \(c=1\). We have \(f(n) = \Omega(n^c)\) and \(f(n) = \Theta(n^c \log_2(n))\); the function \(f(n)\) is within a logarithmic factor of \(n^c\), which is Case 2 with \(k=1\). The Master Theorem tells us this will be \(T(n) = \Theta(f(n) \cdot \log_b(n)) = \Theta(n \log_2^2(n))\).
Again, \(T_{\text{expanded}}\) provides the rationale.
\( \begin{align} T(n) =& 2^{\log_2(n)} + \sum_{i=0}^{\log_2(n)-1} 2^i \frac{n}{2^i} \log_2\left(\frac{n}{2^i}\right) = n + n \sum_{i=0}^{\log_2(n)-1} \left(\log_2(n) - i\right) = n + n \cdot \frac{\log_2(n)\left(\log_2(n)+1\right)}{2} \end{align} \)
The summation contributes the extra factor of roughly \(\log_2(n)\) on top of \(f(n)\).
Conclusion¶
The Master Theorem gives us a general pattern that applies to many recursive functions. We can quickly determine the asymptotic complexity of a recursive function without doing a huge amount of recursive expansion.
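As a rough illustration (a sketch under the assumption that \(f(n)\) has the form \(n^d \log^k n\); the function name master_theorem and its string output are our own choices), the following Python function applies the three cases:
import math

def master_theorem(a, b, d, k=0):
    # Return a Theta bound for T(n) = a*T(n/b) + n**d * log(n)**k.
    # Only handles f(n) of this polylogarithmic form; for such f the
    # Case 3 regularity condition a*f(n/b) <= x*f(n) holds automatically.
    c = math.log(a, b)
    eps = 1e-9
    if d < c - eps:
        # Case 1: f(n) is polynomially smaller than n^c.
        return f"Theta(n^{c:.2f})"
    if abs(d - c) <= eps:
        # Case 2: f(n) matches n^c up to log^k; the answer gains one log factor.
        return f"Theta(n^{c:.2f} * log(n)^{k + 1})"
    # Case 3: f(n) is polynomially larger than n^c and dominates.
    return f"Theta(n^{d} * log(n)^{k})" if k > 0 else f"Theta(n^{d})"

print(master_theorem(16, 4, 1))      # Case 1 example above: Theta(n^2.00)
print(master_theorem(3, 2, 2))       # Case 3 example above: Theta(n^2)
print(master_theorem(4, 2, 2))       # Case 2 example above: Theta(n^2.00 * log(n)^1)
print(master_theorem(2, 2, 1, k=1))  # Second Case 2 example: Theta(n^1.00 * log(n)^2)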