Algorithm Analysis

// sum.cpp  <== click here for file

#include <iostream>
using namespace std;

// function prototypes
int sum_it_up( int a[], int n );

int main()
{
  int b[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
  int x = sum_it_up( b, 10 );

  cout << "The sum is: " << x << endl;

  return 0;
}

int sum_it_up( int a[], int n )
{
  int sum = 0;
  for ( int i = 0 ; i < n ; i++ )
  {
    sum += a[i];    // sum = sum + a[i];
  }

  return sum;
}

Algorithm Analysis:

  • We analyze code to determine the time required to execute or run the code. This time measure is usually a function of the size of the data being worked on (i.e. the input).
  • Why do we do this? We want to know if one algorithm is better than another (and why it's better).
  • How do we approach this?
    1. Don't do any analysis; just use the first algorithm that comes to mind.
    2. Implement and time multiple algorithms to choose the best.
    3. Analyze algorithms by counting operations while assigning different weights to different types of operations based on how long each one takes to execute.
    4. Analyze algorithms by assuming each operation requires the same amount of time. Count the total number of operations, then multiply this count by the average cost of an operation.
  • In practice, perhaps 99% of the time, a rough count (similar to #4 above) is determined as a function of the size of the input.
  • We use order notation to simplify the resulting function. Order notation helps simplify the analysis that leads to the function.
  • How do you count the total number of operations in the sum_it_up() function above? Come up with a function based on n that describes the number of operations.
  • You are likely to come up with different answers, depending on what we consider an operation to be. How do we resolve these differences? We use order notation.

Order Notation:

  • The following discussion emphasizes intuition. That's sufficient for this course. For more details and technical depth, see any textbook on data structures and algorithms.
  • Definition: Algorithm A is O(f(n)), read "order f(n)," if constants k and n0 exist such that A requires no more than k * f(n) operations to solve a problem of size n ≥ n0.
  • Algorithms requiring 4n-3, 5n+3, 14+17n operations are all O(n) (i.e. in applying the definition of order notation f(n) = n).
  • Algorithms requiring n2/10+15n-3 and 10000+35n2 are all O(n2) (i.e. in applying the definition of order notation f(n) = n2).
  • Intuitively (and importantly), we determine the order by finding the asymptotically dominant term (function of n) and throwing out the leading coefficient. This term could involve logarithmic or exponential functions of n.
  • Implications for analysis:
    • We don't need to quibble about small differences in the numbers of operations.
    • We also do not need to worry about the different costs of different types of operations or the other programs running on the same machine.
    • Note that we don't produce an actual time. We just obtain a rough count of the number of operations. This count is used to compare algorithms.

Common Orders of Magnitude:

Here are the most commonly occurring orders of magnitude in algorithm analysis.

  • O(1): Constant time (the number of operations is independent of the size of the input)
  • O(log n): Logarithmic time
  • O(n): Linear time
  • O(n log n)
  • O(n2): Quadratic or Polynomial time
  • O(n2 log n)
  • O(n3): Cubic or Polynomial time
  • O(2n): Exponential time

Significance of Orders of Magnitude:

  • On a computer that performs 108 operations per second:
    • Algorithm A requires 15n log n operations. Thus, it requires about 3 seconds on a problem of size n = 1,000,000 and 50 minutes on a problem of size n = 100,000,000.
    • Algorithm B requires n2 operations. Thus, it requires about 3 hours on a problem of size n = 1,000,000 and 115 days on a problem of size n = 100,000,000.
    • What matters here is the n2 vs. the n log n.
  • By comparison, I'd choose algorithm A unless I wasn't in a hurry.

Best-Case, Average-Case, and Worst-Case Analysis:

  • For a given fixed size input (e.g. an array of size n), we might want to know:
    • The fewest number of operations (best case) that might occur.
    • The average number of operations (average case) that will occur.
    • The maximum number of operations (worst case) that can occur.
  • The average or worst case are the most commonly used. Best-case analysis is rarely used.

Approaching An Analysis Problem:

  • Decide the important variable (or variables) that determine the size of the problem. For arrays and container classes (e.g. vector), this will generally be the number of values stored.
  • Decide what to count. The order notation helps us here.
    • If each loop iteration does a fixed (or bounded) amount of work, then we only need to count the number of loop iterations.
    • We might also count specific operations, such as comparisons.
  • Do the count, using order notation to describe the result.

Analyzing Loops:

In each case below, give an order notation estimate as a function of n:

  • Loop A:
  • int count = 0;
    
    for ( int i = 0 ; i < n ; i++ )
      for ( int j = 0 ; j < n ; j++ )
        count++;
    
  • Loop B:
  • int count = 0;
    
    for ( int i = 0 ; i < n ; i++ )
      count++;
    
    for ( int j = 0 ; j < n ; j++ )
      count++;
    
  • Loop C:
  • int count = 0;
    
    for ( int i = 0 ; i < n ; i++ )
      for ( int j = i ; j < n ; j++ )
        count++;