15-295 Meeting 3 (Wednesday, January 26, 2005)

Readings and Problems

Sorting

Sorting problem: Given an array A[1...n], rearrange its elements in ascending order: A[1] <= A[2] <= ... <= A[n].

Insertion sort (Section 1.1/2.1)

Insertion sort is likely familiar to everyone. The strategy is to begin with an empty (and therefore sorted) list and insert the new elements one at a time, each into its proper position.

Example:

3   2   0   1   1               sort A[1..2]

2   3   0   1   1               then sort A[2..3]

0   2   3   1   1               then sort A[3..4]

0   1   2   3   1               and so on

0   1   1   2   3

The algorithm is as follows:

void insertion_sort (A, n) {
  for (j=2; j <=n; j++) {

    key = A[j];
    i = (j-1);

    // Shift elements of the sorted prefix A[1..j-1] that are larger than key
    while ( (i>0) && (A[i]>key) ) {
      A[i+1] = A[i];
      i = i - 1;
    }

    A[i+1] = key;
  }
}
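
The pseudocode above uses 1-based arrays. As a minimal sketch, assuming a 0-based int array (the element type and the name insertion_sort are choices made for this illustration, not something the notes specify), the same idea could be written in C as:

```c
#include <stddef.h>

/* Sort a[0..n-1] in ascending order. Indices are 0-based here,
   unlike the 1-based arrays used in the notes. */
void insertion_sort(int a[], size_t n) {
    for (size_t j = 1; j < n; j++) {
        int key = a[j];
        size_t i = j;
        /* Shift elements of the sorted prefix a[0..j-1] that exceed key
           one slot to the right, opening a gap for key. */
        while (i > 0 && a[i - 1] > key) {
            a[i] = a[i - 1];
            i--;
        }
        a[i] = key;
    }
}
```

Using i as the index of the gap (rather than the element left of it) avoids decrementing an unsigned index below zero.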

Performance:

Quick sort (Sections 8.1-8.3/7.1-7.3)

Overall strategy:

Algorithm:

void quicksort (A, p, r) {

  if (p < r) {
    q = partition (A, p, r);
    quicksort (A, p, q);
    quicksort (A, q+1, r);
  }
}

int partition (A, p, r) {

  x = A[p];      // pivot
  i = p - 1;
  j = r + 1;

  while (true) {
    // Scan from the right for an element <= x
    do {
      j = j - 1;
    } while (A[j] > x);

    // Scan from the left for an element >= x
    do {
      i = i + 1;
    } while (A[i] < x);

    if (i < j)
      swap (A, i, j);
    else
      return j;
  }
}
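
To make the partition scheme concrete, here is a hedged C sketch of quicksort using a Hoare-style partition around the first element. It assumes 0-based int arrays; the name hoare_partition is introduced only for this example.

```c
/* Partition a[p..r] around the pivot a[p]. Returns an index q with
   p <= q < r such that every element of a[p..q] is <= every element
   of a[q+1..r]. */
int hoare_partition(int a[], int p, int r) {
    int x = a[p];
    int i = p - 1;
    int j = r + 1;
    for (;;) {
        do { j--; } while (a[j] > x);   /* from the right: find element <= x */
        do { i++; } while (a[i] < x);   /* from the left: find element >= x */
        if (i >= j)
            return j;
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}

/* Sort a[p..r] in ascending order (both bounds inclusive). */
void quicksort(int a[], int p, int r) {
    if (p < r) {
        int q = hoare_partition(a, p, r);
        quicksort(a, p, q);       /* q, not q-1: the pivot may sit on either side */
        quicksort(a, q + 1, r);
    }
}
```

A call such as quicksort(a, 0, n - 1) sorts the whole array. Note that the left recursive call includes index q, which is why the pivot is taken from the left end of the range.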

Performance

There are two good strategies for avoiding the worst-case performance: randomly shuffling the array before sorting, or choosing the pivot (x) at random.

Randomized partition:

  i = Random (p, r), a random number between p and r
  swap (A, p, i)
  return partition (A, p, r)

In our experience, insertion sort is usually faster than quick sort for small arrays (n <= 250). This threshold depends on the data type and may vary from 100 to 500.
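
One way to combine the two observations above is a quicksort that picks its pivot at random and hands small subarrays to insertion sort. The following is only a sketch under stated assumptions: 0-based int arrays, the C library rand() as the random source, and a CUTOFF of 250 taken from the figure quoted in these notes (it should be tuned by measurement). All function names are hypothetical.

```c
#include <stdlib.h>

#define CUTOFF 250  /* assumed threshold; the notes suggest 100 to 500 */

static void swap_ints(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Move a randomly chosen element of a[p..r] to a[p], then partition
   a[p..r] around it (Hoare scheme). Returns q with p <= q < r. */
int randomized_partition(int a[], int p, int r) {
    int k = p + rand() % (r - p + 1);
    swap_ints(&a[p], &a[k]);
    int x = a[p], i = p - 1, j = r + 1;
    for (;;) {
        do { j--; } while (a[j] > x);
        do { i++; } while (a[i] < x);
        if (i >= j)
            return j;
        swap_ints(&a[i], &a[j]);
    }
}

/* Insertion sort restricted to the subarray a[p..r]. */
void insertion_sort_range(int a[], int p, int r) {
    for (int j = p + 1; j <= r; j++) {
        int key = a[j];
        int i = j - 1;
        while (i >= p && a[i] > key) {
            a[i + 1] = a[i];
            i--;
        }
        a[i + 1] = key;
    }
}

/* Sort a[p..r]: quicksort on large ranges, insertion sort on small ones. */
void hybrid_quicksort(int a[], int p, int r) {
    if (r - p + 1 <= CUTOFF) {
        insertion_sort_range(a, p, r);
        return;
    }
    int q = randomized_partition(a, p, r);
    hybrid_quicksort(a, p, q);
    hybrid_quicksort(a, q + 1, r);
}
```

The modulo reduction of rand() is slightly biased, which is harmless here because the pivot only needs to be unpredictable, not perfectly uniform.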

Counting Sort (Section 9.2/8.2)

Assume that all elements of A[1..n] are integers between 1 and k. For each element, count the number of smaller elements.

Two auxiliary arrays: B[1..n], which receives the sorted output, and C[1..k], which holds the counters.

void counting_sort (A, B, k, n) {

  // Initialize counters
  for (int i=1; i<=k; i++) C[i] = 0;

  // For each i from 1...k, count elements equal to i
  for (int j=1; j<=n; j++) C[A[j]] = C[A[j]] + 1;

  // For each i, count elements <= i
  for (int i=2; i<=k; i++) C[i] = C[i] + C[i-1];

  // Place each element at its final position in B, scanning right to left
  for (int j=n; j>=1; j--) {
    B[C[A[j]]] = A[j];
    C[A[j]] = C[A[j]]-1;
  }
}

Please consider the following example:


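A   3   6   4   1   3   4   1   4       n=8 and k=6
B   1   1   3   3   4   4   4   6       sorted output

C   2   0   2   3   0   1               after counting elements equal to i

C   2   2   4   7   7   8               after counting elements <= i

C   0   2   2   4   7   7               after placing all elements into B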
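
The three phases of counting sort (count, accumulate, place) can be sketched in C as follows. This is an illustration under assumptions, not the notes' own code: arrays are 0-based, values lie in 1..k, the counter array is allocated inside the function, and the int return value used to report allocation failure is an addition for this example.

```c
#include <stdlib.h>

/* Stable counting sort. a[0..n-1] holds integers in 1..k; the sorted
   result is written to b[0..n-1]. Returns 0 on success, or -1 if the
   counter array cannot be allocated. */
int counting_sort(const int a[], int b[], int n, int k) {
    int *c = calloc((size_t)k + 1, sizeof *c);   /* c[0..k], all zero */
    if (c == NULL)
        return -1;

    for (int j = 0; j < n; j++)          /* c[v] = number of elements equal to v */
        c[a[j]]++;
    for (int v = 2; v <= k; v++)         /* c[v] = number of elements <= v */
        c[v] += c[v - 1];
    for (int j = n - 1; j >= 0; j--) {   /* right to left keeps equal keys in order */
        b[c[a[j]] - 1] = a[j];           /* -1 turns a 1-based rank into a 0-based index */
        c[a[j]]--;
    }

    free(c);
    return 0;
}
```

Scanning the input from right to left in the last loop is what makes the sort stable, a property radix sort depends on.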

Radix sort

Radix sort is useful for sorting integers digit by digit. It is also effective for sorting strings. Assume that digits range from 0 to (k-1), e.g., 0 through 9 or 0 through 255, and that every number has d or fewer digits. Sort by the least significant digit, then by the second least significant digit, and so on, using a stable sort for each pass.

329    720     720     329
457    355     329     355
657    436     436     436
839    457     839     457
436    657     355     657
720    329     457     720
355    839     657     839
        ^       ^       ^
        |       |       |
     sorted   sorted  sorted
       by       by      by
      least   middle   most
       sig.    sig.     sig.
      digit   digit    digit

The algorithm is as follows:


Radix_sort (A, n, d) {
  for (i=1; i <=d; i++) {
    // Stable sort of A[1..n] on digit i (digit 1 is the least significant)
    counting_sort (A, 1, n, i);
  }
}
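
As a rough sketch of the loop above, the following C function performs one stable counting pass per decimal digit, starting from the least significant. It assumes 0-based arrays of unsigned values with at most d decimal digits; the base of 10 and the error-code return value are choices made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* LSD radix sort of a[0..n-1], where every value has at most d decimal
   digits. Returns 0 on success, or -1 if the scratch buffer cannot be
   allocated. */
int radix_sort(unsigned a[], int n, int d) {
    if (n <= 0)
        return 0;
    unsigned *b = malloc((size_t)n * sizeof *b);
    if (b == NULL)
        return -1;

    unsigned divisor = 1;                      /* selects the current digit */
    for (int pass = 0; pass < d; pass++) {
        int c[10] = {0};
        for (int j = 0; j < n; j++)            /* count occurrences of each digit */
            c[(a[j] / divisor) % 10]++;
        for (int v = 1; v < 10; v++)           /* c[v] = number of digits <= v */
            c[v] += c[v - 1];
        for (int j = n - 1; j >= 0; j--) {     /* stable placement, right to left */
            unsigned digit = (a[j] / divisor) % 10;
            b[--c[digit]] = a[j];
        }
        memcpy(a, b, (size_t)n * sizeof *a);   /* result feeds the next pass */
        divisor *= 10;
    }

    free(b);
    return 0;
}
```

Each pass must be stable: the order established by earlier (less significant) digits has to survive among elements that share the current digit.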

Performance:

Order Statistics (Section 10.2/9.2)

The ith order statistic of an array A[1...n] is the ith smallest element. In other words, it is the element that would be in position i after sorting the array in ascending order.

Example:

3   2   0   1   1               sorted: 0   1   1   2   3

1st order statistic: 0
3rd order statistic: 1

Special cases:

Please consider the problem: Given an array A[1..n] and a value i, where 1 <= i <= n, find the ith order statistic.

The simple solution is to sort the array and return A[i]. With a comparison sort, the time complexity is, at best, O(n log n). A more clever solution takes linear time on average. To achieve this, we can use a version of quicksort that recurses only into the part of the array that contains the required statistic.

int select (A, p, r, i) {
  if (p == r) 
    return A[p];

  q = partition (A, p, r);
  k = (q - p) + 1;      // number of elements in A[p..q]

  if (i <= k) 
    return select (A, p, q, i);
  else
    return select (A, q+1, r, i-k);
}

Performance:

As with quicksort, randomize the partition to avoid worst case and use a while loop instead of recursion to reduce overhead.
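
A loop-based, randomized variant along these lines might look like the following C sketch. It assumes a 0-based int array, a 1-based rank i as in the notes, and rand() as the random source; the name select_ith is hypothetical (plain select would collide with the POSIX function of that name).

```c
#include <stdlib.h>

/* Returns the i-th smallest element of a[0..n-1], with 1 <= i <= n.
   The array is reordered as a side effect. */
int select_ith(int a[], int n, int i) {
    int p = 0, r = n - 1;
    while (p < r) {
        /* Move a randomly chosen pivot to the front of a[p..r]. */
        int k = p + rand() % (r - p + 1);
        int t = a[p]; a[p] = a[k]; a[k] = t;

        /* Hoare partition of a[p..r] around x = a[p]. */
        int x = a[p], lo = p - 1, hi = r + 1;
        for (;;) {
            do { hi--; } while (a[hi] > x);
            do { lo++; } while (a[lo] < x);
            if (lo >= hi)
                break;
            t = a[lo]; a[lo] = a[hi]; a[hi] = t;
        }

        int q = hi;               /* a[p..q] <= a[q+1..r], with p <= q < r */
        int left = q - p + 1;     /* size of the left part */
        if (i <= left) {
            r = q;                /* the answer lies in the left part */
        } else {
            p = q + 1;            /* the answer lies in the right part */
            i -= left;
        }
    }
    return a[p];
}
```

Only one side of each partition is examined further, which is where the savings over a full sort come from.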

There is a more complex solution that finds order statistics in O(n) time even in the worst case; see Section 10.3/9.3.

Competition strategy