
Problems

Problem 2-1 - Insertion sort on small arrays in merge sort

Although merge sort runs in \Theta(n\cdot log_2n) worst-case time and insertion sort runs in \Theta(n^2) worst-case time, the constant factors in insertion sort can make it faster in practice for small problem sizes on many machines. Thus, it makes sense to coarsen the leaves of the recursion by using insertion sort within merge sort when subproblems become sufficiently small. Consider a modification to merge sort in which n / k sublists of length k are sorted using insertion sort and then merged using the standard merging mechanism, where k is a value to be determined.

 

(a) Show that insertion sort can sort the n/k sublists, each of length k, in \Theta(nk) worst-case time.

Insertion sort takes \Theta(k^2) worst-case time on a sublist of k elements. Hence, sorting all n/k sublists of length k takes

\Theta(k^2 \cdot \frac{n}{k}) = \Theta(nk)

(b) Show how to merge the sublists in \Theta(n \cdot log_2({n/k})) worst-case time.

In standard merge sort, the recursion tree has height log_2n, since a problem of size n is halved at each level until subproblems of size 1 remain.

In the modified algorithm, the recursion stops once subproblems reach size k. Hence only log_2n - log_2k = log_2(n/k) levels of merging remain.

At each level, a total of n elements are merged (n/k sublists of k elements each), which takes \Theta(n) time per level.

Therefore, merging all the sublists takes

\Theta(n \cdot log_2(n/k))

(c) Given that the modified algorithm runs in \Theta(nk + n \cdot log_2(n/k)) worst-case time, what is the largest value of k as a function of n for which the modified algorithm has the same running time as standard merge sort, in terms of \Theta-notation?

For the modified algorithm to have the same worst-case running time as standard merge sort, we need

\Theta(nk + n \cdot log_2(n/k)) = \Theta(n\cdot log_2n)

To satisfy this condition, k cannot grow faster than \Theta(log_2n); otherwise the nk term alone would exceed \Theta(n\cdot log_2n).

To show that k = \Theta(log_2n) suffices, plug k = log_2n into the modified running time:

\Theta(nk + n \cdot log_2(n/k)) = \Theta(n\cdot log_2n + n \cdot log_2n - n \cdot log_2(log_2n))
= \Theta(2n\cdot log_2n - n \cdot log_2(log_2n))

Since n \cdot log_2(log_2n) grows more slowly than n \cdot log_2n, it is absorbed into the leading term, so we get

= \Theta(n \cdot log_2n)

which is the original running time.

Hence, the largest value of k is

k = \Theta(log_2n)

(d) How should we choose k in practice?

In practice, k should be chosen as the largest sublist length for which insertion sort beats merge sort on the target machine. This crossover point can be determined empirically, for example by timing both sorts on small arrays of increasing size; a rough sketch of the coarsened algorithm is shown below.
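As an illustration only (not part of the textbook's solution), a coarsened merge sort might look like the following Python sketch. The function names and the default cutoff k = 16 are arbitrary choices made here; the best cutoff is machine-dependent and should come from measurement.

def insertion_sort(A, p, r):
    """Sorts the subarray A[p..r] in place with insertion sort."""
    for j in range(p + 1, r + 1):
        key = A[j]
        i = j - 1
        while i >= p and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key

def merge(A, p, q, r):
    """Standard merge of the sorted subarrays A[p..q] and A[q+1..r]."""
    L, R = A[p:q + 1], A[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        if j >= len(R) or (i < len(L) and L[i] <= R[j]):
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

def merge_sort_with_cutoff(A, p, r, k=16):
    """Merge sort that switches to insertion sort once a subproblem
    has at most k elements (assumes k >= 1). The default k = 16 is a
    placeholder; the right value depends on the machine."""
    if r - p + 1 <= k:
        insertion_sort(A, p, r)
    else:
        q = (p + r) // 2
        merge_sort_with_cutoff(A, p, q, k)
        merge_sort_with_cutoff(A, q + 1, r, k)
        merge(A, p, q, r)

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort_with_cutoff(data, 0, len(data) - 1, k=3)
print(data)  # [1, 2, 2, 3, 4, 5, 6, 7]

Timing merge_sort_with_cutoff for several values of k on representative inputs would reveal a sensible cutoff for a given machine.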

Problem 2-2 - Correctness of bubblesort

Bubblesort is a popular, but inefficient, sorting algorithm. It works by repeatedly swapping adjacent elements that are out of order.

BUBBLESORT(A)

1  for i = 1 to A.length - 1
2      for j = A.length downto i + 1
3            if A[j] < A[j - 1]
4                 exchange A[j] with A[j-1]

(a) Let A' denote the output of BUBBLESORT(A). To prove that BUBBLESORT is correct, we need to prove that it terminates and that

A'[1] \leq A'[2] \leq ... \leq A'[n]

where n = A.length. In order to show that BUBBLESORT actually sorts, what else do we need to prove?

We also need to prove that A' consists of exactly the same elements as A, i.e. that A' is a permutation of the original array.

The next two parts will prove inequality (2.3).

(b) State precisely a loop invariant for the for loop in lines 2–4, and prove that this loop invariant holds. Your proof should use the structure of the loop invariant proof presented in this chapter.

Loop invariant: At the start of each iteration of the for loop of lines 2–4, the position of the smallest element of the subarray A[i..n] is at most j.

Initialization: Before the first iteration, j = A.length = n, and the smallest element of A[i..n] lies somewhere in positions i through n, so its position is at most j.

Maintenance: At the start of the iteration, the smallest element of A[i..n] is at some position at most j. If it is at position j, then A[j] \leq A[j - 1], so either line 4 moves it to position j - 1 or an equal (hence also smallest) element already occupies position j - 1. If it is at a position at most j - 1, it cannot be moved to position j: the exchange affects only positions j - 1 and j, and moving the element at j - 1 to position j would require A[j] < A[j - 1], contradicting that A[j - 1] is smallest. Hence the invariant holds when the next iteration begins with index j - 1.

Termination: The for loop terminates when j = i. By the invariant, the smallest element of A[i..n] has position at most i; since it lies in A[i..n], it is exactly at position i.

(c) Using the termination condition of the loop invariant proved in part (b), state a loop invariant for the for loop in lines 1–4 that will allow you to prove inequality (2.3). Your proof should use the structure of the loop invariant proof presented in this chapter.

Loop invariant: At the start of each iteration of the for loop of lines 1–4, the subarray A[1..i-1] consists of the i - 1 smallest elements of A, in sorted order.

Initialization: Before the first iteration, i = 1, so A[1..0] is empty and the invariant holds trivially.

Maintenance: By part (b), when the inner loop terminates, the smallest element of A[i..n] is at position i. Since A[1..i-1] already holds the i - 1 smallest elements in sorted order, A[i] is the i-th smallest element, so A[1..i] holds the i smallest elements in sorted order. Incrementing i preserves the invariant.

Termination: The loop terminates when i = n. By the invariant, A[1..n-1] consists of the n - 1 smallest elements of A in sorted order, so the remaining element A[n] is the largest. Hence A[1..n] is sorted, which proves inequality (2.3).

(d) What is the worst-case running time of bubblesort? How does it compare to the running time of insertion sort?

The worst-case running time of bubblesort is \Theta(n^2), the same as insertion sort's worst case. However, this version of bubblesort performs every comparison in line 3 regardless of the input order, so its best-case running time is also \Theta(n^2), whereas insertion sort runs in \Theta(n) on an already-sorted input.
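For reference, here is a direct Python translation of the BUBBLESORT pseudocode above, using 0-based indices; this is a sketch added for experimentation, not part of the original problem. Because the inner loop always runs in full, the number of comparisons does not depend on the input order.

def bubblesort(A):
    """Direct translation of BUBBLESORT(A) to 0-based indices."""
    n = len(A)
    for i in range(n - 1):              # line 1: i = 1 to A.length - 1
        for j in range(n - 1, i, -1):   # line 2: j = A.length downto i + 1
            if A[j] < A[j - 1]:         # line 3
                A[j], A[j - 1] = A[j - 1], A[j]  # line 4: exchange
    return A

print(bubblesort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]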

Problem 2-3 - Correctness of Horner’s rule

The following code fragment implements Horner’s rule for evaluating a polynomial

P(x) = \sum_{k=0}^{n} a_k x^k = a_0 + x(a_1 + x(a_2 + ... +x(a_{n-1} + xa_n)...))

given the coefficients a_0, a_1, ... a_n and a value for x:

1  y = 0
2  for i = n downto 0
3      y = a_i + x * y

(a) In terms of \Theta-notation, what is the running time of this code fragment for Horner’s rule?

Running time is \Theta(n), since the loop body performs a constant amount of work and executes n + 1 times.

(b) Write pseudocode to implement the naive polynomial-evaluation algorithm that computes each term of the polynomial from scratch. What is the running time of this algorithm? How does it compare to Horner’s rule?

1   y = 0
2   for k = 0 to n
3       p = 1
4       for i = 1 to k
5           p = p * x
6       y = y + a_k * p

Running time is \Theta(n^2): computing x^k from scratch takes \Theta(k) multiplications, and \sum_{k=0}^{n} k = \Theta(n^2). Horner's rule is therefore a factor of \Theta(n) faster.
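As an informal sanity check (this code is not from the textbook; horner and naive_eval are names chosen here for illustration), the two strategies can be compared directly in Python:

def horner(coeffs, x):
    """Evaluates P(x) = sum(a_k * x^k) with Horner's rule; coeffs[k] holds a_k."""
    y = 0
    for a in reversed(coeffs):      # i = n downto 0
        y = a + x * y
    return y

def naive_eval(coeffs, x):
    """Naive evaluation: computes each power x^k from scratch."""
    y = 0
    for k, a in enumerate(coeffs):
        power = 1
        for _ in range(k):          # Theta(k) multiplications for the k-th term
            power *= x
        y += a * power
    return y

coeffs = [1, -2, 0, 3]              # P(x) = 1 - 2x + 3x^3
print(horner(coeffs, 2), naive_eval(coeffs, 2))  # both print 21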

(c) Consider the following loop invariant:

At the start of each iteration of the for loop of lines 2–3,

y = \sum_{k=0}^{n-(i+1)} a_{k+i+1} x^k

Interpret a summation with no terms as equaling 0. Following the structure of the loop invariant proof presented in this chapter, use this loop invariant to show that, at termination, y = \sum_{k=0}^{n} a_k x^k

Initialization: Before the first iteration, i = n, so the upper limit of the summation is n - (n + 1) = -1. A summation with no terms equals 0, which matches the initial value y = 0.

Maintenance: Suppose the invariant holds at the start of the iteration for index i. After line 3 executes,

y = a_i + x \sum_{k=0}^{n-(i+1)} a_{k+i+1} x^k = a_i + x \sum_{k=1}^{n-i} a_{k+i} x^{k-1} = \sum_{k=0}^{n-i} a_{k+i} x^k

which is the invariant with i replaced by i - 1, so it holds at the start of the next iteration.

Termination: The loop terminates with i = -1 (after the iteration for i = 0). Substituting i = -1 into the invariant gives y = \sum_{k=0}^{n} a_k x^k, the value of the polynomial.

Problem 2-4 - Inversions

Let A[1..n] be an array of n distinct numbers. If i < j and A[i] > A[j], then the pair (i, j) is called an inversion of A.

(a) List the five inversions of the array \langle 2, 3, 8, 6, 1 \rangle

Inversions - (1, 5), (2, 5), (3, 4), (3, 5), (4, 5)

(b) What array with elements from the set {1, 2, ... n} has the most inversions? How many does it have?

The array \langle n, n-1, \ldots, 2, 1 \rangle has the most inversions: every pair (i, j) with i < j is an inversion, and there are \frac{n(n-1)}{2} such pairs.

(c) What is the relationship between the running time of insertion sort and the number of inversions in the input array? Justify your answer.

INSERTION-SORT(A)

1  for j = 2 to A.length
2      key = A[j]
3      // Insert A[j] into the sorted sequence A[1..j-1]
4      i = j -1 
5      while i > 0 and A[i] > key
6          A[i+1] = A[i]
7          i = i - 1
8      A[i+1] = key

When A[j] is inserted, the body of the while loop in line 5 executes once for each element of A[1..j-1] that is greater than A[j]; since A[1..j-1] holds exactly the original first j - 1 elements (now in sorted order), this is the number of inversions whose second coordinate is j. Summing over j, line 6 executes exactly d times over the whole sort, where d is the number of inversions in the input. Hence insertion sort runs in \Theta(n + d) time, i.e. its running time grows with the number of inversions.
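To make this concrete, here is a small experiment (added here, not part of the textbook's solution): an insertion sort instrumented to count how many times line 6 executes, compared against a brute-force inversion count.

def insertion_sort_shifts(A):
    """Insertion sort that counts how many times line 6 (the shift) executes."""
    shifts = 0
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]     # each shift corresponds to exactly one inversion
            shifts += 1
            i -= 1
        A[i + 1] = key
    return shifts

def inversions_brute_force(A):
    """Counts inversions by checking every pair; Theta(n^2), for comparison only."""
    n = len(A)
    return sum(1 for i in range(n) for j in range(i + 1, n) if A[i] > A[j])

data = [31, 41, 59, 26, 41, 58]
print(inversions_brute_force(data))        # 5
print(insertion_sort_shifts(list(data)))   # also 5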

(d) Give an algorithm that determines the number of inversions in any permutation on n elements in \Theta(n log_2n) worst-case time. (Hint: Modify merge sort.)

Pseudocode

MERGE_INVERSIONS(A, p, q, r)

1   n1 = q - p + 1
2   n2 = r - q
3   inversions = 0
4   let L[1..n1+1] and R[1..n2+1] be new arrays
5   for i = 1 to n1
6       L[i] = A[p+i-1]
7   for j = 1 to n2
8       R[j] = A[q+j]
9   L[n1 + 1] = Inf
10  R[n2 + 1] = Inf
11  i = 1
12  j = 1
13  for k = p to r
14      if L[i] <= R[j]
15          A[k] = L[i]
16          i = i + 1
17      else
18          A[k] = R[j]
19          j = j + 1
20          inversions = inversions + n1 - i + 1
21  return inversions

COUNT-INVERSIONS(A,p,r)

1   if p < r
2       q = \lfloor (p + r) / 2 \rfloor
3       left-inversions = COUNT-INVERSIONS(A, p, q)
4       right-inversions = COUNT-INVERSIONS(A, q+1, r)
5       inversions = MERGE_INVERSIONS(A, p, q, r) + left-inversions + right-inversions
6       return inversions
7   return 0

Python code

import math

def merge_inversions(A, p, q, r):
    """Counts the number of inversions while merging two subarrays

    Args:
        A: Array in which inversions needs to be counted
        p: Index of start of first subarray
        q: Index of end of first subarray
        r: Index of end of second subarray
    """
    inversions = 0
    n1 = q - p + 1 # length of subarray A[p..q]
    n2 = r - q     # length of subarray A[q+1..r]

    # Initialize arrays L and R
    L = [A[p + i]     for i in range(0, n1)]
    L.append(math.inf)

    R = [A[q + j + 1] for j in range(0, n2)]
    R.append(math.inf)

    # Construct A[p..r] from L and R in sorted order
    i, j = 0, 0
    for k in range(p, r + 1):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1
            inversions += n1 - i
    return inversions

def count_inversions(A):
    """Counts the number of inversions in array `A`

    Args:
        A: Array of numbers

    Returns:
        Number of inversions in `A`
    """

    # Recursively callable method
    def count_inversions_r(A, p, r):
        if p < r:
            q = (p + r) // 2 
            left_inversions = count_inversions_r(A, p    , q) 
            right_inversions = count_inversions_r(A, q + 1, r)
            inversions = merge_inversions(A, p, q, r) + left_inversions + right_inversions
            return inversions
        return 0

    inversions = count_inversions_r(A, 0, len(A) - 1)
    return inversions

tests = [
    [31, 41, 59, 26, 41, 58],
    [],
    [40, 20],
    [1],
    [10, 9, 8, 7, 6, 5],
    [7, 8, 9, 10]
]

for input_array in tests:
    print(f'Input: {input_array}')
    inversions = count_inversions(input_array)
    print(f'Inversions: {inversions}\n')
Input: [31, 41, 59, 26, 41, 58]
Inversions: 5

Input: []
Inversions: 0

Input: [40, 20]
Inversions: 1

Input: [1]
Inversions: 0

Input: [10, 9, 8, 7, 6, 5]
Inversions: 15

Input: [7, 8, 9, 10]
Inversions: 0