RECITATION NOTES 15-451 Algorithms 09/19/07
updated 02/07/08
- hand back hwk
- problem from last year's final exam
- additional topic chosen by TA
=======================================================================
1. hand back hwk.
2. Here is a problem from last year's final exam, related to the
recent material. [reworded a bit for this recitation notes]
Suppose we want to design a binary search tree over {1,...,n} to
minimize the worst-case time to look up any element. Then clearly a
balanced tree is best. But, what if we are allowed to use a
randomized construction (equivalently, a probability distribution over
search trees). Perhaps we can do better in terms of minimizing the
worst-case *expected* time of any lookup.
In this problem, we consider the case of n=3.
Over {1,2,3}, there are only five trees:
(a) the balanced tree with 2 at the root, (b) the zig-zag tree with 1
at the root, then 3, then 2, and (c) the zag-zig tree with 3, then 1,
then 2. (d) 1-2-3 (e) 3-2-1.
a) Fill in the entries for a 5x3 matrix game corresponding to this
problem. We have one row for each tree, one column for each possible
element to be looked up, and the cells in the matrix should indicate
the cost to perform the given access in the given tree (assume the
cost of an access is the depth of the element in the tree, with the
root having depth 1).
b) Before getting to the main question, what are tight *deterministic*
upper and lower bounds?
Ans: 2. The balanced tree has depth at most 2 and there's no tree
with depth 1. From the point of view of the matrix, we
are saying that there is some row in which all entries are
at most 2 (upper bound of 2), and for every row there is some
entry that is at least 2 (lower bound of 2).
c) Now for the main question: what are tight *randomized* upper and
lower bounds?
To find these, we must solve for all Nash Equilibriums (NE) of the game
we are playing. Using a strategy that is not a NE means that we could change
strategies and improve our outcome.
Let's proceed by assigning probabilities to our different strategies.
We simplify things by noticing the neccessary symmetric distribution.
Let p = probability of zig-zag = probability of zag-zig
Let q = probability of zig-zig = probability of zag-zag
Then 1-2p-2q is the probability of the balanced tree.
Let's consider the expected cost of searching for 1,2,and 3.
After simplifying, we get:
EC_1 = 2-p
EC_2 = 1+4p+2q
and of course EC_3 = EC_1
If p=0, then EC_1 = 2, and our strategy is no better than the determistic. So p>0.
Nash tells us that in equilibrium, EC_1 = EC_2. This makes sense, because otherwise
our adversary would be feeding us the same number repeatadly, probably taking advantage of us.
Simplifying EC_1 = EC_2, we get:
1 = 5p + 2q. Since we want to minimize EC_1, we need to make p as large as possible.
To achieve that, q=0. We then solve that p = 1/5, and so 1-2p-2q = 3/5.
You can compute EC_1 = 9/5, saving a 1/5 of a lookup.
Actually, our argument shows that this is optimal, because no other NE could exist (for this game)
so this implies this is also a lower bound (otherwise it wouldn't be optimal!)
but if we want we can also find an optimal strategy
for the column player. In other words, what distribution on inputs is
the worst one for any tree algorithm?
In this case, it turns out we can assume the probability on 1 and 3
are equal. So, let's call it q, and have the probability on 2 be
1-2q. We now want to maximize the minimum of:
average cost on balanced tree: 4q + (1-2q)
average cost on zig-zag tree: 3q + 3(1-2q)
average cost on zag-zig tree: 3q + 3(1-2q)
So, we want to maximize the minimum of 1 + 2q and 3 - 3q.
Again, the minimum is maximized when both are equal, so 5q=2, q=2/5.
So, the worst distribution for the inputs is 2/5 prob on 1, 2/5 prob
on 3, and 1/5 prob on 2.
========================================
3. [insert any additional material here. E.g., could go over mini]