RECITATION NOTES 15-451 Algorithms 09/19/07 updated 02/07/08 - hand back hwk - problem from last year's final exam - additional topic chosen by TA ======================================================================= 1. hand back hwk. 2. Here is a problem from last year's final exam, related to the recent material. [reworded a bit for this recitation notes] Suppose we want to design a binary search tree over {1,...,n} to minimize the worst-case time to look up any element. Then clearly a balanced tree is best. But, what if we are allowed to use a randomized construction (equivalently, a probability distribution over search trees). Perhaps we can do better in terms of minimizing the worst-case *expected* time of any lookup. In this problem, we consider the case of n=3. Over {1,2,3}, there are only five trees: (a) the balanced tree with 2 at the root, (b) the zig-zag tree with 1 at the root, then 3, then 2, and (c) the zag-zig tree with 3, then 1, then 2. (d) 1-2-3 (e) 3-2-1. a) Fill in the entries for a 5x3 matrix game corresponding to this problem. We have one row for each tree, one column for each possible element to be looked up, and the cells in the matrix should indicate the cost to perform the given access in the given tree (assume the cost of an access is the depth of the element in the tree, with the root having depth 1). b) Before getting to the main question, what are tight *deterministic* upper and lower bounds? Ans: 2. The balanced tree has depth at most 2 and there's no tree with depth 1. From the point of view of the matrix, we are saying that there is some row in which all entries are at most 2 (upper bound of 2), and for every row there is some entry that is at least 2 (lower bound of 2). c) Now for the main question: what are tight *randomized* upper and lower bounds? To find these, we must solve for all Nash Equilibriums (NE) of the game we are playing. Using a strategy that is not a NE means that we could change strategies and improve our outcome. Let's proceed by assigning probabilities to our different strategies. We simplify things by noticing the neccessary symmetric distribution. Let p = probability of zig-zag = probability of zag-zig Let q = probability of zig-zig = probability of zag-zag Then 1-2p-2q is the probability of the balanced tree. Let's consider the expected cost of searching for 1,2,and 3. After simplifying, we get: EC_1 = 2-p EC_2 = 1+4p+2q and of course EC_3 = EC_1 If p=0, then EC_1 = 2, and our strategy is no better than the determistic. So p>0. Nash tells us that in equilibrium, EC_1 = EC_2. This makes sense, because otherwise our adversary would be feeding us the same number repeatadly, probably taking advantage of us. Simplifying EC_1 = EC_2, we get: 1 = 5p + 2q. Since we want to minimize EC_1, we need to make p as large as possible. To achieve that, q=0. We then solve that p = 1/5, and so 1-2p-2q = 3/5. You can compute EC_1 = 9/5, saving a 1/5 of a lookup. Actually, our argument shows that this is optimal, because no other NE could exist (for this game) so this implies this is also a lower bound (otherwise it wouldn't be optimal!) but if we want we can also find an optimal strategy for the column player. In other words, what distribution on inputs is the worst one for any tree algorithm? In this case, it turns out we can assume the probability on 1 and 3 are equal. So, let's call it q, and have the probability on 2 be 1-2q. We now want to maximize the minimum of: average cost on balanced tree: 4q + (1-2q) average cost on zig-zag tree: 3q + 3(1-2q) average cost on zag-zig tree: 3q + 3(1-2q) So, we want to maximize the minimum of 1 + 2q and 3 - 3q. Again, the minimum is maximized when both are equal, so 5q=2, q=2/5. So, the worst distribution for the inputs is 2/5 prob on 1, 2/5 prob on 3, and 1/5 prob on 2. ======================================== 3. [insert any additional material here. E.g., could go over mini]