Homework Assignment 2

Dynamic Programming

15-211 Summer 2010

Due Thursday, July 15th, 2010 at 11:59:59pm

Overview

This assignment is designed to ensure that you know how to use dynamic programming to solve recursive problems. You will solve the Edit Distance problem, which has wide applications in computing (e.g. computational linguistics, spell checking).

Objectives

Theory Questions: 20 points

In the file theory.pdf there are some short theory questions. Please answer the questions in this file and turn it in class.

Starter Code

Download the starter code for assignment (zip file). Once you finish implementation, FTP java files to /afs/andrew.cmu.edu/course/15/211-kesden/handin

What to Submit

You FTP the following java files

Do not submit anything else like zip files, folders, desktops, class files. Make sure you do not use packages in your java files as well. Failure to do so will result in a 20 point penalty.

Edit Distance: 45 points

The edit distance between two strings is defined as the minimum number of edits required to transform one string into another. An edit is one of three operations: delete (a character from the original string), insert, and replace. There is a fourth operation, match, which does not count as an edit. Here is an example. The two input strings are "activate" and "caveat". Here is one possible transformation. For the purposes of this assignment, a transformation is represented by a string consisting of the characters D, M, R and I.

D
M
D
R
M
I
M
M
D
a
c
t
i
v
a
t
e
c
a
v
e
a
t

There are two other possible transformations, DMRDMIMMD and DMRRRMMD. The edit cost between these two strings is 5, because each transformation requires 5 operations (matches are not counted).

You will write all your code inside the class EditDistance in the file EditDistance.java.

Part 1 (20 pts): Getting Edit Cost

Write the following method:

public static int getEditCost(String s1, String s2);

It should return the minimum number of edits required to transform s1 into s2 (or vice versa; the number is the same either way). Note that the input strings may be empty, and your code should handle this case. You can trust that the input strings are both non-null, so don't worry about handling that case.

Part 2 (25 pts): Getting the Transformation

Write the following method:

public static String getTransformation(String s1, String s2);

It should return any minimum-cost transformation (represented by a String containing only the characters D, M, R and I that transforms s1 into s2 (here, order is important). Once again, the input strings may be empty, but they will be non-null. Note that although there may be multiple transformations from s1 to s2, your code can return any of them and it will be considered correct.

Extra Credit Part(10 pts): Getting All Transformations

Write the following method:

public static Set getAllTransformations(String s1, String s2);

It should return a Set containing Strings that represent all possible minimum-cost transformations that transform s1 into s2 (order is important). The input strings may be empty but will be non-null.

Longest Decreasing Subsequence (LDS): 25 points

Given an array of ints, we want to find the length of the longest decreasing subsequence. Example

11, 0, 9, 15, 15, 8, 6, 9, 2, 14

The longest decreasing subsequence is shown in bold; in this example, it has length 5. Note that the subsequence does not have to be contiguous. It is also strictly decreasing; i.e. for some element in the LDS, the next element in the LDS must be strictly less than it, as opposed to less than or equal to.

You will write all your code in the class LDS in the LDS.java file.

Write the following method:

public static int lds(int[] seq);

It returns the length of the LDS in the sequence represented by seq, an array of ints. Note that seq may be empty, but it will be non-null.

Coding Style: 10 points

Your assignment will be graded first by compiling and testing it. After that, we will read your code to determine whether appropriate data structures and algorithms were used, as well as judge the overall style of your code. Programs will be graded both on correctness as well as their ability to deal with so-called edge cases, like empty lists, lists with only one element, and so on. Points will be deducted for poor design decisions and un-commented, or un-readable code as determined by your TA.

General Advice

Important: You may not import anything other than what is already imported for you!

For any dynamic programming algorithm, think before you code! The nature of DP requires you to know exactly how the table is initialized and constructed before you can write any code. Plan out each algorithm on paper.

Edit distance:

LDS:

We realize that testing your code may be difficult with this assignment, since you have no way to verify the correctness of your results other than by inspection. This is why it is extremely important to plan your algorithm - if you prove that it is mathematically correct (which is easy to do with DP) then as long as you implement it properly, there should be no problem. Trial and error does not work when writing a DP algorithm.