Carnegie Mellon University
Machine Learning for Problem Solving
95-828 - Spring 2017

Assignments

ASSIGNMENTS ARE DUE AT THE BEGINNING OF LECTURE ON THE DUE DATE


COURSEWORK:

Coursework consists of (grading weight in parentheses):
  • Homework (40%)
  • Midterm exam (15%)
  • Final exam (20%)
  • Project (25%)
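
As a quick illustration of how these weights combine, here is a minimal sketch. It assumes the final grade is a plain weighted average of component scores on a 0-100 scale; the page itself does not state the exact formula, and the scores below are hypothetical.

```python
# Component weights in percent, as listed above.
WEIGHTS = {
    "homework": 40,
    "midterm": 15,
    "final": 20,
    "project": 25,
}


def course_grade(scores):
    """Weighted average of component scores, each on a 0-100 scale.

    Assumption: a simple weighted average; the course page does not
    spell out the actual grading formula.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example_scores = {"homework": 90, "midterm": 80, "final": 85, "project": 95}
example_grade = course_grade(example_scores)
print(example_grade)  # 88.75
```

The weights sum to 100, so a perfect score in every component yields 100.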

NOTE: All assignments (except projects) are to be done individually. Please see the Collaboration policy.

IMPORTANT DATES:

Assignment | Note | Out | Due | Weight
--- | --- | --- | --- | ---
Homework 1 | Exploratory Data Analysis, Linear Models | Jan 26 | Feb 21 | 10%
Homework 2 | Decision Trees, SVM, and Kernels | Feb 21 | Mar 13 | 10%
Homework 3 | Ensemble learning, Instance-based learning, Clustering | Mar 14 | Apr 11 | 10%
Homework 4 | Semi-supervised learning, Text, Time series, Networks | Apr 11 | May 4 | 10%
Midterm Exam | (in class) | Mar 9 | -- | 15%
Final Exam | | Thursday, May 11, 9:30am - 12:30pm, HBH A301 | -- | 20%
Project proposal | List of [datasets] [more project ideas] | -- | Feb 23 | 1%
Midway report | | -- | Apr 11 | 7%
Project presentation | (in class) | -- | May 2 & 4 | 7%
Project final writeup | | -- | May 2 & 4 | 10%

HOMEWORK:

Homework should be turned in at the beginning of class on the day it is due. If you are taking late day(s), please email your homework to the TA and also submit a hard copy at the next class. Note the number of late days you used at the top of the first page of your homework.

We ask that you submit all code used to complete the assignment electronically via Blackboard (no printouts).


EXAMS:

There will be a midterm and a final exam. Note: Both the midterm and the final will be open book, notes, papers, etc., but you are not allowed to use a computer. The tentative dates are posted above; the finalized dates will be announced during the semester.


PROJECTS:

Your class project is an opportunity for you to explore an interesting machine learning problem of your choice in the context of a real-world data set. Below, you will find some project ideas (to be posted during the semester). Your class project must be about new work you have done this semester; you cannot use results you developed in previous semesters.

Projects can be done by you as an individual, or in teams of two students. The course TA will consult with you on your ideas, but of course the final responsibility to define and execute an interesting piece of work is yours.

Your project will be worth 25% of your final class grade, broken into four main deliverables:

  • Project proposal (1% of the course grade)
  • Project milestone report (7% of the course grade) (** 4 pages maximum **, including references) describing the results of your first experiments by the milestone due date (see above). Note that, as with any conference, the page limits are strict. Papers over the limit will not be considered.
  • Final project writeup (10% of the course grade), preferably in ACM format (** 8 pages maximum, 4 pages minimum **, including references; page limit is strict)
  • Final project presentation (in class) (7% of the course grade)

Project Proposal:

You must turn in a brief project proposal (** 1 page maximum **) on the due date (see above), in class. A list of suggested projects and data sets is posted below. Read the list carefully. You are encouraged to use one of the suggested data sets, because we know that they have been successfully used for machine learning in the past. If you prefer to use a different data set, we will consider your proposal, but you must already have access to the data and present a clear proposal for what you would do with it.

Project proposal format: Proposals should be 1 page maximum. Include the following information:
  • Your name and Andrew ID on top of the page
  • Project title
  • Data set
  • Project idea. This should be approximately two paragraphs.
  • Papers to read. Include 1-3 relevant papers. You will probably want to read at least one of them before submitting your proposal.
  • Teammate: Will you have a teammate? If so, whom? Maximum team size is two students.
  • What will you complete by the project milestone due date? Experimental results of some kind are expected here.

Project Writeups:


Your write-ups should include the information detailed below, in approximately the order given. Your write-up need not have corresponding sections or bullet points, but course staff should be able to find the information without searching too hard. Be as precise/specific as you can.

Note: The mid-way report will be a relatively incomplete version of the final write up. It should include similar sections and address similar questions, but need not contain all the details. Think of the mid-way report as a preliminary version of the final draft. It is more of a status report, including preliminary results, issues that you are facing in developing your project, and how you plan to modify your approach to tackle some of those issues moving forward.
  • Introduction/Motivation/Problem Definition (15%)
    • What is it that you are trying to solve/achieve? Who cares and why does it matter?
    • Identify, define, and motivate the problem that you are addressing.
    • How (precisely) will a machine learning solution address the problem?

  • Data Understanding and Preparation (15%)
    • Identify and describe the data (and data sources) that will support machine learning to address the problem.
    • Include various aspects of the data such as its size (GB/TB/etc), type(s), format, etc.
    • Specify how these data are integrated to produce the format required for machine learning.

  • Methodology (30%)
    This is where you give a detailed description of your primary contributions. It is especially important that this part be clear and well written so that we can fully understand what you did.
    • How did you approach the problem? What challenges did you face? In what (unique) ways did you handle those challenges?
    • Specify the type of model(s) built and/or information/knowledge extracted.
    • Discuss choices for machine learning algorithm: what are other alternatives, and what are their pros and cons (in the context of the problem and as compared to your proposed solution)?
    • Discuss why and how this model should "solve" the problem (i.e., improve along some dimension of interest).

  • Evaluation and Results (30%)
    We are interested in seeing a clear and conclusive set of experiments which successfully evaluate the problem you set out to solve. Make sure to interpret the results and talk about what we can conclude and learn from your approach.
    • How do you evaluate your machine learning solution to the specific question(s) you have addressed?
    • What do these evaluation methods tell you about your solution?
    It is not so important how well your method performs but rather, (a) how thorough and careful your evaluation is, and (b) how interesting and clever your results and findings are.

  • Style and writing (10%)
    Overall writing, grammar, organization, figures and illustrations.
We suggest using the ACM format for your project reports (8 pages maximum, 4 pages minimum, including references; this page limit is strict).

Project Presentations:

  • Think of this as an oral version of your final project writeup.
  • Present your work in a meaningful and interesting flow (e.g., motivation, problem definition, data description, challenges, proposed methods, results and their interpretation).
  • Make sure to include enough details and background of your methodology (similar to a conference talk).
  • See here and here for some how-to on giving a good/bad talk.
  • Be prepared to ask (tough) questions of other project groups.
  • We will spend (the last) 2 lectures on project talks. Depending on the number of project groups, each group will be given 5-8 minutes including questions.

Datasets for Project:

We provide a long list of potential data sources for your project right here. The project is open-ended and you are expected to come up with your own project description and problem definition. In addition to your technical approach, we will evaluate your creativity in formulating an interesting and important problem for the project.



Last modified by Leman Akoglu, Mar 2017