Carnegie Mellon University
95-828 Machine Learning for Problem Solving
Spring 2022


CLASS MEETS:

There are two sections of the course offered in Spring 2022.
Time:
  • Section A: Tue & Thu 10:10AM - 11:30AM
  • Section B: Tue & Thu 1:25PM - 2:45PM
Place: Both sections ONLINE @Zoom until Jan 30th. (Calendar invitations with link sent individually)

WEEKLY RECITATION:

Time: Fri 10:10AM - 11:30AM
Place: ONLINE @Zoom (See link on Canvas)


PEOPLE:

Instructor: Leman Akoglu
  • Online office hour: FRI 9-10PM EDT; also, by appointment
  • Email: invert (cs.cmu.edu @ lakoglu)

Teaching Assistants:

Luping Fang
  • Office hour: WED 7-8PM EDT
  • Email: invert (andrew.cmu.edu @ lupingf)
Lingxiao Zhao
  • Office hour: SAT 10-11AM EDT
  • Email: invert (andrew.cmu.edu @ lingxia1)

Please find all the Zoom links to office hours on Canvas.

Graders:

Sayali Moghe
  • Email: invert (andrew.cmu.edu @ smoghe)
  • Office hours: by appointment
Nikhil Gupta
  • Email: invert (andrew.cmu.edu @ nikhilgu)
  • Office hours: by appointment


Zihan Wang
  • Email: invert (andrew.cmu.edu @ zw2)
  • Office hours: by appointment
Shruti Nair
  • Email: invert (andrew.cmu.edu @ shrutina)
  • Office hours: by appointment


COURSE DESCRIPTION:


Machine Learning (ML) is centered around automated methods that improve their own performance by learning patterns in data and then using the uncovered patterns to predict the future and make decisions. ML is heavily used in a wide variety of domains, such as business, finance, healthcare, and security, for problems including display advertising, fraud detection, disease diagnosis and treatment, face/speech recognition, and automated navigation, to name a few.

"If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem
and 5 minutes thinking about solutions." -- Albert Einstein
"A problem well put is half solved." -- John Dewey

This course is designed to give graduate-level students a thorough grounding in the methodologies, technologies, and best practices used in machine learning. The main aim of the course is to equip students with an intuitive understanding of machine learning concepts grounded in real-world applications. The course helps students gain the practical knowledge and experience necessary for recognizing and formulating machine learning problems in the wild, as well as for applying machine learning techniques effectively in practice. The emphasis will be on learning and practicing the machine learning process, more than on learning theory.

"All models are wrong, but some models are useful." -- George Box

As there exists no universally best model, we will cover a wide range of models and learning algorithms with varying speed-accuracy-interpretability tradeoffs. In particular, the topics include supervised learning (linear models, decision trees, ensemble methods, kernel methods, and nonparametric learning) and unsupervised learning (density estimation, clustering, and dimensionality reduction). The class will include biweekly homework assignments, each containing a mini-project (i.e., a problem-solving assignment that involves programming) in addition to other conceptual and technical questions, as well as a midterm, a final exam, and a case study at the end of the course. The case study will give students a chance to dig into a substantial problem using a large dataset and apply the machine learning tools they have learned throughout the course.

Prerequisites

This course does not assume any prior exposure to machine learning theory or practice. Students are expected to have the following background:
  • Basic knowledge of probability
  • Basic knowledge of linear algebra
  • Basic programming skills
  • Familiarity with Python programming and basic use of NumPy, pandas, and matplotlib

Learning Objectives

By the end of this class, students will
  • learn the main concepts, methodologies, and tools for machine learning
  • be able to recognize machine learning tasks in real-world problems
  • develop the critical thinking skills needed to compare and contrast models for a given task
  • learn the best practices for reliably performing model selection and evaluation
  • gain experience implementing ML solutions in Python and applying them to various real-world datasets

BULLETIN BOARD and other info

  • For course materials, assignments, announcements, and grades, please see Canvas.
  • To submit homework electronically, you will use Gradescope.
  • For questions and discussions, please use Piazza. Here is the link to sign up.
  • Carnegie Mellon 2021-2022 official academic calendar.

TEXTBOOK:

There is no required textbook for the course. I will post course notes and slides for each lecture as well as some code examples (Jupyter notebooks) on Canvas. See Resources for a list of recommended books that could help supplement your understanding of the course material.

MISC - FUN:

  • Fake (ML) protest @G20 Summit (2009)
  • ML demonstration @PittMarathon (2019)