Carnegie Mellon University
95-828 Machine Learning for Problem Solving
Spring 2024


CLASS MEETS:

There are two sections of the course offered in Spring 2024.
Time:
  • Section A: Tue & Thu 11:00AM - 12:20PM
  • Section B: Tue & Thu 2:00PM - 3:20PM
Place: Both sections in HBH A301. Link to Zoom (optional) on Canvas

WEEKLY RECITATION:

Time: Fri 2:00PM - 3:20PM
Place: HBH A301 (Also see Zoom link on Canvas)


PEOPLE:

Instructor: Leman Akoglu
  • Office hour (Mini A3): THU 12:50PM - 1:50PM (starts Jan 23)
  • Office: HBH 2118C, office ph. 412-268-30 four three
  • Email: invert (andrew.cmu.edu @ lakoglu)

Teaching Assistants:

Xueying Ding
  • Office hour: MON 11:30AM to 12:30PM EDT
  • Email: invert (andrew.cmu.edu @ xding2)
Xiaobin Shen
  • Office hour: THU 10-11AM EDT
  • Email: invert (andrew.cmu.edu @ xiaobins)


Zijun Ding
  • Office hour: MON 5-6 PM EDT
  • Email: invert (andrew.cmu.edu @ zijund)

Zijun and Xiaobin have reserved HBH 2108 to hold their in-person OHs.
Xueying's OHs will be on Zoom/online (please find link on Canvas).


Graders:

Yanjun Chen
  • Email: invert (andrew.cmu.edu @ yanjunch)
  • Office hours: by appointment
Nachiketa Hebbar
  • Email: invert (andrew.cmu.edu @ nhebbar)
  • Office hours: by appointment
Longyang Xu
  • Email: invert (andrew.cmu.edu @ longyanx)
  • Office hours: by appointment


COURSE DESCRIPTION:


Machine Learning (ML) is centered around automated methods that improve their own performance by learning patterns in data and then using the uncovered patterns to predict the future and make decisions. ML is heavily used in a wide variety of domains, such as business, finance, healthcare, and security, for problems including display advertising, fraud detection, disease diagnosis and treatment, face/speech recognition, and automated navigation, to name a few.

"If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem
and 5 minutes thinking about solutions." -- Albert Einstein
"A problem well put is half solved." -- John Dewey

This course is designed to give graduate-level students a thorough grounding in the methodologies, technologies, and best practices used in machine learning. Its main premise is to equip students with an intuitive understanding of machine learning concepts grounded in real-world applications. The course aims to help students gain the practical knowledge and experience necessary for recognizing and formulating machine learning problems in the wild, as well as for applying machine learning techniques effectively in practice. The emphasis will be on learning and practicing the machine learning process, more than on learning theory.

"All models are wrong, but some models are useful." -- George Box

As there exists no universally best model, we will cover a wide range of models and learning algorithms, which have varying speed-accuracy-interpretability tradeoffs. In particular, the topics include supervised learning (linear models, decision trees, ensemble methods, kernel methods, and nonparametric learning) and unsupervised learning (density estimation, clustering, and dimensionality reduction). The class will include biweekly homework assignments, each containing a mini-project (i.e., a problem-solving assignment that involves programming) in addition to other conceptual and technical questions, as well as a midterm, a final exam, and a case study at the end of the course. The case study will give students a chance to dig into a substantial problem using a large dataset and apply the machine learning tools they have learned throughout the course.

Prerequisites

This course does not assume any prior exposure to machine learning theory or practice. Students are expected to have the following background:
  • Basic knowledge of probability
  • Basic knowledge of linear algebra
  • Basic programming skills
  • Familiarity with Python programming and basic use of NumPy, pandas and matplotlib

Learning Objectives

By the end of this class, students will
  • learn the main concepts, methodologies, and tools for machine learning
  • be able to recognize machine learning tasks in real-world problems
  • develop the critical thinking for comparing and contrasting models for a given task
  • learn the best practices for reliably performing model selection and evaluation
  • gain experience with implementing ML solutions in Python and applying them to various real-world datasets

BULLETIN BOARD and other info

  • For course materials, assignments, announcements, and grades, please see Canvas.
  • For submitting homework electronically, you will use Gradescope.
  • For questions and discussions, please use Piazza. Here is the link to sign up.
  • Carnegie Mellon 2023-2024 official academic calendar.

TEXTBOOK:

There is no required textbook for the course. I will post course notes and slides for each lecture as well as some code examples (Jupyter notebooks) on Canvas. See Resources for a list of recommended books that could help supplement your understanding of the course material.

MISC - FUN:

Fake (ML) protest @G20 Summit (2009)    ML demonstration @PittMarathon (2019)