95-801 Data Mining Techniques
Fall 2017



  • 11/25: FINAL EXAM: on December 11 -- between 8:30-10:30 AM -- in HBH 1006 & 1007
  • 11/9: TA Day & Room change: Urvashi will hold office hours on Fridays in HBH 1004.
  • 10/26: Room change: Starting today, we will meet in HBH 2008.
  • 10/24: First day of class, welcome!
  • 8/5: Students are encouraged to study the Syllabus to have a general understanding of the
    course organization, as well as the Assignments to have an idea about the workload.


Instructor: Leman Akoglu, ( lakoglu @ andrew )
  • Office: HBH 2118C, office ph 412-268-30 four three
  • Office hours: Tue 2-3 PM; also, by appointment

Teaching Assistant: Urvashi Kohli, ( ukk @ andrew )
  • Office: HBH 1004
  • Office hours: Fri 12-1 PM


When: Tue & Thu 4:30-5:50 PM
Where: HBH 2008


Knowledge discovery from data is "the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data" --- Fayyad et al. (1996)

Motivation: Data generated by humans and machines is available everywhere and growing steadily. In today's data-driven world it is crucial for students to acquire the fundamental skills needed to analyze massive datasets and to develop data-driven techniques for solving real-world problems.

This course will cover the fundamental concepts and techniques in data mining, and equip students with the basic skillset toward becoming good data scientists. Major topics include algorithms and tools for data exploration, fast similarity search, pattern mining, outlier detection, dimensionality reduction, ranking, data streams, text mining, etc. See the syllabus for details. The coursework involves mini-projects on various datasets to enable students to gain hands-on experience with data analytics.

Learning Objectives

By the end of this class, students will
  • develop basic understanding of core data mining concepts
  • learn algorithmic approaches to various data mining problems
  • be able to analyze and assess data mining algorithms based on their accuracy, computational/storage complexity, and the tradeoffs thereof
  • gain hands-on experience using data analytics techniques on real-world datasets


We expect students to have a copy of the following book, from which most readings will be assigned. Below you can find a list of other recommended books to learn certain subjects in more depth.
I will also post the lecture notes on Canvas. See resources for other pointers.

BULLETIN BOARD and other info

  • We will use Canvas for course materials, homework submissions, announcements, and grades.
  • We will use Piazza for questions and discussions.
  • Carnegie Mellon 2017-2018 Official academic calendar

