Instructor: Prof. Patrick McSharry (patrick@mcsharry.net)
Semester: Fall 2023
Lecture:
Kigali: Tue. and Thu., 3:00 pm - 4:20 pm CAT
Pittsburgh: Tue. and Thu., 8:00 am - 9:20 am ET
Recitation:
Kigali: Friday, 3:00 pm - 3:50 pm CAT
Pittsburgh: Friday, 8:00 am - 8:50 am ET
This course will provide the methods and skills required for utilizing data and quantitative models to automate predictive analytics and make improved decisions. From descriptive statistics to data analysis to machine learning, the course will demonstrate the process of collecting, cleaning, interpreting, transforming, exploring, analyzing and modeling data with the goal of extracting information, communicating insights and supporting decision-making. The advantages and disadvantages of linear, nonlinear, parametric, nonparametric and ensemble methods will be discussed while exploring the challenges of both supervised and unsupervised learning. The importance of quantifying uncertainty, statistical hypothesis testing and communicating confidence in model results will be emphasized. The advantages of using visualization techniques to explore the data and communicate the outcomes will be highlighted throughout. Applications will include visualization, clustering, ranking, pattern recognition, anomaly detection, data mining, classification, regression, forecasting and risk analysis. Participants will obtain hands-on experience during project assignments that utilize publicly available datasets and address practical challenges.
The full course syllabus contains additional information and can be found in PDF format HERE.
After completing this course, students should be able to:
There will be no required textbooks, though we suggest the following to help you to study (all available online):
We will use Piazza for class discussions. Please go to the course Piazza site to join the course forum (note: you must use a cmu.edu email account to join the forum). We strongly encourage students to post on this forum rather than emailing the course staff directly (this will be more efficient for both students and staff). Students should use Piazza to:
The course Academic Integrity Policy must be followed when doing assignments and on the message boards at all times. Details on ECE's Academic Integrity Policy can be found in the course syllabus and HERE.
The grades for this course will be based on students’ performance on seven homework assignments, a mid-semester exam, a final exam and class participation. Homework assignments will be done individually and turned in via Canvas by the designated due date. Late work will be accepted until 24 hours past the deadline, but it will lose 10%. The assignments will be graded based on both a written report and the code used to achieve the results presented in the report. Class participation will be evaluated based on students’ contributions to discussions both in class and on the Piazza Discussion Board. When posting or reacting to online discussion threads, students are expected to use their own words, and each post should be relevant to the topic under discussion. When sharing an article, make sure to introduce, summarize and explain it in your own words to enlighten the audience on the point the article is making.
The following is the weight distribution of the grades:
Class participation 5%
Kahoot Quiz 2.5%
Piazza Participation 2.5%
Homework Assignment 1 5%
Homework Assignment 2 10%
Homework Assignment 3 10%
Homework Assignment 4 10%
Mid-Semester Exam (Multiple Choice) 10%
Homework Assignment 5 10%
Homework Assignment 6 12.5%
Homework Assignment 7 12.5%
Kaggle Project Bonus 2.5%
Final Exam (Multiple Choice) 10%
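As a rough illustration of how the weights above combine, the following Python sketch computes a weighted course total. It is not an official grading tool and rests on several assumptions that the syllabus does not state: that the Kahoot and Piazza rows are the two halves of the 5% class participation row, that the 10% late penalty is deducted from the earned score, and that work submitted more than 24 hours late earns no credit. The function names and the example scores are hypothetical. Under these assumptions the listed weights sum to 97.5 rather than 100, so please confirm the exact scheme with the course staff.

```python
# Weights in percent, copied from the table above.
# Assumption: "Kahoot Quiz" and "Piazza Participation" are the two halves of
# the 5% "Class participation" row, so that parent row is not listed again.
WEIGHTS = {
    "Kahoot Quiz": 2.5,
    "Piazza Participation": 2.5,
    "Homework Assignment 1": 5.0,
    "Homework Assignment 2": 10.0,
    "Homework Assignment 3": 10.0,
    "Homework Assignment 4": 10.0,
    "Mid-Semester Exam": 10.0,
    "Homework Assignment 5": 10.0,
    "Homework Assignment 6": 12.5,
    "Homework Assignment 7": 12.5,
    "Kaggle Project Bonus": 2.5,
    "Final Exam": 10.0,
}

LATE_WINDOW_HOURS = 24  # from the late-work policy
LATE_PENALTY = 0.10     # assumption: 10% of the earned score is deducted


def late_adjusted(score, hours_late):
    """Return a homework score (0-100) after the assumed late-work rule."""
    if hours_late <= 0:
        return float(score)
    if hours_late <= LATE_WINDOW_HOURS:
        return score * (1.0 - LATE_PENALTY)
    return 0.0  # assumption: work more than 24 hours late earns no credit


def course_total(scores):
    """Weighted total in percentage points; missing components count as 0."""
    return sum(weight * scores.get(name, 0.0) / 100.0
               for name, weight in WEIGHTS.items())


# Hypothetical student: 90 on every component, Assignment 3 handed in 5 hours late.
example_scores = {name: 90.0 for name in WEIGHTS}
example_scores["Homework Assignment 3"] = late_adjusted(90.0, hours_late=5)
print(f"Weighted total: {course_total(example_scores):.2f}")
```

Keeping the weights in a single dictionary makes it easy to adjust the sketch if the actual scheme differs from the assumptions above.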
| Name | Office Hours | Zoom Meeting ID |
| --- | --- | --- |
| Christian Iradukunda | Tuesday 6:00PM - 8:00PM CAT | |
| Isaac Coffie | Wednesday 6:00PM - 8:00PM CAT | |
| Tanya Akumu | Thursday 6:00PM - 8:00PM CAT | |
| Eliphaz Niyodusenga | Friday 5:00PM - 7:00PM CAT | |
| Akem Aristide Tanyi-Jong | Saturday 7:00PM - 9:00PM CAT | |
| Innocent Mukoki | Sunday 8:00PM - 10:00PM CAT | |
| Date | Topics | Slides | Assignment |
| --- | --- | --- | --- |
| Tues, Sep 01 | Measurement | [Week 1A] | Assignment 1 released |
| Thurs, Sep 03 | Data Collection | [Week 1] | |
| Fri, Sep 04 | Recitation | | |
| Mon, Sep 07 | | | Assignment 1 due |
| Tues, Sep 08 | Data Manipulation | [Week 2A] | Assignment 2 released |
| Thurs, Sep 10 | Data Exploration | [Week 2] | |
| Fri, Sep 11 | Recitation | | |
| Tues, Sep 15 | Descriptive Statistics | [Week 3A] | |
| Thurs, Sep 17 | Distributions | [Week 3] | |
| Fri, Sep 18 | Recitation | | |
| Mon, Sep 21 | | | Assignment 2 due |
| Tues, Sep 22 | Statistical Hypotheses | [Week 4A] | Assignment 3 released |
| Thurs, Sep 24 | Quantifying Confidence | [Week 4] | |
| Fri, Sep 25 | Recitation | | |
| Tues, Sep 29 | Trends | [Week 5A] | |
| Thurs, Oct 01 | Decision Making | [Week 5] | |
| Fri, Oct 02 | Recitation | | |
| Mon, Oct 05 | | | Assignment 3 due |
| Tues, Oct 06 | Forecasting | [Week 6A] | Assignment 4 released |
| Thurs, Oct 08 | Model evaluation | [Week 6] | |
| Fri, Oct 09 | Recitation | | |
| Mon, Oct 19 | | | Assignment 4 due |
| Thurs, Oct 22 | Mid-Semester Exam | | |
| Tues, Oct 27 | Statistical Learning A | [Week 7A] | Assignment 5 released |
| Thurs, Oct 29 | Statistical Learning B | [Week 7] | |
| Fri, Oct 30 | Recitation | | |
| Tues, Nov 03 | Linear Models A | [Week 8A] | |
| Thurs, Nov 05 | Linear Models B | [Week 8] | |
| Fri, Nov 06 | Recitation | | |
| Mon, Nov 09 | | | Assignment 5 due |
| Tues, Nov 10 | Nonlinear Models A | [Week 9A] | Assignment 6 released |
| Thurs, Nov 12 | Nonlinear Models B | [Week 9] | |
| Fri, Nov 13 | Recitation | | |
| Tues, Nov 17 | Supervised Learning A | [Week 10A] | |
| Thurs, Nov 19 | Supervised Learning B | [Week 10] | |
| Fri, Nov 20 | Recitation | | |
| Mon, Nov 23 | | | Assignment 6 due |
| Tues, Nov 24 | Unsupervised Learning A | [Week 11A] | Assignment 7 released |
| Thurs, Nov 26 | Unsupervised Learning B | [Week 11] | |
| Fri, Nov 27 | Recitation | | |
| Tues, Dec 01 | Ensemble Approaches A | [Week 12A] | |
| Thurs, Dec 03 | Ensemble Approaches B | [Week 12] | |
| Fri, Dec 04 | Recitation | | |
| Mon, Dec 07 | | | Assignment 7 due; Kaggle competition due |
| Thurs, Dec 17 | Final Exam | | |