I am an assistant professor at Carnegie Mellon University's Heinz College of Information Systems and Public Policy, and an affiliated faculty member of the Machine Learning Department. I primarily work on machine learning for healthcare and for information systems in developing countries. A recurring theme across my work is the use of nonparametric prediction methods to solve temporal or spatial forecasting problems. Since these methods inform interventions that can be costly and affect people’s well-being, ensuring that predictions are interpretable is essential.

My book with Devavrat Shah is out: "Explaining the Success of Nearest Neighbor Methods in Prediction" (Foundations and Trends in Machine Learning, May 2018). Although nearest neighbor methods appear in text as early as the 11th century in Alhazen's "Book of Optics", arguably the most general nonasymptotic theory for nearest neighbor classification was developed only fairly recently, by Chaudhuri and Dasgupta (2014). This book goes over some of the latest nonasymptotic theoretical guarantees for nearest neighbor and related kernel regression and classification methods, both in general metric spaces and in contemporary applications where clustering structure appears (time series forecasting, recommendation systems, medical image segmentation). The book also covers some recent advances in approximate nearest neighbor search, explains why decision tree and related ensemble methods are nearest neighbor methods, and discusses the potential for far away neighbors to help in prediction. We also organized a related workshop at NIPS 2017 (slides are available for all the talks).

Current projects:

Spring 2018: I taught 95-865 "Unstructured Data Analytics" and 94-775 "Unstructured Data Analytics for Policy".

Before joining Carnegie Mellon, I finished my Ph.D. in Electrical Engineering and Computer Science at MIT, advised by Polina Golland and Devavrat Shah. My thesis developed theory for forecasting viral news, recommending products to people, and finding human organs in medical images. I also worked on satellite image analysis to help bring electricity to rural India, and modeled brain activation patterns evoked by reading sentences. Between grad school and becoming faculty, I helped develop the recommendation engine at the predictive analytics startup Celect and then was a teaching postdoc in MIT's Digital Learning Lab, where I was the primary instructor and course developer for a new edX course on computational probability and inference.

I enjoy teaching and pondering the future of education! I have previously taught at MIT, UC Berkeley, and in Jerusalem at MEET, a program that brings together Israeli and Palestinian high school students. As a grad student, I served on the Task Force on the Future of MIT Education, and my time as a teaching postdoc was all about better understanding the digital learning space.

Last updated July 17, 2018. Photo credit: Danica Chang.