George H. Chen
Assistant Professor of Information Systems,
Heinz College
Affiliated Faculty,
Machine Learning Department
Carnegie Mellon University
Email: georgechen [at symbol] cmu.edu
Office: HBH 2216 (the west wing of Hamburg Hall, second floor)
About
I primarily work on building trustworthy machine learning models for
time-to-event prediction (survival analysis) and for time series
analysis. I often use nonparametric prediction models that work
well under very few assumptions on the data. My main application area
is in healthcare.
Survival analysis tutorial:
Much of what I work on is survival analysis. I developed and taught a
survival analysis tutorial at CHIL 2020 (co-taught with Jeremy Weiss)
and at SIGMETRICS 2021.
[tutorial webpage]
CoolCrop:
I occasionally also work on machine learning for the developing world.
I co-founded and now serve as an advisor for
CoolCrop, an AgriTech startup based
in India that works on providing farmers with cold storage units
(such as a refrigerator shared by a village) and market forecasts.
We're currently serving over 9000 farmers across 7 states in India
at over 40 sites.
Pre-historic:
I obtained my Ph.D. in Electrical
Engineering and Computer Science at MIT.
My thesis was on
nonparametric machine learning methods. At MIT, I also worked on
satellite
image analysis to help bring electricity to rural India, and
taught twice in Jerusalem for MEET,
a program that brings together Israeli and Palestinian high school
students to learn computer science and entrepreneurship. I completed my B.S.
at UC Berkeley, majoring in
Electrical Engineering and Computer Sciences, and
Engineering Mathematics and Statistics.
My CV can be found here.
Some News
Survival analysis symposium (Oct 2023):
Russ Greiner, Chirag Nagpal, Weijing Tang, Kevin Xu, and I are co-organizing
an AAAI 2023 fall symposium on survival analysis/time-to-event prediction.
[webpage]
Postdoc Shu Hu will start in Fall 2023 as an assistant professor at IUPUI within the Purdue School of Engineering and Technology: many congrats to Shu!
Graduating PhD student Yue Zhao defended his PhD thesis and will start in Fall 2023 as an assistant professor at USC in Computer Science: many congrats to Yue!
PhD student Shahriar Noroozizadeh wins the Suresh Konda best first research paper award at CMU's Heinz College: many congrats to Shahriar!
NeurIPS 2023: I will be serving as an area chair.
MEET Summer 2023:
I'm slated to return to Jerusalem this summer to teach computer science to Israeli and Palestinian high school students as part of MIT's Middle East Entrepreneurs of Tomorrow (MEET) program. I previously taught for this program in the summers of 2015 and 2016.
Regeneron International Science and Engineering Fair (ISEF) (May 7-13, 2022):
This is a high school science and engineering fair that I was a finalist in back in 2006 (it was called Intel ISEF then). On May 11, 2022, I was on the Innovation, Entrepreneurship, and Impact panel (along with other ISEF alumni Ulyana Pena, Ashley Sarracino, Matthew Tamayo, and Mo Zerban), answering a variety of questions.
Conference on Health, Inference, and Learning (CHIL) (April 7-8, 2022):
I was proceedings chair (with Gerardo Flores and Tom Pollard).
[conference website]
NSF CAREER award (June 24, 2021):
I received an NSF CAREER award for my proposed project on developing real-time nonparametric machine learning models for healthcare with guarantees.
Teaching (Spring 2023, mini 4)
95-865 "Unstructured Data Analytics" (Sections A4/B4/Z4)
Research Supervision
I've had the fortune of working with many wonderful students over the years (listed below). If you're interested in working with me and you are already a CMU student, then feel free to shoot me an email telling me what you're particularly excited about working on, why it overlaps with my research interests, and what skills you've already cultivated. I do not take on students who are not already admitted to CMU.
Current PhD student collaborators:
Past students and where they went after graduating:
- Yue Zhao (PhD 2023), Assistant Professor at USC Department of Computer Science
- Emaad Manzoor (PhD 2021), Assistant Professor at Cornell University SC Johnson Graduate School of Management
- Mi Zhou (PhD 2020), Assistant Professor at UBC Sauder School of Business
- ♣ Wei Ma (master's in ML 2018/PhD 2019), Assistant Professor at Hong Kong Polytechnic University in the Department of Civil and Environmental Engineering
- ♣ Lynn H. Kaack (master's in ML 2018/PhD 2019), Assistant Professor at the Hertie School
- Thomas Tam (MSPPM 2023), Sunstella Foundation/Jewish Healthcare Foundation
- Brenda Palma (MISM 2022), Markaaz
- Xiaotong (Maggie) Lu (MISM 2020), Uber
- Runtong (Fred) Yang (MISM 2019), Indeed
- Ren Zuo (MISM 2018), Cornerstone Research
- Linhong (Lexie) Li (B.S. 2020), McKinsey
- Junyan Pu (B.S. 2020), CMU master's degree program in CS
♣ indicates a PhD student who worked with me on a secondary master's in ML (I was their master's research advisor but not their PhD research advisor)
Past postdoc:
- Shu Hu (postdoc from Fall 2022 to Summer 2023), Assistant Professor at IUPUI Purdue School of Engineering and Technology
Papers
You can also find my papers listed on
Google Scholar.
Some Working Papers
-
"Fairness in Survival Analysis with Distributionally Robust Optimization"
Shu Hu*, George H. Chen*
(* = equal contribution)
Journal paper version of our ML4H 2022 paper
(Under review)
[working draft]
-
"Survival Kernets: Scalable and Interpretable Deep Kernel Survival Analysis with an Accuracy Guarantee"
George H. Chen
[arXiv (working draft)] [code]
(Under review)
Best paper finalist (applied track) at the INFORMS Data Mining and Decision Analytics Workshop 2022
-
"Neural Topic Models with Survival Supervision: Jointly Predicting Time-to-Event Outcomes and Learning How Clinical Features Relate"
George H. Chen*, Linhong Li*, Ren Zuo, Amanda Coston, Jeremy C. Weiss
(* = equal contribution)
Journal paper version of our AIME 2020 paper
(Under review)
[working draft]
-
"Temporal Supervised Contrastive Learning for Modeling Patient Risk Progression"
Shahriar Noroozizadeh, Jeremy C. Weiss, George H. Chen
[working draft]
Preliminary version presented at the AAAI 2023 Workshop on Representation Learning for Responsible Human-Centric AI (R2HCAI)
2023
-
"Neurological Prognostication of Post-Cardiac-Arrest Coma Patients Using EEG Data: A Dynamic Survival Analysis Framework with Competing Risks"
Xiaobin Shen, Jonathan Elmer, George H. Chen
[arXiv]
Machine Learning for Healthcare (MLHC), August 2023
-
"A General Framework for Visualizing Embedding Spaces of Neural Survival Analysis Models Based on Angular Information"
George H. Chen
[arXiv] [code]
Conference on Health, Inference, and Learning (CHIL), June 2023
-
"Influence via Ethos: On the Persuasive Power of Reputation in Deliberation Online"
Emaad Manzoor, George H. Chen, Dokyun Lee, Michael D. Smith
[arXiv] [DOI] [Cornell news]
Management Science, May 2023
Best paper at the AAAI Workshop on AI for Behavior Change 2021
2022
-
"BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs"
Kay Liu*, Yingtong Dou*, Yue Zhao*, Xueying Ding, Xiyang Hu, Ruitong Zhang, Kaize Ding, Canyu Chen, Hao Peng, Kai Shu, Lichao Sun, Jundong Li, George H. Chen, Zhihao Jia, Philip S. Yu
(* = equal contribution)
Neural Information Processing Systems (NeurIPS) (Datasets and Benchmarks track), November-December 2022
[arXiv] [code]
-
"Distributionally Robust Survival Analysis: A Novel Fairness Loss Without Demographics"
Shu Hu*, George H. Chen*
(* = equal contribution)
Machine Learning for Health (ML4H), November 2022
[arXiv] [code]
-
"TOD: Tensor-Based Outlier Detection, a General GPU-Accelerated Framework"
Yue Zhao, George H. Chen, Zhihao Jia
Proceedings of the VLDB Endowment, Vol 16, No. 3, November 2022
[arXiv] [code]
(To be presented at the International Conference on Very Large Data Bases, August-September 2023)
-
"ECOD: Unsupervised Outlier Detection Using Empirical Cumulative Distribution Functions"
Zheng Li*, Yue Zhao*, Xiyang Hu, Nicola Botta, Cezar Ionescu, George H. Chen
(* = equal contribution)
IEEE Transactions on Knowledge and Data Engineering, March 2022
[arXiv] [code]
2021
-
"Consumer Behavior in the Online Classroom: Using Video Analytics and Machine Learning to Understand the Consumption of Video Courseware"
Mi Zhou, George H. Chen, Pedro Ferreira, Michael D. Smith
Journal of Marketing Research, December 2021
[SSRN]
2020
-
"Neural Topic Models with Survival Supervision: Jointly Predicting Time-to-Event Outcomes and Learning How Clinical Features Relate"
Linhong Li, Ren Zuo, Amanda Coston, Jeremy C. Weiss, George H. Chen
International Conference on Artificial Intelligence in Medicine (AIME), August 2020
[working draft of journal paper version (fixes various bugs in the conference paper version)]
-
"Predicting Mortality Risk in Viral and Unspecified Pneumonia to Assist Clinicians with COVID-19 ECMO Planning"
Helen Zhou*, Cheng Cheng*, Zachary C. Lipton, George H. Chen, Jeremy C. Weiss
(* = equal contribution)
International Conference on Artificial Intelligence in Medicine (AIME), August 2020
[arXiv] [code]
(Also presented at the International Conference on Machine Learning (ICML) Workshop on Machine Learning for Global Health, July 2020)
-
"Deep Kernel Survival Analysis and Subject-Specific Survival Time Prediction Intervals"
George H. Chen
Machine Learning for Healthcare (MLHC), August 2020
[arXiv] [code] [poster]
Note:
This paper is essentially a sequel to my theory paper on nearest neighbor and kernel survival analysis (ICML 2019), which left open the problem of how to automatically learn kernel functions for survival analysis aside from using random survival forests.
2019
-
"Missing Not at Random in Matrix Completion: The Effectiveness of Estimating Missingness Probabilities Under a Low Nuclear Norm Assumption"
Wei Ma*, George H. Chen*
(* = equal contribution)
Neural Information Processing Systems (NeurIPS), December 2019
[arXiv] [code] [poster] [talk slides]
Best paper (theoretical track) at the INFORMS Data Mining and Decision Analytics Workshop 2019
-
"Truck Traffic Monitoring with Satellite Images"
Lynn H. Kaack, George H. Chen, M. Granger Morgan
ACM Conference on Computing and Sustainable Societies (COMPASS), July 2019
[arXiv]
(Also presented at the International Conference on Machine Learning (ICML) Workshop on Climate Change, June 2019)
-
"Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates"
George H. Chen
International Conference on Machine Learning (ICML), June 2019
[arXiv (includes minor corrections)] [code] [talk slides] [poster]
Note:
In my follow-up work at MLHC 2020, I show how to automatically learn kernel functions for survival analysis in a neural net framework, and how to use these kernel functions to help construct survival time prediction intervals.
-
"An Interpretable Produce Price Forecasting System for Small Farmers in India using Collaborative Filtering and Adaptive Nearest Neighbors"
Wei Ma, Kendall Nowocin, Niraj Marathe, George H. Chen
Information and Communication Technologies and Development (ICTD), January 2019
[arXiv]
2018
-
"Explaining the Success of Nearest Neighbor Methods in Prediction"
George H. Chen, Devavrat Shah
Foundations and Trends in Machine Learning, May 2018
[DOI]
2017
-
"Survival-Supervised Topic Modeling with Anchor Words: Characterizing Pancreatitis Outcomes"
George H. Chen, Jeremy C. Weiss
Neural Information Processing Systems (NeurIPS) Workshop on Machine Learning for Health (ML4H), December 2017
[arXiv (short workshop version)]
(Also presented at the Society for Medical Decision Making North American Meeting, October 2017)
-
"Toward Reducing Crop Spoilage and Increasing Small Farmer Profits in India: a Simultaneous Hardware and Software Solution"
George H. Chen, Kendall Nowocin, Niraj Marathe
Information and Communication Technologies and Development (ICTD), November 2017
[arXiv]
2015
-
"A Latent Source Model for Patch-Based Image Segmentation"
George H. Chen, Devavrat Shah, Polina Golland
Medical Image Computing and Computer-Assisted Intervention (MICCAI), October 2015
[arXiv]
[paper]
[poster]
Note:
For a more comprehensive exposition of this paper, consider
reading Chapter 5 of my
Ph.D. thesis.
-
"Latent Source Models for Nonparametric Inference"
George H. Chen
Ph.D. thesis, MIT, May 2015
[paper]
Received the George M. Sprowls award for best Ph.D. thesis in Computer Science at MIT
-
"Targeting Villages for Rural Development Using Satellite Image
Analysis"
Kush R. Varshney, George H. Chen, Brian Abelson, Kendall Nowocin, Vivek Sakhrani, Ling Xu, Brian L. Spatocco
Big Data, March 2015
[paper]
2014
-
"A Latent Source Model for Online Collaborative Filtering"
(alphabetical author ordering)
Guy Bresler, George H. Chen, Devavrat Shah
Neural Information Processing Systems (NeurIPS), December 2014
[arXiv - longer version]
[paper - short conference version]
[poster]
Selected as a spotlight (one of 62/1678 submissions)
Note:
An expanded version including intuition for how collaborative
filtering relates to an MAP item recommender and derivations for
the examples is in Chapter 4 of my
Ph.D. thesis;
the notation has also been changed to be more similar to the
other two papers that went toward my thesis.
2013
-
"A Latent Source Model for Nonparametric Time Series Classification"
(alphabetical author ordering)
George H. Chen, Stanislav Nikolov, Devavrat Shah
Neural Information Processing Systems (NeurIPS), December 2013
[arXiv - longer version]
[paper - short conference version]
[poster]
Note:
An expanded version with a lower bound on the misclassification
rate and further discussion is in Chapter 3 of my
Ph.D. thesis.
-
"Sparse Projections of Medical Images onto Manifolds"
George H. Chen, Christian Wachinger, Polina Golland
Information Processing in Medical Imaging (IPMI), June-July 2013
[arXiv]
[paper]
[poster]
2012
-
"Deformation-Invariant Sparse Coding"
George H. Chen
Master's thesis, MIT, May 2012
[paper]
[poster]
2011
-
"Deformation-Invariant Sparse Coding for Modeling Spatial Variability of Functional Patterns in the Brain"
George H. Chen, Evelina G. Fedorenko, Nancy G. Kanwisher, Polina Golland
Neural Information Processing Systems (NeurIPS) Workshop on Machine Learning and Interpretation in Neuroimaging, December 2011
[paper]
[talk slides]
2010
-
"Indoor Localization and Visualization Using a Human-Operated Backpack System"
Timothy Liu, Matthew Carlberg, George Chen, Jacky Chen, John Kua, Avideh Zakhor
International Conference on Indoor Positioning and Indoor Navigation (IPIN), September 2010
[paper]
-
"Indoor Localization Algorithms for a Human-Operated Backpack System"
George Chen, John Kua, Stephen Shum, Nikhil Naikal, Matthew Carlberg, Avideh Zakhor
International Symposium on 3D Data Processing, Visualization and Transmission (3DPVT), May 2010
[paper]
2009
-
"Classifying Urban Landscape in Aerial LIDAR Using 3D Shape Analysis"
Matthew Carlberg, Peiran Gao, George Chen, Avideh Zakhor
International Conference on Image Processing (ICIP), November 2009
[paper]
-
"2D Tree Detection in Large Urban Landscapes Using Aerial LIDAR Data"
George Chen, Avideh Zakhor
International Conference on Image Processing (ICIP), November 2009
[paper]
-
"Image Augmented Laser Scan Matching for Indoor Dead Reckoning"
Nikhil Naikal, John Kua, George Chen, Avideh Zakhor
International Conference on Intelligent Robots and Systems (IROS), October 2009
[paper]
Last updated 7/4/2023.