Assistant Professor of Statistics
- Department of Statistics, Carnegie Mellon University
- Office: Baker Hall 132-B
- Email: mgsell(at)cmu.edu
Brief Academic Bio
Dr. G'Sell is an Assistant Professor of Statistics in the Department of Statistics at Carnegie Mellon University, as well as affiliated faculty in the Machine Learning Department. His research broadly focuses on questions at the interface of statistical inference and machine learning, with a particular interest in applications to complex problems in the sciences and beyond. His work is supported through funding from organizations including the National Science Foundation (NSF), National Institutes of Health (NIH), and the Simons Foundation, as well as support for computing from the Pittsburgh Supercomputing Center and the NSF XSEDE program.
Dr. G'Sell received a PhD in Statistics from Stanford University, where he was advised by Robert Tibshirani, and a B.S. in Physics and Applied and Computational Mathematics from the California Institute of Technology.
Research
I am broadly interested in questions of inference and interpretability at the interface of statistics and machine learning. These interests have included questions of data-guided hypothesis testing that arise in selective inference, as well as applications in neuroscience, cellular biology, genetics, and, in an interesting twist, early modern English literature. I am interested in increasing the accessibility of these statistical tools to other fields and in developing in-depth collaborations with experts from other fields who are trying to carefully leverage predictive modeling and creative statistical testing.
I have ongoing collaborations in neuroscience with Avniel Ghuman on interpretable models for intracranial EEG data, as well as on a novel experimental design for natural vision data with Avniel and L.P. Morency; in cellular biology with Manoj Puthenveedu and Zara Weinberg on understanding the impact of human factors on predictive models of cellular events in microscopy; with Kathryn Roeder and Bernie Devlin on the application of tools from selective inference to better guide large-scale -omics testing; and with Christopher Warren and Taylor Berg-Kirkpatrick on applying statistical analysis, computer vision, and NLP to questions of analytical bibliography and quantitative book history in early modern English literature.
Publications and Preprints:
H-MAGMA, inheriting a shaky statistical foundation, yields excess false positives
with Ronald Yurko, Kathryn Roeder and Bernie Devlin.
A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing
with Kartik Goyal, Chris Dyer, Christopher Warren and Taylor Berg-Kirkpatrick.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020)
Fairness evaluation in the presence of biased noisy labels
with Riccardo Fogliato and Alexandra Chouldechova
Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS 2020)
Damaged Type and Areopagitica’s Clandestine Printers.
with Christopher Warren, Pierce Williams, Shruti Rijhwani.
Milton Studies 62, no. 1 (2020) (press)
A selective inference approach for false discovery rate control using multiomics covariates yields insights into disease risk
with Ronald Yurko, Kathryn Roeder, and Bernie Devlin.
Proceedings of the National Academy of Sciences (PNAS 2020)
Endogenous activity modulates stimulus and circuit-specific neural tuning and predicts perceptual behavior
with Yuanning Li, Michael J. Ward, R. Mark Richardson and Avniel Singh Ghuman.
Nature Communications (2020)
Post-selection inference for changepoint detection algorithms with application to copy number variation
with Sangwon Hyun, Kevin Lin and Ryan J. Tibshirani
Sharp Instruments for Classifying Compliers and Generalizing Causal Effects.
with Edward H Kennedy, Sivaraman Balakrishnan.
The Annals of Statistics (2020)
Bootstrapping and Sample Splitting for High-Dimensional, Assumption-Free Inference.
with Alessandro Rinaldo, Larry Wasserman.
The Annals of Statistics (2019)
Shrinkage Classification for Overlapping Time Series: An interpretable method for mapping stimulus-differentiated evoked response.
with Peter W. Elliott, Matthew J. Boring, Yuanning Li, R. Mark Richardson, Avniel Singh Ghuman
Distribution-free predictive inference for regression.
with Jing Lei, Alessandro Rinaldo, Ryan Tibshirani, and Larry Wasserman.
Journal of the American Statistical Association (2018)
Exact post-selection inference for the generalized lasso
with Sangwon Hyun and Ryan J. Tibshirani.
Electronic Journal of Statistics (2018)
Architecture of a multi-cellular polygenic network governing immune homeostasis
with Tania Dubovik, Elina Starosvetsky, Benjamin LeRoy, Rachelly Normand, Yasmin Admon, Ayelet Alpert, Yishai Ofran, and Shai S. Shen-Orr
Fairer and more accurate, but for whom?
with Alexandra Chouldechova
Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2017)
Automated acoustic detection of mouse scratching.
with Peter Elliott, Lindsey M. Snyder, Sarah E. Ross, and Valerie Ventura.
PLoS ONE (2017)
Sequential selection procedures and false discovery rate control
with Stefan Wager, Alexandra Chouldechova and Robert Tibshirani
Journal of the Royal Statistical Society: Series B (2016)
A rational approach to legacy data validation when transitioning between electronic health record systems.
with N. M. Pageler, W. Chandler, E. Mailes, C. Yang, and C. Longhurst.
Journal of the American Medical Informatics Association (JAMIA) (2016)
Sensitivity Analysis for Inference with Partially Identifiable Covariance Matrices.
with Shai S. Shen-Orr, Robert Tibshirani.
Computational Statistics (2013)
Adaptive Testing for the Graphical Lasso
with Jonathan Taylor, Robert Tibshirani.
False Variable Selection Rates in Regression.
with Trevor Hastie, Robert Tibshirani.