Yuanzhi Li

I'm an Assistant Professor in the Machine Learning Department at CMU (2019.9-present). Previously, I was a Postdoc in the Computer Science Department at Stanford University (2018.9-2019.8). I obtained my Ph.D. in Computer Science from Princeton University (2014.9-2018.8), advised by Sanjeev Arora, and my B.S.E. in Computer Science and Mathematics from Tsinghua University (2010.9-2014.8).

I work on Machine Learning. My goal is to design efficient and provable algorithms for practical machine learning problems. I am also very interested in convex and non-convex optimization.


Email: $l at andrew.cmu.edu (replace $ with my first name).


Publications

  • Chasing Nested Convex Bodies Nearly Optimally. With Sébastien Bubeck, Yin Tat Lee and Mark Sellke. SODA 2020
  • Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks. With Colin Wei and Tengyu Ma. NeurIPS 2019
  • What Can ResNet Learn Efficiently, Going Beyond Kernels? With Zeyuan Allen-Zhu. NeurIPS 2019
  • Can SGD Learn Recurrent Neural Networks with Provable Generalization? With Zeyuan Allen-Zhu. NeurIPS 2019
  • Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers. With Zeyuan Allen-Zhu and Yingyu Liang. NeurIPS 2019
  • On the Convergence Rate of Training Recurrent Neural Networks. With Zeyuan Allen-Zhu and Zhao Song. NeurIPS 2019
  • Improved Path-length Regret Bounds for Bandits. With Sébastien Bubeck, Haipeng Luo and Chen-Yu Wei. COLT 2019
  • Near-optimal method for highly smooth convex optimization. With Sébastien Bubeck, Qijia Jiang, Yin Tat Lee and Aaron Sidford. COLT 2019
  • A Convergence Theory for Deep Learning via Over-Parameterization. With Zeyuan Allen-Zhu and Zhao Song. ICML 2019
  • Competitively Chasing Convex Bodies. With Sébastien Bubeck, Yin Tat Lee and Mark Sellke. STOC 2019
  • Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees. With Yuping Luo, Huazhe Xu, Yuandong Tian, Trevor Darrell and Tengyu Ma. ICLR 2019
  • Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data. With Yingyu Liang. NIPS 2018
  • Online Improper Learning with an Approximation Oracle. With Elad Hazan, Wei Hu and Zhiyuan Li. NIPS 2018
  • Neon2: Finding Local Minima via First-Order Oracles. With Zeyuan Allen-Zhu. NIPS 2018
  • Algorithmic Regularization in Over-parameterized Matrix Recovery. With Tengyu Ma and Hongyang Zhang. COLT 2018
  • Learning Mixtures of Linear Regressions with Nearly Optimal Complexity. With Yingyu Liang. COLT 2018
  • The Well Tempered Lasso. With Yoram Singer. ICML 2018
  • Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits. With Zeyuan Allen-Zhu and Sébastien Bubeck. ICML 2018
  • An Alternative View: When Does SGD Escape Local Minima?. With Robert Kleinberg and Yang Yuan. ICML 2018
  • Operator Scaling via Geodesically Convex Optimization, Invariant Theory and Polynomial Identity Testing. With Zeyuan Allen-Zhu, Ankit Garg, Rafael Oliveira and Avi Wigderson. STOC 2018
  • An homotopy method for Lp regression provably beyond self-concordance and in input-sparsity time. With Sébastien Bubeck, Michael B. Cohen and Yin Tat Lee. STOC 2018
  • Linear algebraic structure of word senses, with applications to polysemy. With Sanjeev Arora, Yingyu Liang, Tengyu Ma and Andrej Risteski. TACL 2018
  • Sparsity, variance and curvature in multi-armed bandits. With Sébastien Bubeck and Michael B. Cohen. ALT 2018
  • An Instance Optimal Algorithm for Top-k Ranking under the Multinomial Logit Model. With Xi Chen and Jieming Mao. SODA 2018
  • Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls. With Zeyuan Allen-Zhu, Elad Hazan and Wei Hu. NIPS 2017
  • Convergence Analysis of Two-layer Neural Networks with ReLU Activation. With Yang Yuan. NIPS 2017
  • Much Faster Algorithms for Matrix Scaling. With Zeyuan Allen-Zhu, Rafael Oliveira and Avi Wigderson. FOCS 2017
  • Fast Global Convergence of Online PCA. With Zeyuan Allen-Zhu. FOCS 2017
  • Near-Optimal Design of Experiments via Regret Minimization. With Zeyuan Allen-Zhu, Aarti Singh and Yining Wang. ICML 2017
  • Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations. With Yingyu Liang. ICML 2017
  • Follow the Compressed Leader: Faster Algorithm for Matrix Multiplicative Weight Updates. With Zeyuan Allen-Zhu. ICML 2017
  • Faster Principal Component Regression via Optimal Polynomial Approximation to sgn(x). With Zeyuan Allen-Zhu. ICML 2017
  • Doubly Accelerated Methods for Faster CCA and Generalized Eigendecomposition. With Zeyuan Allen-Zhu. ICML 2017
  • RAND-WALK: a latent variable model approach to word embeddings. With Sanjeev Arora, Yingyu Liang, Tengyu Ma and Andrej Risteski. TACL 2016
  • Even Faster SVD Decomposition Yet Without Agonizing Pain. With Zeyuan Allen-Zhu. NIPS 2016
  • Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods. With Andrej Risteski. NIPS 2016
  • Tight algorithms and lower bounds for approximately convex optimization. With Andrej Risteski. NIPS 2016
  • Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates. With Yingyu Liang and Andrej Risteski. NIPS 2016
  • Recovery guarantee of weighted low-rank approximation via alternating minimization. With Yingyu Liang and Andrej Risteski. ICML 2016
  • Geodesic cycles in random graphs. (Undergrad thesis in Mathematics) With Lingsheng Shi. Journal of Discrete Mathematics 2018
  • A Theoretical Analysis of NDCG Ranking Measures. With Yining Wang, Liwei Wang, Di He, Wei Chen and Tie-Yan Liu. COLT 2013
  • ...
Teaching

  • ML10-725: Convex Optimization, Spring 2020. Instructor
  • ML10-725: Convex Optimization, Fall 2020. Instructor
  • COS423: Theory of Algorithms, Fall 2016. TA
  • COS521: Advanced Algorithm Design, Spring 2016. TA