Yuanzhi Li

I'm a Assistant Professor in the Machine Learning Department at CMU. Previously I was a postdoc at Stanford Computer Science Department (20180-2019). I obtained my Ph.D in computer science at Princeton University (2014-2018) and my B.S.E. in computer science and mathematics at Tsinghua University (2010-2014).

I work on Deep Learning (Theory).

Contact

$l at andrew dot cmu dot edu, replace $ with my first name.

Papers (See google scholar for the full list)

A Convergence Theory for Deep Learning via Over-Parameterization. With Zeyuan Allen-Zhu and Zhao Song. Manuscript

Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers. With Zeyuan Allen-Zhu and Yingyu Liang. Submitted

On the Convergence Rate of Training Recurrent Neural Networks. With Zeyuan Allen-Zhu and Zhao Song. Submitted

Chasing Nested Convex Bodies Nearly Optimally. With Sébastien Bubeck, Yin Tat Lee and Mark Sellke. Submitted

Competitively Chasing Convex Bodies. With Sébastien Bubeck, Yin Tat Lee and Mark Sellke. Submitted

Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees. With Yuping Luo, Huazhe Xu, Yuandong Tian, Trevor Darrell and Tengyu Ma. Submitted

Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data. With Yingyu Liang. NIPS 2018

Online Improper Learning with an Approximation Oracle. With Elad Hazan, Wei Hu and Zhiyuan Li. NIPS 2018

Neon2: Finding Local Minima via First-Order Oracles. With Zeyuan Allen-Zhu. NIPS 2018

Algorithmic Regularization in Over-parameterized Matrix Recovery. With Tengyu Ma and Hongyang Zhang. COLT 2018

Learning Mixtures of Linear Regressions with Nearly Optimal Complexity. With Yingyu Liang. COLT 2018

The Well Tempered Lasso. With Yoram Singer. ICML 2018

Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits. With Zeyuan Allen-Zhu and Sebastien Bubeck. ICML 2018

An Alternative View: When Does SGD Escape Local Minima?. With Robert Kleinberg and Yang Yuan. ICML 2018

Operator Scaling via Geodesically Convex Optimization, Invariant Theory and Polynomial Identity Testing. With Zeyuan Allen-Zhu, Ankit Garg, Rafael Oliveira and Avi Wigderson. STOC 2018

An homotopy method for Lp regression provably beyond self-concordance and in input-sparsity time. With Sebastien Bubeck, Michael B. Cohen and Yin Tat Lee. STOC 2018

Linear algebraic structure of word senses, with applications to polysemy. With Sanjeev Arora, Yingyu Liang, Tengyu Ma and Andrej Risteski. TACL 2018

Sparsity, variance and curvature in multi-armed bandits. With Sebastien Bubeck and Michael B. Cohen. ALT 2018

An Instance Optimal Algorithm for Top-k Ranking under the Multinomial Logit Model. With Xi Chen and Jieming Mao. SODA 2018

Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls. With Zeyuan Allen-Zhu, Elad Hazan and Wei Hu. NIPS 2017

Convergence Analysis of Two-layer Neural Networks with ReLU Activation . With Yang Yuan. NIPS 2017

Much Faster Algorithms for Matrix Scaling. With Zeyuan Allen-Zhu, Rafael Oliveira and Avi Wigderson. FOCS 2017

Fast Global Convergence of Online PCA. With Zeyuan Allen-Zhu. FOCS 2017

Near-Optimal Design of Experiments via Regret Minimization. With Zeyuan Allen-Zhu, Aarti Singh and Yining Wang. ICML 2017

Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations. With Yingyu Liang. ICML 2017

Follow the Compressed Leader: Faster Algorithm for Matrix Multiplicative Weight Updates. With Zeyuan Allen-Zhu. ICML 2017

Faster Principal Component Regression via Optimal Polynomial Approximation to sgn(x). With Zeyuan Allen-Zhu. ICML 2017

Doubly Accelerated Methods for Faster CCA and Generalized Eigendecomposition. With Zeyuan Allen-Zhu. ICML 2017

RAND-WALK: a latent variable model approach to word embeddings. With Sanjeev Arora, Yingyu Liang, Tengyu Ma and Andrej Risteski. TACL 2016

Even Faster SVD Decomposition Yet Without Agonizing Pain. With Zeyuan Allen-Zhu. NIPS 2016

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods. With Andrej Risteski. NIPS 2016

Tight algorithms and lower bounds for approximately convex optimization. With Andrej Risteski. NIPS 2016

Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates. With Yingyu Liang and Andrej Risteski. NIPS 2016

Recovery guarantee of weighted low-rank approximation via alternating minimization. With Yingyu Liang and Andrej Risteski. ICML 2016

A Theoretical Analysis of NDCG Ranking Measures. With Yining Wang, Liwei Wang, Di He, Wei Chen and Tie-Yan Liu. COLT 2013

...

Teaching

COS423, Princeton: Theory of Algorithms, Fall 2016. TA

COS521, Princeton: Advance algorithm design, Spring 2016. TA

COS10725, CMU: Optimization in Machine Learning (Convex Optimization), Spring and Fall, 2020-2022. Instructor