Research overview and selected papers
My research develops theoretically principled solutions to algorithmic problems spanning economics and machine learning, drawing on techniques from computer science. I'm driven by questions that advance fundamental understanding and have clear pathways to real-world applications.
Specifically, my research explores how algorithms can make better decisions in dynamic, online environments. The key challenge stems from not knowing what the future will look like. This is especially so in the presence of multiple agents, each with their own goals that may align or conflict with our own. Below, I outline the central themes of my work.
Algorithmic aspects of no-regret learning in games
Self-play—whereby an agent learns by playing against itself over a sequence of rounds—has been instrumental in landmark AI breakthroughs, from mastering poker to defeating world champions at chess and Go. Learning in complex multiagent environments also features prominently in autobidding systems, a multi-billion-dollar industry. Performance in such settings is measured by the fundamental notion of regret.
Basic question: How many rounds are needed for players to have low regret?
Our work has provided near-optimal guarantees for this problem.
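To make the regret benchmark concrete, here is a minimal illustrative sketch (not from any of the papers below) of Hedge, the classic multiplicative-weights algorithm: against any sequence of losses in [0, 1], its average regret decays at the near-optimal rate O(√(log n / T)).

```python
import math
import random

def hedge(losses, eta):
    """Run Hedge (multiplicative weights) on a sequence of loss vectors.

    Returns the external regret: the algorithm's total expected loss
    minus the loss of the best fixed action in hindsight.
    """
    n = len(losses[0])
    weights = [1.0] * n
    total, cum = 0.0, [0.0] * n
    for loss in losses:
        z = sum(weights)
        total += sum((w / z) * l for w, l in zip(weights, loss))  # expected loss
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, loss)]
        for a in range(n):
            cum[a] += loss[a]
    return total - min(cum)

random.seed(0)
T, n = 10_000, 2
losses = [[random.random() for _ in range(n)] for _ in range(T)]
eta = math.sqrt(8 * math.log(n) / T)  # standard tuning for a known horizon T
avg_regret = hedge(losses, eta) / T   # guaranteed O(sqrt(log(n) / T))
```

The loss sequence here is synthetic; the point is only that the per-round regret vanishes as the number of rounds grows, which is exactly the quantity the basic question above asks about.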
Algorithmic results
- First near-optimal guarantees for swap regret in general-sum multi-player games (STOC 2022). Swap regret is a formidable performance benchmark associated with correlated equilibria, a fundamental solution concept in economics.
- A powerful framework for obtaining logarithmic swap regret in multi-player games (NeurIPS 2022, oral). We were able to extend this framework to general convex games (NeurIPS 2022), where players can have complex strategies, as well as sequential games with imperfect information, also known as extensive-form games (EC 2022 and ICML 2023).
- Efficient online learning algorithms in extensive-form games for the best achievable notion of regret (EC 2025).
- A class of learning algorithms can attain an infinite rate of convergence in general-sum games (NeurIPS 2022).
Complexity lower bounds
- First lower bounds for self-play in extensive-form games, establishing a super-polynomial separation vis-à-vis normal-form games (ITCS 2024). This paper also shows the first lower bounds for seminal algorithms such as optimistic hedge in self-play.
- Sum-of-squares lower bounds for self-play in normal-form games and tight lower bounds for no-regret learning with non-trivial welfare guarantees (STOC 2025).
Convergence of learning dynamics in games and interplay with optimization
The convergence of no-regret dynamics in multiagent settings is a major question with diverse applications ranging from reinforcement learning to supervised learning—adversarial training is a prime example—and generative models such as GANs. It turns out that classic results from the single-agent setting often fail to transfer in the presence of multiple players. For example, gradient descent—the workhorse of modern machine learning—does not converge even in zero-sum games!
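This failure shows up already in the simplest bilinear zero-sum game, min_x max_y xy: a short illustrative computation (step size and starting point are arbitrary) shows the iterates of simultaneous gradient descent/ascent spiraling away from the unique equilibrium (0, 0).

```python
# Simultaneous gradient descent/ascent on the zero-sum game min_x max_y x*y.
# The unique equilibrium is (0, 0), yet each step multiplies the distance
# to it by exactly sqrt(1 + eta^2), so the iterates spiral outward.
eta = 0.1
x, y = 1.0, 1.0
dist = [(x * x + y * y) ** 0.5]
for _ in range(100):
    x, y = x - eta * y, y + eta * x   # simultaneous (not alternating) updates
    dist.append((x * x + y * y) ** 0.5)
final_dist = dist[-1]
```

The divergence is easy to verify by hand: the update is the linear map (x, y) ↦ (x − ηy, y + ηx), whose norm-growth factor is √(1 + η²) > 1 per round.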
Basic questions: When do no-regret dynamics converge? If they converge, how fast do they do so? If not, what can be said about their long-term behavior?
Our work has provided powerful new tools and results addressing these questions.
- A framework for characterizing the convergence of a broad class of no-regret dynamics in multi-player games (ICML 2022). This paper also shows that when such algorithms fail to converge, they yield improved social welfare guarantees!
- Convergence of regret matching(+) in constrained optimization and potential games (arXiv 2025). This paper shows for the first time that regret matching+ is a sound first-order optimizer, whereas regret matching is exponentially slow. The regret matching family has been instrumental in AI breakthroughs in zero-sum games, but its behavior beyond that setting was a mystery.
- Scalable first-order methods converge exponentially fast in zero-sum games, with no dependence on the condition number, in the smoothed analysis setting (NeurIPS 2024). In other words, such algorithms should perform well beyond pathological instances.
- Convergence in time-varying games (NeurIPS 2023) and the meta-learning in games framework (ICLR 2023). Part of the motivation for this work stems from the ubiquitous setting in which many similar games ought to be solved. Another application is when a principal dynamically shapes a game via payments (as in mechanism design) to steer players to desired equilibria (EC 2024).
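For reference, here is a minimal, illustrative sketch of the regret matching+ update discussed above (the loss sequence is synthetic and the implementation is a simplification, not the papers' algorithm); like Hedge, it guarantees O(1/√T) average regret against any loss sequence in [0, 1].

```python
import random

def regret_matching_plus(losses):
    """Toy sketch of regret matching+ over a sequence of loss vectors in [0, 1].

    Maintains clipped cumulative regrets Q, plays proportionally to them,
    and returns the average external regret (guaranteed O(1/sqrt(T))).
    """
    n = len(losses[0])
    Q = [0.0] * n
    total, cum = 0.0, [0.0] * n
    for loss in losses:
        s = sum(Q)
        p = [q / s for q in Q] if s > 0 else [1.0 / n] * n  # play ∝ positive regrets
        exp_loss = sum(pi * li for pi, li in zip(p, loss))
        total += exp_loss
        # RM+ update: accumulate instantaneous regrets, clipping below at zero.
        Q = [max(q + exp_loss - li, 0.0) for q, li in zip(Q, loss)]
        for a in range(n):
            cum[a] += loss[a]
    T = len(losses)
    return (total - min(cum)) / T

random.seed(0)
losses = [[random.random(), random.random()] for _ in range(10_000)]
avg_regret = regret_matching_plus(losses)
```

A notable design point, and the reason the family is so popular in practice, is that the update is parameter-free: unlike Hedge, there is no step size to tune.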
Bridges between game theory and optimization
The contributions highlighted above leverage techniques from optimization to shed light on the complex behavior of simple learning algorithms in multiagent problems. Conversely, we have found that there is fertile ground in employing concepts and techniques from game theory to make progress on fundamental problems in optimization. This is exemplified by the following results.
- The notion of expected variational inequalities, a tractable solution concept for general optimization problems (ICML 2025, oral). This paper brings ideas from correlated equilibria and no-regret learning to a host of other settings beyond games. Along the way, we were also able to use the optimization viewpoint to uncover intriguing new phenomena even in games.
- The first polynomial-time algorithm for solving variational inequalities under the Minty condition (arXiv 2025). This classic condition generalizes monotonicity (itself a generalization of convexity) and goes back to the 1970s. Our algorithm combines the famous ellipsoid method with an optimistic step, a technique popularized in the context of no-regret learning in games.
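The optimistic step can be illustrated on the bilinear game min_x max_y xy, where plain gradient descent/ascent famously spirals away from the equilibrium: reusing the previous gradient as a prediction of the next one yields last-iterate convergence. A toy sketch (step size chosen for illustration, not from the paper):

```python
# Optimistic gradient descent/ascent on min_x max_y x*y: the extra
# "optimistic" correction (subtracting the previous gradient) turns the
# diverging spiral of plain gradient descent/ascent into convergence
# toward the unique equilibrium (0, 0).
eta = 0.1
x, y = 1.0, 1.0
gx_prev, gy_prev = y, x          # gradients at the initial point
for _ in range(2000):
    gx, gy = y, x                # grad_x(x*y) = y, grad_y(x*y) = x
    x, y = x - eta * (2 * gx - gx_prev), y + eta * (2 * gy - gy_prev)
    gx_prev, gy_prev = gx, gy
final_dist = (x * x + y * y) ** 0.5
```

The only change from the plain dynamics is replacing the gradient g_t with the extrapolation 2g_t − g_{t−1}; that single prediction step is what "optimism" refers to throughout this overview.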
Complexity of equilibria in games
Understanding the complexity of different equilibrium concepts in games is a central question in algorithmic game theory. Hardness results cast doubt on the plausibility of the equilibrium notion ("If your laptop cannot find it, neither can the market"). Conversely, operationalizing a solution concept hinges on the development of scalable algorithms for computing it. The most standard solution concept in noncooperative settings is the Nash equilibrium. While a series of celebrated works showed that it's hard to compute one in general games, many important questions surrounding its complexity remain open. Other equilibrium concepts that either strengthen or relax the Nash equilibrium, such as correlated equilibria, are also of central importance.
Basic question: What's the complexity of computing different solution concepts, such as Nash equilibria, in games?
Our work has made important advances in this line of research.
- A complete characterization of the complexity of Nash equilibria in adversarial team games (EC 2023 and NeurIPS 2025, spotlight). These are games featuring both competing and aligned interests, in line with many real-world applications. We also gave the first efficient algorithm in the multi-round, multi-state (i.e., Markov) setting (ICLR 2023, oral).
- The first polynomial-time algorithm for computing Nash equilibria in the class of harmonic games, and more broadly, games adhering to the Minty condition (arXiv 2025).
- A complete characterization of the complexity of equilibrium refinements in potential games (arXiv 2025). Our algorithms offer a principled way to improve the quality of equilibria and refine the convergence landscape of gradient descent.
Policy optimization for heart transplant allocation
Heart transplantation is a viable path for patients suffering from advanced heart failure, but this lifesaving option is severely limited by donor shortage. Although the current allocation policy was revised as recently as 2018, a major concern is that it does not adequately take into account pretransplant and post-transplant mortality. A key challenge is that decisions have to be made online, without knowledge of future arrivals. There is also significant selection bias in the data with which typical survival models are trained. Finally, rampant manipulability issues undermine the efficiency of the allocation system.
Basic question: How can we improve the efficiency of the allocation subject to the above challenges?
- We found that the policy currently employed in the US is highly suboptimal in terms of life years gained, as it fails to account for pretransplant and post-transplant mortality (AHA 2025). This paper also develops significantly improved non-myopic policies.
- Under assumptions validated in the historical UNOS data, we developed an online policy whose performance provably matches the omniscient one which knows the exact sequence of future arrivals (paper forthcoming).
Computational social choice
Social choice theory studies how to aggregate individual preferences into collective decisions. My research in this area centered on metric distortion, a framework which quantifies the performance of different voting rules in the presence of limited information concerning voter preferences. Think of a referendum: a voter can choose either "yes" or "no," but is unable to indicate how strongly they feel through the ballot; that is, one cannot distinguish between an indifferent voter and one who is vehemently in favor of one choice. This problem of "distortion" gets significantly more complex when there are multiple alternatives to choose from.
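To make distortion concrete, here is a small illustrative instance (the numbers are hypothetical, not from the papers): with voters and candidates embedded on a line, plurality can elect a winner whose total distance to the voters is nearly 3 times the optimum, because ballots reveal only *which* candidate is closer, not by how much.

```python
# A 1-D metric instance where plurality's distortion approaches 3:
# a narrow majority barely prefers candidate A, while a large minority
# sits exactly on candidate B.
cand = {"A": 0.0, "B": 2.0}
voters = [0.99] * 51 + [2.0] * 49     # voter positions on the line

def social_cost(c):
    """Total distance from all voters to candidate c."""
    return sum(abs(v - cand[c]) for v in voters)

# Plurality: each voter names only the nearest candidate.
votes = {"A": 0, "B": 0}
for v in voters:
    votes[min(cand, key=lambda c: abs(v - cand[c]))] += 1
winner = max(votes, key=votes.get)

# Distortion: the winner's social cost relative to the optimal candidate's.
distortion = social_cost(winner) / min(social_cost(c) for c in cand)
```

Here the 51 voters at position 0.99 are nearly indifferent but tip the election to A, even though B has far lower total cost; the resulting ratio is about 2.88.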
Our research has contributed the following to this line of work.
- Design and analysis of voting rules under various types of limited information (JAIR 2022). Check out the paper if you are curious about how the Eurovision Song Contest or Formula 1 would rank under the theoretically optimal voting rule!
- A characterization of the performance of STV, a voting rule widely used in practice (AAAI 2022). This paper tightly connects STV to the intrinsic dimensionality of the space of preferences; for instance, the caricature of "left" and "right" in the political spectrum can be thought of as a special case.
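For readers unfamiliar with STV, here is a toy single-winner sketch (the instant-runoff special case; multi-winner quotas are omitted and elimination ties are broken arbitrarily):

```python
from collections import Counter

def stv_winner(ballots):
    """Single-winner STV (instant-runoff) on full ranked ballots.

    Repeatedly eliminate the candidate with the fewest first-choice votes,
    transferring each ballot to its highest-ranked surviving candidate.
    Toy sketch: elimination ties are broken arbitrarily.
    """
    remaining = {c for ballot in ballots for c in ballot}
    while len(remaining) > 1:
        # Count each ballot toward its top surviving candidate.
        firsts = Counter(next(c for c in ballot if c in remaining)
                         for ballot in ballots)
        remaining.remove(min(remaining, key=lambda c: firsts[c]))
    return remaining.pop()

# 4 voters rank A>B>C, 3 rank B>C>A, 2 rank C>B>A.
ballots = [["A", "B", "C"]] * 4 + [["B", "C", "A"]] * 3 + [["C", "B", "A"]] * 2
winner = stv_winner(ballots)
```

In this hypothetical election, C is eliminated first (2 first-choice votes) and its ballots transfer to B, who then beats A 5 to 4, even though A led on first choices.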
Distributed algorithms
Distributed computing studies how to design fast algorithms for networks of processors that must coordinate to solve a common task with limited communication. Different models exist based on how information is exchanged. One extreme is when only adjacent nodes can communicate, but without any bandwidth constraints. Conversely, other models allow global communication but with severe bandwidth constraints. The relevance of each model depends on the application at hand.
Our work has contributed faster and often near-optimal algorithms for fundamental problems.
- Algorithms and lower bounds in the hybrid model, which combines different communication modes in line with many practical scenarios (DISC 2021).
- A universally optimal distributed algorithm for solving Laplacian systems, a problem at the heart of many breakthroughs in computer science (DISC 2022; extended abstract at PODC 2022, and invited to the journal Distributed Computing). This means that our algorithm is essentially optimal for any possible network topology.
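To give a flavor of the problem itself (setting aside the distributed model entirely), a Laplacian system Lx = b asks for node potentials that induce a prescribed net flow at each node of a graph. A minimal centralized sketch, using conjugate gradient on a 4-node path graph; the distributed algorithm in the paper is far more sophisticated:

```python
def laplacian_matvec(x):
    """Apply the graph Laplacian L = D - A of the path graph 0-1-2-3."""
    n = len(x)
    return [((i > 0) + (i < n - 1)) * x[i]
            - (x[i - 1] if i > 0 else 0.0)
            - (x[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]

def conjugate_gradient(matvec, b, iters=100, tol=1e-12):
    """Standard conjugate gradient. Laplacians are only positive
    semidefinite, but CG converges whenever b is orthogonal to the
    all-ones null space, i.e. sum(b) == 0."""
    x = [0.0] * len(b)
    r, p = b[:], b[:]
    rs = sum(v * v for v in r)
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs / sum(pi * ai for pi, ai in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        rs_new = sum(v * v for v in r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Route one unit of current from node 0 to node 3: b = e_0 - e_3.
b = [1.0, 0.0, 0.0, -1.0]
x = conjugate_gradient(laplacian_matvec, b)
# The potential drop x[0] - x[3] is the effective resistance between the
# endpoints: 3 here, since the path has three unit-resistance edges in series.
```

This electrical-flow view is exactly why Laplacian solvers underpin so many breakthroughs, from fast max-flow algorithms to graph sparsification.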
Learning theory
Learning theory provides the mathematical foundations underpinning machine learning. A pervasive challenge in modern applications is noisy data: machine learning algorithms are often trained on imperfect data on account of systematic biases or even adversarial corruptions. A central question thus is how to design algorithms robust to different types of noise.
- Noise-robust algorithms through the framework of statistical queries for a broad class of challenging noise models (AISTATS 2021). This includes the first efficient algorithms for the Tsybakov noise model.
Teaching
- LP duality and minimax theorem. Iterative methods and regret minimization. PDF
- Extensive-form games. Kuhn's theorem. Counterfactual regret minimization and regret circuits. PDF
- Learning in general-sum games: correlated equilibria and Φ-regret. The GGM framework. PDF
- Swap regret over nonlinear deviations. Expected fixed points. Ellipsoid against hope. PDF
- Faster no-regret learning dynamics and last-iterate convergence using optimism. PDF
- Deep learning in game solving: Monte Carlo CFR, Deep CFR, ESCHER. PDF
Publications
Experience
Google DeepMind
Student researcher • June 2025 – August 2025
Meta AI Research
Research intern • May 2023 – August 2023
Max Planck Institute for Informatics
Research intern • June 2021 – August 2021, July 2020 – October 2020
Academic Service
Conference reviewing: EC 2025, NeurIPS (2023, 2024, 2025), ICML (2022, 2024, 2025), ICLR (2024, 2025, 2026), AAAI (2024, 2025, 2026), AISTATS (2021, 2025, 2026)
Journal reviewing: Games and Economic Behavior (2022), Theoretical Computer Science (2023), IEEE/ACM Transactions on Networking (2023), International Journal of Approximate Reasoning (2024), IEEE Transactions on Neural Networks and Learning Systems (2024), European Journal of Operational Research (2025), Proceedings of the National Academy of Sciences (2025)
Sub-reviewing: STOC 2021, FOCS (2021, 2024), FCT 2021, MFCS 2023, CDC 2024