Research

Distributed games with jumps: An α-potential game approach (with Xin Guo and Xinyu Li)
Submitted (2025) [Preprint]
Policy optimization for continuous-time linear-quadratic graphon mean field games (with Philipp Plank)
Submitted (2025) [Preprint]
Efficient learning for entropy-regularized Markov Decision Processes via Multilevel Monte Carlo (with Matthieu Meunier and Christoph Reisinger)
Submitted (2025) [Preprint]
Accuracy of discretely sampled stochastic policies in continuous-time reinforcement learning (with Yanwei Jia and Du Ouyang)
Revision at SIAM Journal on Control and Optimization (2025) [Preprint]
Continuous-time mean field games: a primal-dual characterization (with Xin Guo, Anran Hu, and Jiacheng Zhang)
Submitted (2025) [Preprint]
Pricing and hedging of decentralised lending contracts (with Lukasz Szpruch, Marc Sabaté Vidales, and Tanut Treetanthiploet)
Submitted (2024) [Preprint]
Continuous-time dynamic decision making with costly information (with Christoph Knochenhauer and Alexander Merkel)
Revision at Mathematics of Operations Research (2024) [Preprint]
ε-policy gradient for online pricing (with Tanut Treetanthiploet and Lukasz Szpruch)
Revision at Applied Mathematics and Optimization (2024) [Preprint]
An offline learning approach to propagator models (with Eyal Neuman and Wolfgang Stockinger)
Revision at Mathematical Finance (2023) [Preprint] [Colab Notebook]
Optimal regularity of extended mean field controls and their piecewise constant approximation (with Christoph Reisinger and Wolfgang Stockinger) (2020) [Preprint]

Mirror descent for stochastic control problems with measure-valued controls (with Bekzhan Kerimkulov, David Siska and Lukasz Szpruch)
Stochastic Processes and Their Applications, forthcoming (2025) [Preprint]
Entropy annealing for policy mirror descent in continuous time and space (with Deven Sethi and David Siska)
SIAM Journal on Control and Optimization, 63 (2025), pp. 3006-3041 [pdf] [Preprint]
A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces (with Bekzhan Kerimkulov, James-Michael Leahy, David Siska and Lukasz Szpruch)
Foundations of Computational Mathematics, Online First (2025) [pdf] [Preprint]
An α-potential game framework for N-player dynamic games (with Xin Guo and Xinyu Li)
SIAM Journal on Control and Optimization, 63 (2025), pp. 2964-3005 [pdf] [Preprint]
Statistical learning with sublinear regret of propagator models (with Eyal Neuman)
The Annals of Applied Probability, forthcoming (2025) [Preprint]
Towards an analytical framework for dynamic potential games (with Xin Guo)
SIAM Journal on Control and Optimization, 63 (2025), pp. 1213-1242 [pdf] [Preprint]
A fast iterative PDE-based algorithm for feedback controls of nonsmooth mean-field control problems (with Christoph Reisinger and Wolfgang Stockinger)
SIAM Journal on Scientific Computing, 46 (2024), pp. A2737-A2773 [pdf] [Preprint]
Exploration-exploitation trade-off for continuous-time episodic reinforcement learning with linear-convex models (with Lukasz Szpruch and Tanut Treetanthiploet)
The Annals of Applied Probability, forthcoming (2024) [Preprint]
Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems (with Michael Giegrich and Christoph Reisinger)
SIAM Journal on Control and Optimization, 62 (2024), pp. 1060-1092 [pdf] [Preprint]
Optimal scheduling of entropy regulariser for continuous-time linear-quadratic reinforcement learning (with Lukasz Szpruch and Tanut Treetanthiploet)
SIAM Journal on Control and Optimization, 62 (2024), pp. 135-166 [pdf] [Preprint]
Linear convergence of a policy gradient method for some finite horizon continuous time control problems (with Christoph Reisinger and Wolfgang Stockinger)
SIAM Journal on Control and Optimization, 61 (2023), pp. 3526-3558 [pdf] [Preprint]
A posteriori error estimates for fully coupled McKean-Vlasov forward-backward SDEs (with Christoph Reisinger and Wolfgang Stockinger)
IMA Journal of Numerical Analysis, Online First (2023) [pdf] [Preprint]
Reinforcement learning for linear-convex models with jumps via stability analysis of feedback controls (with Xin Guo and Anran Hu)
SIAM Journal on Control and Optimization, 61 (2023), pp. 755-787 [pdf] [Preprint]
Logarithmic regret for episodic continuous-time linear-quadratic reinforcement learning over a finite-time horizon (with Matteo Basei, Xin Guo and Anran Hu)
Journal of Machine Learning Research, 23 (2022), pp. 1–34 [pdf] [Preprint]
Regularity and stability of feedback relaxed controls (with Christoph Reisinger)
SIAM Journal on Control and Optimization, 59 (2021), pp. 3118–3151 [pdf] [Preprint]
A penalty scheme and policy iteration for non-local HJB variational inequalities with monotone drivers (with Christoph Reisinger)
Computers and Mathematics with Applications, 93 (2021), pp. 199-213 [pdf] [Preprint]
Rectified deep neural networks overcome the curse of dimensionality for nonsmooth value functions in zero-sum games of nonlinear stiff systems (with Christoph Reisinger)
Analysis and Applications, 18 (2020), pp. 951-999 [Preprint]
A neural network based policy iteration algorithm with global $H^2$-superlinear convergence for stochastic games on domains (with Kazufumi Ito and Christoph Reisinger)
Foundations of Computational Mathematics, 21 (2021), pp. 331–374 [pdf]
Error estimates of penalty schemes for quasi-variational inequalities arising from impulse control problems (with Christoph Reisinger)
SIAM Journal on Control and Optimization, 58 (2020), pp. 243-276 [pdf]
A penalty scheme for monotone systems with interconnected obstacles: convergence and error estimates (with Christoph Reisinger)
SIAM Journal on Numerical Analysis, 57 (2019), pp. 1625-1648 [pdf]
Approximation schemes for mixed optimal stopping and control problems with nonlinear expectations and jumps (with Roxana Dumitrescu and Christoph Reisinger)
Applied Mathematics & Optimization, 83 (2021), pp. 1387–1429 [pdf]

Understanding Deep Architectures with Reasoning Layer (with Xinshi Chen, Christoph Reisinger and Le Song)
Advances in Neural Information Processing Systems (NeurIPS) (2020) [Preprint]

Insurance pricing on price comparison websites via reinforcement learning (with Tanut Treetanthiploet, Lukasz Szpruch, Isaac Bowers-Barnard, Henrietta Ridley, James Hickey and Chris Pearce) (2023) [Preprint]
Path regularity of coupled McKean-Vlasov FBSDEs (with Christoph Reisinger and Wolfgang Stockinger) (2020) [Preprint]

Office: 803, Weeks Building,
16-18 Prince’s Gardens
South Kensington Campus

Mail: Department of Mathematics
180 Queen's Gate
South Kensington Campus
Imperial College London
LONDON, SW7 2AZ

Email: yufei.zhang@imperial.ac.uk
