Publications
*: indicates equal contribution or alphabetical ordering.
Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon S. Du
ICLR 2024
Runlong Zhou, Zihan Zhang, Simon S. Du
ICML 2023
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
Runlong Zhou, Ruosong Wang, Simon S. Du
ICML 2023
Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization
Runlong Zhou, Zelin He, Yuandong Tian, Yi Wu, Simon S. Du
TMLR
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret
Jean Tarbouriech*, Runlong Zhou*, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric
NeurIPS 2021 (Spotlight, 3% acceptance rate)
Preprints
*: indicates equal contribution or alphabetical ordering.
The Crucial Role of Samplers in Online Direct Preference Optimization
Ruizhe Shi*, Runlong Zhou*, Simon S. Du
Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques
Natalia Zhang*, Xinqi Wang*, Qiwen Cui*, Runlong Zhou, Sham M. Kakade, Simon S. Du