Publications

* indicates equal contribution or alphabetical ordering.

Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning

Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon S. Du

ICLR 2024


Understanding Curriculum Learning in Policy Optimization for Online Combinatorial Optimization

Runlong Zhou, Zelin He, Yuandong Tian, Yi Wu, Simon S. Du

TMLR


Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

Jean Tarbouriech*, Runlong Zhou*, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric

NeurIPS 2021 (Spotlight, 3% acceptance rate)