Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning

RL Theory

Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon S. Du

ICLR 2024

We study the strengths and weaknesses of return-conditioned supervised learning (RCSL) and propose a model-based variant that enables trajectory stitching and improves empirical performance.