🤖 Current Research Project

Motivation. We hypothesize that reinforcement learning with verifiable rewards (RLVR), as currently practiced, is constrained by limited exploration.

Ongoing. We are developing a new fine-tuning algorithm that augments RLVR with a supervised fine-tuning (SFT) phase to encourage exploration.

🎖 Awards

  • 2025 Eleanor Quinlan Graduate Teaching Award, Department of Computer Science and Engineering, The Ohio State University