👋 Hi! I am a senior undergraduate at Tsinghua University and a core contributor to verl.
E-mail: tongyuxuan361@gmail.com / tongyx21@mails.tsinghua.edu.cn
I aim to build AI systems capable of independently conducting complex reasoning over long contexts, or of substantially assisting humans in doing so.
Specifically, I am interested in the following topics:
Large Language Models (LLMs)
AI for (Complex) Reasoning, such as (Advanced):
Education (e.g. Eureka Labs)
Research (e.g. SciCode-Bench)
Software Engineering (e.g. SWE-Bench)
Reinforcement Learning for Large Language Models (e.g. OpenAI o1)
Evaluation & Alignment (e.g. Scalable Oversight)
Hardware-Aware Algorithm Design (e.g. Flash Attention)
(*) denotes co-first authors
Qiying Yu*, Zheng Zhang*, Yuxuan Tong*†, Guangming Sheng*†, …, Yonghui Wu, Mingxuan Wang
Preprint, under review († denotes infrastructure lead)
[🏠 Homepage] [📝 Paper@arXiv] [🐱 Code@GitHub] [🤗 Datasets&Models@HF]
Edward Yeo*, Yuxuan Tong*, Xinyao Niu, Graham Neubig, Xiang Yue
Accepted by ICML 2025; Awarded as Best Paper by ICLR 2025 FM-Wild Workshop
[📝 Paper@arXiv]
[🧠 Publication@ICML]
[🐱 Code@GitHub]
[🤗 Datasets&Models@HF]
[🐦 Thread@X(Twitter)]
[📑 BibTeX]
Yuxuan Tong, Xiwen Zhang, Rui Wang, Ruidong Wu, Junxian He
Accepted by NeurIPS 2024
[📝 Paper@arXiv]
[🧠 Publication@NeurIPS]
[🐱 Code@GitHub]
[🤗 Datasets&Models@HF]
[🐦 Thread@X(Twitter)]
[🐶 中文博客@知乎]
[📊 Leaderboard@PapersWithCode]
[📑 BibTeX]
Jiazheng Xu*, Xiao Liu*, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, Yuxiao Dong
Accepted by NeurIPS 2023
[📝 Paper@arXiv]
[🧠 Publication@NeurIPS]
[🐱 Code@GitHub]
[🖼️ Dataset@HF]
[🤖 Model@HF]
[🐦 Thread@X(Twitter)]
[🐶 中文博客@知乎]
[📑 BibTeX]
Undergraduate (2021.09 - 2025.07, Expected)
Department of Computer Science and Technology (DCST), Tsinghua University (THU)
Bachelor's degree in progress
GPA: 3.7/4.0
2025.02 - Present
Seed-Infrastructures, ByteDance
Research intern, advised by Chi Zhang and Haibin Lin
Working on large-scale reinforcement learning infrastructure.
2024.07 - 2025.02
Language Technologies Institute (LTI), Carnegie Mellon University (CMU)
Research intern, advised by Prof. Graham Neubig and Dr. Xiang Yue.
Worked on building and understanding models capable of complex reasoning like OpenAI o1.
2023.07 - 2024.06
NLP Group, Hong Kong University of Science and Technology (HKUST-NLP)
Research intern, advised by Prof. Junxian He.
Worked on synthetic data for mathematical reasoning since 2023.12;
before that, worked on
process supervision, reward modeling, and constrained decoding for mathematical reasoning,
as well as model merging.
2022.11 - 2023.06
Research intern, advised by Prof. Jie Tang and Prof. Yuxiao Dong.
Worked on reward modeling and RLHF for text-to-image generation.
Maintainer and core contributor of the LLM RL library verl (2025.02 - Present)
Tsinghua University Research Scholarship (2023, 2024)
Tsinghua University Comprehensive Merit Scholarship (top 5% undergraduates) (2022)
Conference Reviewer: NeurIPS 2024, ICLR 2025, ICML 2025, NeurIPS 2025
Programming Languages: Python, C/C++, TypeScript/JavaScript, SpinalHDL, System Verilog, etc.
ML Libraries: PyTorch, DeepSpeed, HuggingFace, vLLM/SGLang, etc.
Tools: Git, Shell, SLURM, Linux Utilities, LaTeX, etc.
I am from Shengzhou (嵊州), Shaoxing, Zhejiang in China, which is the birthplace of Yue Opera (越剧) and also famous for delicious food like Xiaolongbao (小笼包).
I spent six years (2009-2015) of my childhood at Yaohua Primary School (耀华小学) in Tianjin, which was a very happy time.
Before entering Tsinghua University, I studied at Shengzhou High School (嵊州中学), which is located in a small town but is full of excellent teachers and classmates.