👋 Hi! I am a senior undergraduate at Tsinghua University and a core contributor to verl.
E-mail: tongyuxuan361@gmail.com / tongyx21@mails.tsinghua.edu.cn
I aim to build AI systems capable of independently conducting complex reasoning over long contexts, or of substantially assisting humans in doing so.
Specifically, I am interested in the following topics:
Large Language Models (LLMs)
AI for (Complex) Reasoning, such as (Advanced):
Education (e.g. Eureka Labs)
Research (e.g. SciCode-Bench)
Software Engineering (e.g. SWE-Bench)
Reinforcement Learning for Large Language Models (e.g. OpenAI o1)
Evaluation & Alignment (e.g. Scalable Oversight)
Hardware-Aware Algorithm Design (e.g. Flash Attention)
(*) denotes co-first authors
Qiying Yu*, Zheng Zhang*, Yuxuan Tong*†, Guangming Sheng*†, …, Yonghui Wu, Mingxuan Wang
Preprint, under review († denotes infrastructure lead)
[🏠 Homepage] [📝 Paper@arXiv] [🐱 Code@GitHub] [🤗 Datasets&Models@HF]
Edward Yeo*, Yuxuan Tong*, Xinyao Niu, Graham Neubig, Xiang Yue
Accepted by ICML 2025; Awarded as Best Paper by ICLR 2025 FM-Wild Workshop
[📝 Paper@arXiv]
[🧠 Publication@ICML]
[🐱 Code@GitHub]
[🤗 Datasets&Models@HF]
[🐦 Thread@X(Twitter)]
[📑 BibTeX]
Yuxuan Tong, Xiwen Zhang, Rui Wang, Ruidong Wu, Junxian He
Accepted by NeurIPS 2024
[📝 Paper@arXiv]
[🧠 Publication@NeurIPS]
[🐱 Code@GitHub]
[🤗 Datasets&Models@HF]
[🐦 Thread@X(Twitter)]
[🐶 中文博客@知乎]
[📊 Leaderboard@PapersWithCode]
[📑 BibTeX]
Jiazheng Xu*, Xiao Liu*, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, Yuxiao Dong
Accepted by NeurIPS 2023
[📝 Paper@arXiv]
[🧠 Publication@NeurIPS]
[🐱 Code@GitHub]
[🖼️ Dataset@HF]
[🤖 Model@HF]
[🐦 Thread@X(Twitter)]
[🐶 中文博客@知乎]
[📑 BibTeX]
Undergraduate (2021.09 - 2025.07, Expected)
Department of Computer Science and Technology (DCST), Tsinghua University (THU)
Bachelor's degree in progress
GPA: 3.7/4.0
2025.02 - Present
Seed-Infrastructures, ByteDance
Research intern, advised by Chi Zhang and Haibin Lin
Working on large-scale reinforcement learning infrastructure.
2024.07 - 2025.02
Language Technologies Institute (LTI), Carnegie Mellon University (CMU)
Research intern, advised by Prof. Graham Neubig and Dr. Xiang Yue.
Worked on building and understanding models capable of complex reasoning like OpenAI o1.
2023.07 - 2024.06
NLP Group, Hong Kong University of Science and Technology (HKUST-NLP)
Research intern, advised by Prof. Junxian He.
Worked on synthetic data for mathematical reasoning since 2023.12;
before that, worked on
process supervision, reward modeling, and constrained decoding for mathematical reasoning,
as well as model merging.
2022.11 - 2023.06
Research intern, advised by Prof. Jie Tang and Prof. Yuxiao Dong.
Worked on reward modeling and RLHF for text-to-image generation.
Maintainer and core contributor of the LLM RL library verl (2025.02 - Present)
Tsinghua University Research Scholarship (2023, 2024)
Tsinghua University Comprehensive Merit Scholarship (top 5% undergraduates) (2022)
Conference Reviewer: NeurIPS 2024, ICLR 2025, ICML 2025, NeurIPS 2025
Programming Languages: Python, C/C++, TypeScript/JavaScript, SpinalHDL, System Verilog, etc.
ML Libraries: PyTorch, DeepSpeed, HuggingFace, vLLM/SGLang, etc.
Tools: Git, Shell, SLURM, Linux Utilities, LaTeX, etc.
I am from Shengzhou (嵊州), Shaoxing, Zhejiang in China, which is the birthplace of Yue Opera (越剧) and also famous for delicious food like Xiaolongbao (小笼包).
I spent six years (2009-2015) of my childhood at Yaohua Primary School (耀华小学) in Tianjin, which was a very happy time.
Before entering Tsinghua University, I studied at Shengzhou High School (嵊州中学), which is located in a small town but is full of excellent teachers and classmates.