I am on the job market for full-time research scientist/engineer positions. Email me for my CV if interested!
Brief Bio
I am a final-year PhD candidate at the University of Central Florida, advised by Jun Wang.
I build efficient, scalable, and fault-tolerant machine learning systems (MLSys) for large-scale foundation models.
I am a member of the ML and Systems Rising Stars 2025 cohort.
| 08/2025 - 02/2026: | Student Researcher at Google. I worked on post-training optimization for Gemini on-device deployment. |
| 05/2025 - 08/2025: | Research Intern at Microsoft Research. I worked on system-driven test-time scaling for reasoning models. |
| 05/2024 - 08/2024: | Research Intern at d-Matrix. I worked on system optimization for diffusion-based video generation models. |
Selected Publications
-
GhostServe: A Lightweight Checkpointing System in the Shadow for Fault-Tolerant LLM Serving
Shakya Jayakody*, Youpeng Zhao*†, Chinmay Nehate, Jun Wang
MLSys 2026
[PDF]
[Code]
(*: equal contribution, †: project lead)
-
MeRino: Entropy-driven Design for Generative Language Models on IoT Devices
Youpeng Zhao, Ming Lin, Huadong Tang, Qiang Wu, Jun Wang
AAAI 2025
[Arxiv]
-
ALISE: Accelerating Large Language Model Serving with Speculative Scheduling
Youpeng Zhao, Jun Wang
ICCAD 2024
[Arxiv]
-
ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching
Youpeng Zhao, Di Wu, Jun Wang
ISCA 2024
[Arxiv]
Fun Facts
- I used to play the electronic keyboard and won a national prize in 2008 🎹
- I played a bit of college esports (R6 Siege) and was a founding member of the GT R6 team in 2020 ⚔️
- My Chinese name is 有朋, which means having friends from all over the world 🤓
- My English name is Kenneth, but people usually call me Ken or Kenny 😎