Xinhao Cheng


I’m a second-year Ph.D. student in the Computer Science Department at Carnegie Mellon University, affiliated with the Catalyst Group and the Parallel Data Lab. I’m fortunate to be advised by Zhihao Jia. Before this, I received my master’s degree from Carnegie Mellon University, also advised by Zhihao Jia. Before CMU, I received my B.S. degree from Dalian University of Technology.

I am interested in building efficient and scalable systems for machine learning applications.

Gates Hillman Centers, 6003

4902 Forbes Ave, Pittsburgh, PA 15213

Email: xinhaoc@cs.cmu.edu

Projects


FlexFlow Serve is a high-performance serving system that accelerates LLM inference with speculative decoding and tree-based verification.

Mirage is a superoptimizer that automatically discovers highly optimized GPU kernels for machine learning applications.

Publications


Mirage Persistent Kernel: A Compiler and Runtime for Mega-Kernelizing Tensor Programs
[Preprint] Xinhao Cheng*, Zhihao Zhang*, Yu Zhou*, Jianan Ji*, Jinchen Jiang, Zepeng Zhao, Ziruo Xiao, Zihao Ye, Yingyi Huang, Ruihang Lai, Hongyi Jin, Bohan Hou, Mengdi Wu, Yixin Dong, Anthony Yip, Songting Wang, Wenqin Yang, Xupeng Miao, Tianqi Chen, and Zhihao Jia

AdaServe: Accelerating Multi-SLO LLM Serving with SLO-Customized Speculative Decoding
[EuroSys 2026] Zikun Li*, Zhuofu Chen*, Remi Delacourt, Gabriele Oliaro, Zeyu Wang, Qinghan Chen, Shuhuai Lin, April Yang, Zhihao Zhang, Zhuoming Chen, Sean Lai, Xinhao Cheng, Xupeng Miao, and Zhihao Jia

Mirage: A Multi-Level Superoptimizer for Tensor Programs
[OSDI 2025] Mengdi Wu, Xinhao Cheng, Shengyu Liu, Chunan Shi, Jianan Ji, Kit Ao, Praveen Velliengiri, Xupeng Miao, Oded Padon, and Zhihao Jia

FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
[NSDI 2026] Xupeng Miao, Gabriele Oliaro, Xinhao Cheng, Mengdi Wu, Colin Unger, and Zhihao Jia

Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
[ACM Computing Surveys 2025] Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, and Zhihao Jia

SpecInfer: Accelerating Generative Large Language Model Serving with Speculative Inference and Token Tree Verification
[ASPLOS 2024] Xupeng Miao*, Gabriele Oliaro*, Zhihao Zhang*, Xinhao Cheng*, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, and Zhihao Jia