Haisheng Chen

About Me

I am Haisheng Chen, a master’s student in ECE at UC San Diego and a research intern in Z Lab, advised by Dr. Zhijian Liu. My research centers on efficient LLMs, spanning everything from model quantization and sparse attention to building inference systems.

Before coming to UC San Diego, I spent a gap year at AMD’s Shanghai office as a full-time GPU software intern. There, I gained a deep understanding of GPU architectures and honed my skills in developing high-performance CUDA kernels. I also helped enable AMD support in the vLLM inference framework, designing Docker-based deployment pipelines and troubleshooting a variety of compatibility issues along the way.

I’m passionate about contributing to open-source LLM inference projects such as vLLM and SGLang. Like countless other community-driven initiatives, these systems thrive on collaboration from developers around the world. Beyond the thrill of low-level optimization, I find it endlessly fascinating that billions of matmuls and other computations can add up to something that feels like intelligence.

Career Goal

I will graduate from UC San Diego in December 2025 and am actively seeking roles in the LLM systems industry. I am open to opportunities based in the United States or China, and eager to contribute to efficient large-scale LLM deployment.

About This Site

This site is designed with the following goals in mind:

  1. Discuss LLM System Technical Details
    Dive into the architectures, optimizations, and implementation tricks that power large-language-model inference systems.

  2. Showcase Research Projects
