Hi, I’m Shiyang Li (李世阳)
About Me
I’m a PhD student in the Department of Computer Science & Engineering at the University of Minnesota-Twin Cities. I work with Prof. Caiwen Ding and Prof. Pen-Chung Yew on research in GPU programming, computer architecture, and agentic machine learning systems.
I was fortunate to be advised by Prof. Xiaoli Gong during my B.S. and M.S. studies at Nankai University, where I spent seven wonderful years.
Research Interests
My research focuses on computer architecture and GPU programming, with an emphasis on:
- GPU memory management
- High-efficiency CUDA kernel design
- CPU-GPU heterogeneous computing
Recently, I have been working on LLM training/inference optimization on GPUs, agentic MLSys, and agentic EDA frameworks.
I’m looking for motivated undergraduate/graduate interns interested in coding agent design, CUDA programming, and agentic MLSys. If interested, please email me at li004074@umn.edu.
Publications
GSR-GNN: Training Acceleration and Memory-Saving Framework of Deep GNNs on Circuit Graph. Yuebo Luo, Shiyang Li, Yifei Feng, Vishal Kancharla, Shaoyi Huang, Caiwen Ding. In Proceedings of the 63rd ACM/IEEE Design Automation Conference (DAC ’26). June 2026, San Francisco, USA.
StitchCUDA: An Automated Multi-Agents End-to-End GPU Programming Framework with Rubric-based Agentic Reinforcement Learning. Shiyang Li, Zijian Zhang, Winson Chen, Yuebo Luo, Mingyi Hong, Caiwen Ding. arXiv preprint arXiv:2603.02637. March 2026.
XuanJia: A Comprehensive Virtualization-Based Code Obfuscator for Binary Protection. Xianyu Zou, Xiaoli Gong, Jin Zhang, Shiyang Li, Pen-Chung Yew. arXiv preprint arXiv:2601.10261. January 2026.
CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization. Zijian Zhang, Rong Wang, Shiyang Li, Yuebo Luo, Mingyi Hong, Caiwen Ding. arXiv preprint arXiv:2511.01884. October 2025.
DR-CircuitGNN: Training Acceleration of Heterogeneous Circuit Graph Neural Network on GPUs. Yuebo Luo, Shiyang Li, Junran Tao, Kiran Gautam Thorat, Xi Xie, Hongwu Peng, Nuo Xu, Caiwen Ding, Shaoyi Huang. In Proceedings of the 39th ACM International Conference on Supercomputing (ICS ’25). June 2025, Salt Lake City, USA.
Liberator: A Data Reuse Framework for Out-of-Memory Graph Computing on GPUs. Shiyang Li, Ruiqi Tang, Jingyu Zhu, Ziyi Zhao, Xiaoli Gong, Wenwen Wang, Jin Zhang, Pen-Chung Yew. IEEE Transactions on Parallel and Distributed Systems (TPDS) 34.6 (2023): 1954-1967.
OneGraph: A Cross-Architecture Framework for Large-Scale Graph Computing on GPUs Based on oneAPI. Shiyang Li, Jingyu Zhu, Jiaxun Han, Yuting Peng, Zhuoran Wang, Xiaoli Gong, Gang Wang, Jin Zhang, Xuqiang Wang. CCF Transactions on High Performance Computing (CCF-THPC) 6.2 (2024): 179-191.
Services
Teaching Assistant
- Computer Architecture (CSCI 4203), Fall 2025, University of Minnesota-Twin Cities
- Introduction to Parallel Programming, Spring 2022, Nankai University
- Principles of Computer Organization, Spring 2023, Nankai University
- Operating System, Fall 2020, 2021, and 2022, Nankai University
