Ruokai Yin

Research Scientist, Meta

E-mail: yruokai@gmail.com

Bio

Ruokai is currently an AI Research Scientist at Meta, where he works on improving the runtime efficiency of AI models on Meta's custom silicon.

He received his Ph.D. from the Department of Electrical Engineering at Yale University in 2026, advised by Prof. Priyadarshini Panda. His thesis focused on designing efficient computer architectures, systems, and algorithms for asymmetric AI workloads, including low-precision and sparse LLMs and neuromorphic deep learning models.

Prior to that, he earned his B.S. from the University of Wisconsin-Madison in 2021, with majors in Electrical Engineering, Computer Science, and Mathematics. During his undergraduate studies, he worked with Prof. Joshua San Miguel on designing computer architectures for stochastic computing.

Albuquerque, NM, Dec 2024

(CV latest update: June 2026)

News

2026/07: » ICCAD 2026 papers accepted.
2026/05: » Selected as one of the 2026 MLCommons Rising Stars.
2026/04: » Defended my Ph.D. thesis. I will join Meta as a Research Scientist.
2025/10: » Learning to Shard selected as a Spotlight Presentation at ML4Sys, NeurIPS 2025 (Microsoft internship project).
2025/09: » NeurIPS 2025 workshop paper accepted (Microsoft internship project).
2025/09: » NeurIPS 2025 paper accepted.
2025/03: » I will join Microsoft Azure as a Research Intern, working with the AI System Architecture team (June - August).
2025/02: » DAC 2025 paper accepted.
2024/07: » MICRO 2024 paper accepted.
2024/03: » I will join Cerebras Systems as a Research Intern, working with the ASIC team (June - August).
2024/03: » IEEE TETCI paper accepted.
2023/10: » MINT nominated for best paper award at ASP-DAC 2024.
2023/09: » ASP-DAC 2024 paper accepted.

Selected Publications

(Please see the full publication list on my Google Scholar.)

DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs

Ruokai Yin, Yuhang Li, Donghyun Lee, Priyadarshini Panda

39th Conference on Neural Information Processing Systems (NeurIPS), 2025

[paper] [code]


Learning to Shard: RL for Co-optimizing the Parallelism Degrees and Per-operator Sharding Dimensions in Distributed LLM Inference

Ruokai Yin, Sattwik Deb Mishra, Xuan Zuo, Hokchhay Tann, Preyas Shah, Apala Guha

Workshop on ML for Systems, NeurIPS, 2025

Spotlight Presentation

[paper] [code]


PacQ: A SIMT Microarchitecture for Efficient Dataflow in Hyper-asymmetric GEMMs

Ruokai Yin, Yuhang Li, Priyadarshini Panda

62nd ACM/IEEE Design Automation Conference (DAC), 2025

[paper] [code]


LoAS: Fully Temporal-Parallel Dataflow for Dual-Sparse Spiking Neural Networks

Ruokai Yin, Youngeun Kim, Di Wu, Priyadarshini Panda

57th IEEE/ACM International Symposium on Microarchitecture (MICRO), 2024

[paper] [code] [slides]


MINT: Multiplier-less INTeger Quantization for Energy Efficient Spiking Neural Networks

Ruokai Yin, Yuhang Li, Abhishek Moitra, Priyadarshini Panda

29th Asia and South Pacific Design Automation Conference (ASP-DAC), 2024

Best Paper Nomination

[paper] [code] [slides]


Workload-Balanced Pruning for Sparse Spiking Neural Networks

Ruokai Yin, Youngeun Kim, Yuhang Li, Abhishek Moitra, Nitin Satpute, Anna Hambitzer, Priyadarshini Panda

IEEE Transactions on Emerging Topics in Computational Intelligence, 2024

[paper] [code]


Wearable-based Human Activity Recognition with Spatio-Temporal Spiking Neural Networks

Yuhang Li, Ruokai Yin, Hyoungseob Park, Youngeun Kim, Priyadarshini Panda

Workshop on Learning from Time Series for Health, NeurIPS, 2022

Spotlight Paper

[paper] [code]


SATA: Sparsity-Aware Training Accelerator for Spiking Neural Networks

Ruokai Yin, Abhishek Moitra, Abhiroop Bhattacharjee, Youngeun Kim, Priyadarshini Panda

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2022

[paper] [code]


uGEMM: Unary Computing Architecture for GEMM Applications

Di Wu, Jingjie Li, Ruokai Yin, Hsuan Hsiao, Younghyun Kim, Joshua San Miguel

47th Annual International Symposium on Computer Architecture (ISCA), 2020

IEEE 2020 Top Pick

[paper] [code]


Experience

May 2025-Aug 2025

Research Intern

Microsoft, Azure AI, Architecture and Systems for AI team, Mentors: Apala Guha & Xuan Zuo.

Autonomous sharding planning for large-scale distributed LLM inference systems.

Paper: https://arxiv.org/abs/2509.00217 (Accepted to Workshop on ML for Systems, NeurIPS 2025)

May 2024-Aug 2024

Research Intern

Cerebras Systems, ASIC team, Mentor: Vipin Sharma.

Architecture design and modeling for Cerebras’s next-generation wafer-scale engine.

July 2021-Present

Graduate Research Assistant

Yale University, Intelligent Computing Lab, PI: Prof. Priyadarshini Panda.

Working on projects that improve the energy efficiency of neural networks, in particular spiking neural networks.

June 2019-May 2021

Undergraduate Research Assistant

University of Wisconsin-Madison, STACS Lab, PI: Prof. Joshua San Miguel.

Worked on projects that apply unary computing to deep neural networks. Constructed a PyTorch-based library for unary computing.


Teaching

2023 Fall

Teaching Fellow, EENG 439 Neural Networks and Learning Systems

Yale University

2023 Spring

Teaching Fellow, EENG 348/CPSC 338: Digital Systems

Yale University
