Bio
You can find a PDF version of my full CV on the right.
General Information
Name | Minglai Yang |
Title | Final-year CS undergraduate, University of Arizona |
Email | mingly@arizona.edu |
Phone | +1 (240) 453 1294 |
Website | https://ymingl.com/ |
Location | Tucson, AZ, USA |
About me | I am a final-year undergraduate in the Computer Science department at the University of Arizona. My research focuses on the fundamental principles and mechanisms of large language models (LLMs), aiming to understand their reasoning processes, improve their robustness, and develop more efficient inference techniques. My recent work includes uncovering universal laws of LLM reasoning under distracting context, advancing LLM-guided reinforcement learning, and accelerating LLM inference with speculative decoding. I am passionate about bridging theory and practice in AI, and actively contribute to research at the intersection of language modeling, machine learning, and cognitive science. |
Education
-
2023.08 – 2025.12 Tucson, AZ, USA
B.S. in Computer Science
University of Arizona - Expected graduation December 2025 (on track to complete B.S. in 2.5 years).
- Relevant courses: see the webpage I built for my coursework.
-
2016.09 – 2023.05 Shanghai, China
Secondary Education
Shanghai Nanyang Model High School - Graduated with distinction; awarded for excellence in science, technology, and the arts.
Experience
-
2025.05 - 2025.08 Beijing, China
Research Intern
Tsinghua University, THUKEG - Advised by Dr. Juanzi Li and mentored by Zijun Yao.
- Conducted research on mechanistic interpretability and LLM reasoning.
-
2024.10 – Present Tucson, AZ
Undergraduate Research Assistant
The University of Arizona, CLU-LAB - Advised by Dr. Liangming Pan, Dr. Mihai Surdeanu.
- Explored the physics of language models, focusing on how LLMs reason under distracting context and on identifying universal principles for robust reasoning. (First Author, Under Submission to EMNLP 2025, Score 4/4/3.5)
- Advanced LLM-guided reinforcement learning, improving data efficiency and generalization, in collaboration with Dr. Chicheng Zhang. (Second Author, Under Submission to NeurIPS 2025)
- Developed speculative decoding with a copying mechanism, boosting decoding speed 4x. (Second Author, Under Submission to EMNLP 2025, Score 4/2.5/2.5)
-
2023.10 – Present Tucson, AZ
Undergraduate Research Assistant
The University of Arizona, ML4AI-LAB and IVILAB - Advised by Dr. Kobus Barnard, Dr. Adarsh Pyarelal.
- Developing dynamic Bayesian networks and Theory of Mind models for the ToMCAT project, focusing on modeling human coordination and detecting genuine interpersonal synchrony.
-
2024.01 – 2024.05 Tucson, AZ
Undergraduate Research Assistant
The University of Arizona, HDC-LAB - Advised by Dr. Reyan Ahmed, Dr. Stephen Kobourov.
- Conducted deep learning research on graph drawing, focusing on evaluating and interpreting the behavior of Graph Neural Networks.
-
2024.05 – 2024.08 Kensington, MD
Machine Learning Engineer Intern (Full Time)
Coretechs Consulting Inc. - Built GPT-powered Slack bots using retrieval-augmented generation (RAG) and integrated Slack-based live chat into the company's official website.
- Orchestrated AWS deployment of the website and simulated 500+ concurrent users with Selenium for load testing.
- Received the Best Intern Award.
-
2024.02 – 2024.03 Tucson, AZ
Team Leader, Mathematical Contest in Modeling (MCM)
University of Arizona - Advised by Dr. Patrick Shipman.
- Led a team for the 2024 MCM, overseeing project coordination, modeling, and paper writing.
Publications
-
2025 How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark
- Ran controlled experiments showing that irrelevant extra sentences consistently degrade LLM reasoning in a predictable pattern.
- Found that harder (deeper, multi-step) problems are more easily distracted, with errors arising both from choosing incorrect reasoning paths and from arithmetic mistakes.
- Developed a “hard distractor” training regimen that noticeably increases robustness, even on out-of-distribution problems.
- Added a reward-guided stepwise Tree-of-Thought that yields up to a 6.29% accuracy improvement on challenging, out-of-distribution cases.
-
2025 Improving the Data-efficiency of Reinforcement Learning by Warm-starting with LLM
- Proposed a method to improve RL data-efficiency by leveraging LLMs for warm-starting, achieving significant gains in learning speed and performance.
-
2025 CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality
- Developed a method enabling LLMs to speculatively copy repeated outputs, reducing unnecessary computation.
- Introduced the MT-Redundant dataset to benchmark LLM performance on follow-up turns with repeated content.
- Achieved up to 3.08x speed-up on conversational benchmarks and a 49% additional speed-up over speculative decoding, with no extra memory requirements.
Honors and Awards
-
2025 - Galileo Circle Scholar, College of Science, University of Arizona
-
2024 - Mathematical Contest in Modeling (MCM) Award, Consortium for Mathematics and Its Applications (COMAP)
-
2023 - Highest Academic Distinction (Dean's List), University of Arizona
- Global Wildcat Scholarship, University of Arizona
-
2020 - Shanghai Youth Science and Technology Innovation Competition (Third Prize), Shanghai Nanyang Model High School
-
2018 - China National Youth Arts Competition (Group First Prize), Shanghai Nanyang Model High School
Leadership
-
2024.08 - Present Tucson, AZ
President
AI Club at University of Arizona - Organized workshops on AI agents and LLM applications, including a Hack Arizona event with 100+ participants.
- Led weekly lectures and invited speaker sessions to foster AI learning across disciplines.
- Organized a reading group that mentors students interested in AI research.
- Raised over $12,000 in sponsorships to support club activities and student research projects.
Teaching
-
2024.09 - 2025.05 Tucson, AZ
Instructor, Math for AI Workshop Series
AI Club, University of Arizona
-
2025.01 - 2025.05 Tucson, AZ
TA, CSC-144: Discrete Mathematics
Computer Science, University of Arizona
Reviewing
-
2024.05 - 2024.06 International
Reviewer
ICML AI4MATH Workshop - Served as a reviewer for the ICML 2024 AI4MATH Workshop, evaluating submissions on reinforcement learning for LLM post-training and decoding mechanisms.