Bio
You can find a PDF version of my full CV on the right.
General Information
Name | Minglai Yang |
Title | Final-year CS undergraduate, University of Arizona |
Email | mingly@arizona.edu |
Phone | +1 (240) 453 1294 |
Website | https://ymingl.com/ |
Location | Tucson, AZ, USA |
About me | I am a final-year undergraduate in the Computer Science department at the University of Arizona. My research focuses on the fundamental principles and mechanisms of large language models (LLMs), aiming to understand their reasoning processes, improve their robustness, and develop more efficient inference techniques. My recent work includes uncovering universal laws of LLM reasoning under distracting context, advancing LLM-guided reinforcement learning, and accelerating LLM inference with speculative decoding. I am passionate about bridging theory and practice in AI, and actively contribute to research at the intersection of language modeling, machine learning, and cognitive science. |
Education
-
2023.08 – 2025.12 Tucson, AZ, USA
B.S. in Computer Science
University of Arizona - Expected graduation December 2025 (on track to complete B.S. in 2.5 years).
- Relevant courses: see the webpage I built for my coursework.
-
2016.09 – 2023.05 Shanghai, China
Secondary Education
Shanghai Nanyang Model High School - Graduated with distinction; awarded for excellence in science, technology, and the arts.
Experience
-
2025.05 - 2025.08 Beijing, China
Research Intern
Tsinghua University, THUKEG - Advised by Dr. Juanzi Li and mentored by Zijun Yao.
- Conducted research on mechanistic interpretability and LLM reasoning.
-
2024.10 – Present Tucson, AZ
Undergraduate Research Assistant
The University of Arizona, CLU-LAB - Advised by Dr. Liangming Pan, Dr. Mihai Surdeanu.
- Explored the physics of language models, focusing on how LLMs reason under distracting context and on identifying universal principles for robust reasoning. (First Author, Under Submission to EMNLP 2025, Score 4/4/3.5)
- Advanced LLM-guided reinforcement learning, improving data efficiency and generalization, in collaboration with Dr. Chicheng Zhang. (Second Author, Under Submission to NeurIPS 2025)
- Developed speculative decoding with a copying mechanism, boosting decoding speed 4x. (Second Author, Under Submission to EMNLP 2025, Score 4/2.5/2.5)
-
2023.10 – Present Tucson, AZ
Undergraduate Research Assistant
The University of Arizona, ML4AI-LAB and IVILAB - Advised by Dr. Kobus Barnard, Dr. Adarsh Pyarelal.
- Developing dynamic Bayesian networks and Theory of Mind models for the ToMCAT project, focusing on modeling human coordination and detecting genuine interpersonal synchrony.
-
2024.01 – 2024.05 Tucson, AZ
Undergraduate Research Assistant
The University of Arizona, HDC-LAB - Advised by Dr. Reyan Ahmed, Dr. Stephen Kobourov.
- Conducted deep learning research on graph drawing, focusing on evaluating and interpreting the behavior of Graph Neural Networks.
-
2024.05 – 2024.08 Kensington, MD
Machine Learning Engineer Intern (Full Time)
Coretechs Consulting Inc. - Built GPT-powered Slack bots using retrieval-augmented generation (RAG) and integrated Slack-based live chat into the company's official website.
- Orchestrated AWS deployment of the website and simulated 500+ concurrent users with Selenium for load testing.
- Received the Best Intern Award.
-
2024.02 – 2024.03 Tucson, AZ
Team Leader, Mathematical Contest in Modeling (MCM)
University of Arizona - Advised by Dr. Patrick Shipman.
- Led a team for the 2024 MCM, overseeing project coordination, modeling, and paper writing.
Publications
-
2025 How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark
- Ran controlled experiments showing that irrelevant extra sentences consistently degrade LLM reasoning in a predictable pattern.
- Found that harder (deeper, multi-step) problems are more easily distracted, with errors arising both from choosing incorrect reasoning paths and from arithmetic mistakes.
- Developed a “hard distractor” training regimen that noticeably increases robustness, even on out-of-distribution problems.
- Added a reward-guided stepwise Tree-of-Thought that yields up to a 6.29% accuracy improvement on challenging, out-of-distribution cases.
-
2025 Improving the Data-efficiency of Reinforcement Learning by Warm-starting with LLM
- Proposed a method to improve RL data-efficiency by leveraging LLMs for warm-starting, achieving significant gains in learning speed and performance.
-
2025 CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality
- Developed a method enabling LLMs to speculatively copy repeated outputs, reducing unnecessary computation.
- Introduced the MT-Redundant dataset to benchmark LLM performance on follow-up turns with repeated content.
- Achieved up to 3.08x speed-up on conversational benchmarks and a 49% additional speed-up over speculative decoding, with no extra memory requirements.
Honors and Awards
-
2025 - Galileo Circle Scholar, College of Science, University of Arizona
-
2024 - Mathematical Contest in Modeling (MCM) Award, Consortium for Mathematics and Its Applications (COMAP)
-
2023 - Highest Academic Distinction (Dean's List), University of Arizona
- Global Wildcat Scholarship, University of Arizona
-
2020 - Shanghai Youth Science and Technology Innovation Competition (Third Prize), Shanghai Nanyang Model High School
-
2018 - China National Youth Arts Competition (Group First Prize), Shanghai Nanyang Model High School
Leadership
-
2024.08 - Present Tucson, AZ
President
AI Club at University of Arizona - Organized workshops on AI agents and LLM applications, including a Hack Arizona event with 100+ participants.
- Led weekly lectures and invited speaker sessions to foster AI learning across disciplines.
- Organized a reading group that mentors students interested in AI research.
- Raised over $12,000 in sponsorships to support club activities and student research projects.
Teaching
-
2024.09 - 2025.05 Tucson, AZ
Instructor, Math for AI Workshop Series
AI Club, University of Arizona
-
2025.01 - 2025.05 Tucson, AZ
TA, CSC-144: Discrete Mathematics
Computer Science, University of Arizona
Reviewing
-
2024.05 - 2024.06 International
Reviewer
ICML AI4MATH Workshop - Served as a reviewer for the ICML 2024 AI4MATH Workshop, evaluating submissions on reinforcement learning for LLM post-training and decoding mechanisms.