CS Colloquium

Building Efficient and Scalable Machine Learning Systems

Time and Location:

March 10, 2026, 2:00 PM; 60 Fifth Avenue, Room 150

Speaker:

Qinghao Hu, Massachusetts Institute of Technology

Abstract:

The rapid evolution of foundation models is increasingly bottlenecked by a widening gap between algorithmic demands and system efficiency. As model scale and context lengths explode, infrastructure efficiency plateaus. Addressing these challenges requires a full-stack rethinking of machine learning systems. In this talk, I will present a research framework centered on algorithm–system co-design that pushes the efficiency frontier across the ML lifecycle. I first demonstrate how system-level support for algorithmic advances can drastically reduce the cost of large-scale hyperparameter exploration. Next, I tackle the long-tailed execution bottlenecks in post-training reinforcement learning, introducing a co-designed system that delivers substantial efficiency gains while preserving on-policy RL training. I then present system designs that enable vision–language models to scale to million-token contexts by resolving fundamental memory and communication constraints. I conclude by discussing the future of system support for agentic models at scale.