Global AI Frontier Lab: From Tokens to Coordinates: Do Molecular Models Really Understand Chemistry?

Speaker: Bing Yan

Location: 1 MetroTech Center

Date: Monday, December 1, 2025

Please RSVP by filling out this Google Form. For online attendees, a Zoom link will be sent out prior to the event.

Machine learning has accelerated molecular design, yet a fundamental question remains: Do our models understand chemistry, or merely exploit surface-level patterns? I approach this question from two directions. First, I examine representation consistency: whether models make stable predictions across different encodings of the same molecule (SMILES vs. IUPAC). Our findings reveal that large language models often rely on symbolic patterns rather than underlying chemical identity. Second, I introduce EVA-Flow, an environment-aware, equivariant model for 3D molecular conformation generation. Using flow matching and environment conditioning, EVA-Flow produces physically plausible conformations in vacuum, solvent, crystals, and protein binding pockets, capturing key molecule-environment interactions. I conclude by showing how consistency analysis and geometry-based generation together outline a path toward chemically grounded and reliable molecular ML systems.

Bio: Bing Yan is a PhD candidate in computer science at New York University, advised by Prof. Kyunghyun Cho. She is also a visiting researcher at Meta, mentored by Ricky Chen and Ben Miller. Her research focuses on machine learning for molecular systems, specifically how models can evaluate chemical properties, reason consistently across representations, and generate realistic 3D structures. Prior to NYU, she earned a PhD in chemistry from MIT, where she worked on electrocatalysis and materials design. Her long-term vision is to build ML systems that not only predict molecules but also truly understand chemical behavior.