VIDA Seminar: AIRFoundry and AI-Driven Data Integration for Therapeutic Discovery

Speaker: Zachary Ives (UPenn)

Location: 370 Jay Street, Room 1114
Videoconference link: https://nyu.zoom.us/j/97228053949

Date: Monday, May 11, 2026

NYU has instituted a mandatory waiting room for all Zoom meetings. You can bypass the waiting room by logging into your NYU Zoom account before accessing the colloquium Zoom. 
 
Abstract:
Modern AI technologies (LLMs that can call out to agents) hold tremendous promise for enabling true open-world data integration. At the same time, user tasks have become much more ambitious, including problem solving, planning, and discovery tasks over arbitrary unstructured data. At the University of Pennsylvania’s AIRFoundry, we are building the infrastructure to accelerate RNA therapeutic manufacturing. This talk focuses on the KAIR (Knowledge-driven AI for Research) Assistant, a system designed to bridge the gap between unstructured literature and structured molecular tools. We will discuss our approach to structured agentic planning (treating agent actions as a query planning problem) and how we implement multi-step semantic reasoning that combines data provenance and data integration in a hybrid DBMS-LLM pipeline.
 
Work done in collaboration with Jiaming Liang, Varun Jana, Haydn Jones, Jacob Gardner, and Mark Yatskar, with funding from the National Science Foundation and the Laude Institute.
 
Bio:
Zachary Ives is the Department Chair and Adani President's Distinguished Professor of Computer and Information Science at the University of Pennsylvania. Zack's research interests include agentic data engineering, data integration and provenance, and machine learning systems. He is a recipient of the NSF CAREER Award and an alumnus of the DARPA Computer Science Study Panel and the Information Science and Technology advisory panel. He has also been awarded the Christian R. and Mary F. Lindback Foundation Award for Distinguished Teaching and an IEEE Technical Committee on Data Engineering Education Award, and he is a Fellow of the ACM. He is a co-author of the textbook Principles of Data Integration.