Events
CILVR Seminar: From Health AI to Table Foundation Models; and back?
Speaker: Gaël Varoquaux (Inria)
Location: 60 Fifth Avenue, 7th floor open space
Videoconference link: https://nyu.zoom.us/j/92777504824
Date: Monday, March 16, 2026
Electronic health records open amazing opportunities for AI in health because they are close to routine care and have great population and time coverage. This data is typical of complex organizations: tabular data, spread across many tables, with heterogeneous columns mixing strings and numbers. It requires tedious preparation before it can be fed to a learner, and until recently deep learning did not help with feature engineering, as it did not outperform gradient-boosted trees.
To facilitate AI in such settings, we started a research program inspired by the successes of pre-trained language models: baking as much background information as possible into foundation models. With adequate architectures, this leads to models that markedly improve learning on tables, either by jointly modeling strings and numbers to contextualize the data [1], or with pretrained in-context learning on numbers [2].
But many challenges remain open to transform improved prediction into better health: operationalizing the models, distribution shifts, causality, censoring...
[1] M. J. Kim, L. Grinsztajn, G. Varoquaux. "CARTE: Pretraining and Transfer for Tabular Learning." ICML 2024
[2] J. Qu, D. Holzmueller, G. Varoquaux, M. Le Morvan. "TabICL: A Tabular Foundation Model for In-Context Learning on Large Data." ICML 2025
Bio: Gaël Varoquaux is a research director working on data science at Inria (the French national research institute for computer science), where he leads the Soda team. He is also co-founder and chief science officer of Probabl.
Varoquaux's research covers fundamentals of artificial intelligence, statistical learning, natural language processing, causal inference, as well as applications to health, with a current focus on public health and epidemiology. He also creates technology: he co-founded scikit-learn, one of the reference machine-learning toolboxes, and helped build various central tools for data analysis in Python.
Varoquaux has worked at UC Berkeley, McGill, and the University of Florence. He did a PhD in quantum physics supervised by Alain Aspect and is a graduate of École Normale Supérieure, Paris.