Events
Learning hierarchical representations to compose new data
Speaker: Matthieu Wyart
Location: 60 Fifth Avenue, 7th floor open space
Date: Thursday, May 30, 2024
Abstract: Learning generic tasks in high dimension is impossible. Yet, deep networks classify images, large models learn the structure of language and produce meaningful texts, and diffusion-based models generate new images of high quality. In all these cases, building a hierarchical representation of the data is believed to be key to success. How is it achieved? How much data is needed for that, and how does it depend on the data structure? Once such a representation is obtained, how can it be used to compose whole new data from known low-level features? I will introduce generative models of hierarchical data for which an understanding of these questions is emerging. I will discuss recent results on (i) supervised learning, (ii) next-token prediction, and (iii) score-based generative models. In the last two cases, our framework makes novel predictions that we test on both text and image datasets.