Generative diffusion models: optimization, generalization and fine-tuning

Speaker: Renyuan Xu

Location: 60 Fifth Avenue, Room 150

Date: Thursday, November 7, 2024

Recently, generative diffusion models have outperformed previous architectures, such as GANs, in generating high-quality synthetic data, setting a new standard for generative AI. A key component of these models is learning the score function (the Stein score) of the underlying data distribution. Although diffusion models have demonstrated practical success, their theoretical foundations are far from mature, especially regarding whether gradient-based algorithms can provably learn the score function. In this talk, I will present a suite of non-asymptotic theory aimed at understanding the data generation process in diffusion models and the accuracy of score estimation. Our analysis addresses both the optimization and generalization aspects of the learning process, establishing a novel connection to supervised learning and neural tangent kernels.
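
For readers new to the terminology: the score is the gradient of the log-density of the noised data, and in practice it is learned by denoising score matching. A standard (DDPM-style) version of this objective is shown below for illustration only; it is not necessarily the exact formulation analyzed in the talk.

```latex
% Score of the noised marginal p_t:  s(x, t) = \nabla_x \log p_t(x).
% Denoising score matching regresses a network s_\theta onto the
% conditional score \nabla_{x_t} \log p(x_t \mid x_0), which is
% available in closed form under Gaussian noising:
\[
  \min_\theta \;
  \mathbb{E}_{t,\, x_0,\, \varepsilon}
  \Big\| s_\theta(x_t, t) + \tfrac{\varepsilon}{\sqrt{1-\bar\alpha_t}} \Big\|^2 ,
  \qquad
  x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1-\bar\alpha_t}\, \varepsilon ,
  \quad \varepsilon \sim \mathcal{N}(0, I).
\]
```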
Building on these theoretical insights, the second part of the talk turns to a further challenge: fine-tuning pre-trained diffusion models for specific tasks or datasets to improve performance. Fine-tuning requires refining the generated outputs to match particular conditions or human preferences while leveraging the prior knowledge encoded in the pre-trained model. We formulate fine-tuning as a stochastic control problem, establish its well-posedness through the Dynamic Programming Principle, and prove convergence of an iterative Bellman scheme.
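
As a rough sketch of the control viewpoint, in generic notation chosen for illustration (the drift b, penalty weight \lambda, and terminal reward g are placeholders, not objects from the talk): the pre-trained reverse-time dynamics are perturbed by a control u, and fine-tuning maximizes a reward while penalizing deviation from the pre-trained model.

```latex
% Controlled generation dynamics: the control u steers the
% pre-trained drift b of the reverse-time process.
\[
  dX_t = \big( b(X_t, t) + u(X_t, t) \big)\, dt + \sigma(t)\, dW_t ,
\]
% Value function: reward g on the terminal sample, with a quadratic
% running cost that keeps the fine-tuned model close to the original.
\[
  V(x, s) = \sup_{u} \;
  \mathbb{E}\Big[ \, g(X_T) - \lambda \int_s^T \| u(X_t, t) \|^2 \, dt
  \;\Big|\; X_s = x \Big].
\]
% The Dynamic Programming Principle characterizes V as a fixed point
% of the one-step Bellman operator; an iterative Bellman scheme
% applies this operator repeatedly, converging to V under suitable
% regularity conditions.
```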