Generalization properties of deep convolutional networks in kernel regimes

Speaker: Alessandro Favero

Location: 60 Fifth Avenue, Room 650

Date: Friday, March 10, 2023

Understanding how convolutional neural networks (CNNs) can efficiently learn high-dimensional functions remains a fundamental challenge. A popular belief is that these models harness the local and hierarchical structure of natural data such as images. Yet we lack a quantitative understanding of how such structure affects their performance, i.e., the rate at which the generalization error decays with the number of training examples. In this talk, I will focus on deep CNNs in the kernel regime and show that the spectrum of the corresponding kernel inherits the hierarchical structure of the network. I will then combine this result with classical generalization bounds to prove that deep CNNs adapt to the spatial scale of the target function. In particular, if the target function depends on low-dimensional subsets of adjacent input variables, the rate of decay of the error is controlled by the effective dimensionality of these subsets rather than by the full input dimensionality. Overall, this result demonstrates a separation between CNNs and fully connected networks, and between deep and shallow models.