CILVR seminar: An Asymptotic Theory of Random Search for Hyperparameters in Deep Learning

Speaker: Nick Lourie

Location: 60 Fifth Avenue, 7th floor open space
Videoconference link: https://nyu.zoom.us/s/98988587056

Date: Wednesday, December 4, 2024

Scale is essential in modern deep learning, but scale makes experiments very expensive. Much of the cost comes from finding good hyperparameters, so we should invest only as much effort in the search as necessary. Unfortunately, deciding how much effort is enough can be difficult, since hyperparameter search is often opaque. To address this, we derive an asymptotic theory of random search. Its central result is a new limit theorem that explains random search in terms of four interpretable quantities: the effective number of hyperparameters, the variance due to random seeds, the concentration of probability around the optimum, and the performance of the best hyperparameters. These four quantities parametrize a new probability distribution, the noisy quadratic, which characterizes the behavior of random search. Once fitted, each parameter of the noisy quadratic answers an important question, such as what the best possible performance is. We test the theory in three practical deep learning scenarios, including pretraining in vision and fine-tuning in language. The theory achieves an excellent fit and lets you infer how performance might improve as the search continues. Join us to learn more about this theory and how to apply it in your own experiments!
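
For intuition ahead of the talk, here is a minimal sketch of the setting the theory studies: random search draws hyperparameters i.i.d., evaluates each configuration (with noise from the random seed), and reports the best score seen so far; the limit theorem describes how that best-so-far score behaves as the number of trials grows. The quadratic objective, noise level, and search space below are illustrative assumptions for this sketch, not the speaker's actual experiments or code.

```python
# Illustrative sketch only: random search as the running maximum of i.i.d. trials.
# The objective, noise scale, and search bounds are assumed for illustration.
import numpy as np

rng = np.random.default_rng(0)

def evaluate(hparams, noise_std=0.02):
    """Hypothetical validation score: a quadratic peak at the optimum plus seed noise."""
    optimum = np.array([0.3, -0.1, 0.5])       # assumed best hyperparameters
    best_possible = 0.92                       # assumed best achievable score
    curvature = 1.5                            # assumed concentration around the optimum
    score = best_possible - curvature * np.sum((hparams - optimum) ** 2)
    return score + rng.normal(0.0, noise_std)  # variance due to random seeds

n_trials = 200
scores = []
for _ in range(n_trials):
    hparams = rng.uniform(-1.0, 1.0, size=3)   # i.i.d. draws from the search space
    scores.append(evaluate(hparams))

# Random search reports the best score found after each trial; the talk's limit
# theorem characterizes the distribution of this running maximum as trials grow.
best_so_far = np.maximum.accumulate(scores)
for n in (1, 10, 50, 100, 200):
    print(f"best score after {n:3d} trials: {best_so_far[n - 1]:.3f}")
```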