Bayesian Machine Learning
ORIE 6741
Fall 2016
Course Information
Title: Bayesian Machine Learning
Course Number: ORIE 6741
Semester: Fall 2016
Times: Tu/Th 11:40 am - 12:55 pm
Room: Hollister Hall 320
Course Syllabus [PDF]
Instructor
Andrew Gordon Wilson
Assistant Professor
Rhodes Hall 235
Website: https://people.orie.cornell.edu/andrew
E-Mail: andrew@cornell.edu
Office Hours: Tuesday 4:00 pm - 5:00 pm, or by appointment
Overview
To answer scientific questions and reason
about data, we must build models and perform inference
within those models. But how should we approach
model construction and inference to make the most
successful predictions? How do we represent
uncertainty and prior knowledge? How flexible should
our models be? Should we use a single model, or
multiple different models? Should we follow a
different procedure depending on how much data are
available?
In this course, we will approach these
fundamental questions from a Bayesian perspective.
From this perspective, we wish to faithfully incorporate
all of our beliefs into a model, and to represent
uncertainty over these beliefs using probability
distributions.
Typically, we believe the real world is in a sense infinitely
complex: we will always be able to add flexibility
to a model to gain better performance. If we are
performing character recognition, for instance, we can
always account for additional writing styles for
greater predictive success. We should therefore aim
to maximize flexibility, so that we are capable of
expressing any hypothesis we believe to be possible.
For inference, we will not have a priori certainty that
any one hypothesis has generated our observations.
We therefore typically wish to weight an uncountably
infinite space of hypotheses by their posterior
probabilities. This Bayesian model averaging
procedure has no risk of overfitting, no matter how
flexible our model. How we distribute our a priori
support over these different hypotheses determines our inductive
biases. In short, a model should distribute
its support across as wide a range of hypotheses as
possible, and have inductive biases which are aligned to
particular applications.
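To make this averaging concrete (a standard formulation;
the notation here is illustrative and not taken from the
course materials): for a model with parameters w, observed
data D, and a test input x_*, the posterior predictive
distribution weights every hypothesis by its posterior
probability:

\[
p(y_* \mid x_*, \mathcal{D})
  = \int p(y_* \mid x_*, w)\, p(w \mid \mathcal{D})\, dw,
\qquad
p(w \mid \mathcal{D})
  = \frac{p(\mathcal{D} \mid w)\, p(w)}{p(\mathcal{D})}.
\]

Because predictions average over the posterior rather than
committing to a single fitted w, added flexibility does not
by itself cause overfitting; the prior p(w) is where the
inductive biases live.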
This course aims to provide students with a
strong grasp of the fundamental principles underlying
Bayesian model construction and inference. We will
go into particular depth on Gaussian process and deep
learning models.
The course will consist of three units:
Model Construction and Inference:
Parametric models, support, inductive biases, gradient
descent, sum and product rules, graphical models, exact
inference, approximate inference (Laplace approximation,
variational methods, MCMC), model selection and hypothesis
testing, Occam's razor, non-parametric models.
Gaussian Processes: From finite basis
expansions to infinite bases, kernels, function space
modelling, marginal likelihood, non-Gaussian likelihoods,
Bayesian optimisation. (An illustrative regression sketch
follows this list.)
Bayesian Deep Learning: Feed-forward,
convolutional, recurrent, and LSTM networks.
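As a preview of the Gaussian process unit (a minimal
sketch, assuming NumPy; the RBF kernel, noise level, and
toy data are placeholder choices rather than course
material), exact GP regression and its log marginal
likelihood come down to a few lines of linear algebra:

import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    # Squared-exponential (RBF) kernel matrix for 1-D inputs.
    sqdist = (X1[:, None] - X2[None, :]) ** 2
    return variance * np.exp(-0.5 * sqdist / lengthscale ** 2)

def gp_regression(X_train, y_train, X_test, noise=0.1):
    # Exact GP posterior via the Cholesky factorisation,
    # following the standard textbook recipe.
    n = len(X_train)
    K = rbf_kernel(X_train, X_train) + noise ** 2 * np.eye(n)
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    L = np.linalg.cholesky(K)           # K = L L^T (stable inverse)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha                # posterior mean at X_test
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v                # posterior covariance
    # Log marginal likelihood: the model-selection quantity
    # mentioned in the unit outline above.
    log_ml = (-0.5 * y_train @ alpha
              - np.sum(np.log(np.diag(L)))
              - 0.5 * n * np.log(2 * np.pi))
    return mean, cov, log_ml

# Toy usage: noisy observations of a sine function.
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 20)
y = np.sin(X) + 0.1 * rng.standard_normal(20)
mu, Sigma, lml = gp_regression(X, y, np.linspace(-4, 4, 100))

The Cholesky factorisation is used in place of a direct
matrix inverse for numerical stability.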
Depending on the available time, we may omit
some of these topics. Most of the material will be
derived on the chalkboard, with some supplemental
slides. The course will have both theoretical and
practical (e.g. coding) aspects.
After taking this course, you should:
- Be able to think about any problem from a
Bayesian perspective.
- Be able to create models with a high degree
of flexibility and appropriate inductive biases.
- Understand the interplay between model
specification and inference, and be able to construct a
successful inference algorithm for a given model.
- Have familiarity with Gaussian process and
deep learning models.
Announcements
Schedule