My goal as a researcher is to uncover the mathematical foundations of Monte Carlo methods
in order to optimize their efficiency.
In pursuit of this goal, I combine numerical analysis of existing Monte Carlo methods
with development of new methods that improve accuracy.

Projects

Rare events can be highly impactful.
However, estimating the probability \(p\) of a rare event
by direct Monte Carlo sampling requires a very large sample size (\(> 100p^{-1}\)).
Since generating such a large sample can be prohibitively expensive,
are there more practical rare event sampling methods
that calculate small probabilities with reduced sample size requirements?
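To see why direct sampling struggles, here is a minimal sketch (with an assumed toy event, a standard normal draw exceeding 3.7, so \(p \approx 10^{-4}\)) showing that the relative error of the direct estimator scales like \(1/\sqrt{np}\):

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Toy rare event for illustration: a standard normal draw exceeding 3.7,
# so the true probability is p ~ 1.1e-4.
threshold = 3.7
p_true = 0.5 * (1.0 - math.erf(threshold / math.sqrt(2)))

# Direct Monte Carlo: p_hat = (# hits) / n has relative standard error
# sqrt((1 - p) / (n p)) ~ 1 / sqrt(n p), so n must far exceed 1/p.
# Even with n = 100 / p, the relative error is still about 10%.
for n in (10_000, 1_000_000):
    p_hat = np.mean(rng.standard_normal(n) > threshold)
    print(f"n = {n:>9,}: p_hat = {p_hat:.1e}  (true p = {p_true:.1e})")
```

With \(n = 10{,}000\) samples the estimate is typically zero or wildly off; only around \(n = 10^6\) does it begin to stabilize.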

To address this question,
I have introduced or analyzed several Monte Carlo "splitting" methods
for rare probability estimation,
including "Quantile Diffusion Monte Carlo" (QDMC) and "weighted ensemble".
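The idea behind splitting can be illustrated with a generic multilevel-splitting sketch (this is not QDMC or weighted ensemble themselves; the thresholds and toy target are assumed for illustration). The rare event is decomposed into a nested sequence of easier events, and the small probability is recovered as a product of moderate conditional probabilities:

```python
import numpy as np

rng = np.random.default_rng(1)

# Generic multilevel splitting: estimate p = P(Z > 3.7) for standard
# normal Z via p = prod_k P(Z > b_k | Z > b_{k-1}), resampling the
# "surviving" walkers at each level.
b_levels = [1.5, 2.6, 3.7]   # intermediate thresholds (assumed)
n = 20_000                   # walkers per stage
step = 0.5                   # Metropolis proposal width

x = rng.standard_normal(n)   # stage 0: unconditioned samples
p_hat = 1.0
for i, b in enumerate(b_levels):
    p_hat *= np.mean(x > b)  # conditional probability estimate
    if i == len(b_levels) - 1:
        break
    # Split: duplicate the walkers that crossed the level ...
    x = rng.choice(x[x > b], size=n)
    # ... then decorrelate with Metropolis moves that preserve the
    # normal distribution conditioned on {Z > b}.
    for _ in range(10):
        prop = x + step * rng.standard_normal(n)
        log_acc = 0.5 * (x**2 - prop**2)
        accept = (prop > b) & (np.log(rng.uniform(size=n)) < log_acc)
        x = np.where(accept, prop, x)

print(f"splitting estimate: {p_hat:.2e}")  # true value is about 1.1e-4
```

Each stage only needs to resolve a probability of a few percent, so the total sample size stays far below the \(100p^{-1}\) needed by direct sampling.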

In an exciting new application of QDMC,
I worked with
Dorian Abbot,
Sam Hadden,
and Jonathan Weare
to calculate the probability that Mercury will become unstable and collide with another celestial body over the next 2 billion years.
We calculated the probability to be \(\sim10^{-4}\) and obtained a speed-up of up to \(100\times\) compared with direct sampling.

Practical rare event sampling for extreme mesoscale weather
Robert J. Webber, David A. Plotkin, Morgan E. O'Neill, Dorian S. Abbot, Jonathan Weare
[Chaos, 2019][PDF]

Rare event sampling improves Mercury instability statistics
Dorian S. Abbot, Robert J. Webber, Sam Hadden, Jonathan Weare
[arXiv, 2021][PDF]

A splitting method to reduce MCMC variance
Robert J. Webber, David Aristoff, Gideon Simpson
[arXiv, 2020][PDF][Code]

Many physical models have a dazzlingly high number of dimensions.
For example, a simulation model for the atoms in a protein can have millions of dimensions, while a simulation model for the Earth’s weather can have billions of
dimensions.
This raises the question: how can we provide accurate statistical
estimates for such high-dimensional models?

An important observation underlying many Monte Carlo algorithms for
high-dimensional problems is that statistical estimates often depend most critically on a low-dimensional subspace of coordinates.
These coordinates may
vary from problem to problem, yet their existence helps to make estimation
problems easier and Monte Carlo methods more efficient.

The eigendecomposition of a Markov transition operator identifies slowly
decorrelating eigenfunctions, which are often essential for understanding a
Markov process’s dynamics, as well as quickly decorrelating eigenfunctions,
which can often be ignored.
With Jonathan Weare and the research group of
Aaron Dinner,
I analyzed and improved
a popular method for identifying slowly decorrelating eigenfunctions
called the "variational approach to conformational dynamics" (VAC).
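The core computation in VAC can be sketched on a toy problem (this illustrates the general variational idea, not the exact algorithm of the paper; the three-state chain and its transition matrix are assumed for illustration). One estimates time-lagged correlation matrices of basis functions from trajectory data and solves a generalized eigenproblem:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy VAC: recover the eigenvalues of a 3-state reversible Markov chain
# from one trajectory, using indicator functions of the states as basis.
P = np.array([[0.95, 0.05, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.05, 0.95]])  # assumed transition matrix

# Simulate a trajectory.
T, lag = 100_000, 1
traj = np.empty(T, dtype=int)
traj[0] = 0
for t in range(1, T):
    traj[t] = rng.choice(3, p=P[traj[t - 1]])

# Time-lagged correlation matrices of the basis functions:
# C0[i, j] = E[1_i(X_t) 1_j(X_t)],  Ct[i, j] = E[1_i(X_t) 1_j(X_{t+lag})].
X = np.eye(3)[traj]                   # one-hot basis evaluations
C0 = X[:-lag].T @ X[:-lag] / (T - lag)
Ct = X[:-lag].T @ X[lag:] / (T - lag)
Ct = 0.5 * (Ct + Ct.T)                # symmetrize (reversibility)

# VAC: solve the generalized eigenproblem Ct v = lambda C0 v.
M = np.linalg.solve(C0, Ct)
vals = np.sort(np.linalg.eigvals(M).real)[::-1]
print("estimated eigenvalues:", vals)
print("exact eigenvalues:    ", np.sort(np.linalg.eigvals(P).real)[::-1])
```

The eigenvalues closest to one correspond to the slowly decorrelating eigenfunctions; here they are recovered accurately even though the method only sees trajectory data, never the matrix \(P\) itself.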

The Karhunen-Loève (KL) expansion reduces an infinite-dimensional random field to a
low-dimensional approximation that is much easier to sample.
With Jeremie Coullon,
I used the KL expansion to build a new MCMC sampler for infinite-dimensional Bayesian inverse problems that outperforms other gradient-free methods.
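A minimal sketch of the KL reduction (illustrative only, not the sampler from the paper; the exponential kernel and correlation length are assumed) shows how a handful of eigenmodes of the covariance operator captures most of a random field's variance:

```python
import numpy as np

rng = np.random.default_rng(3)

# Draw an approximate sample of a Gaussian random field on [0, 1] from a
# truncated Karhunen-Loeve expansion of its covariance operator.
m = 200                               # grid points
x = np.linspace(0, 1, m)
ell = 0.2                             # correlation length (assumed)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / ell)  # exponential kernel

# Eigenpairs of the discretized covariance operator.
evals, evecs = np.linalg.eigh(C / m)  # 1/m acts as a quadrature weight
idx = np.argsort(evals)[::-1]
evals, evecs = evals[idx], evecs[:, idx]

# Truncate: the field is now parametrized by d standard normal variables.
d = 10
xi = rng.standard_normal(d)           # low-dimensional coordinates
field = (np.sqrt(m) * evecs[:, :d]) @ (np.sqrt(evals[:d]) * xi)

captured = evals[:d].sum() / evals.sum()
print(f"variance captured by {d} modes: {captured:.1%}")
```

An MCMC sampler can then explore the \(d\)-dimensional coordinates \(\xi\) instead of the full discretized field, which is what makes infinite-dimensional inverse problems tractable.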

Integrated VAC: a robust strategy for identifying eigenfunctions of dynamical operators
Chatipat Lorpaiboon, Erik H. Thiede, Robert J. Webber, Jonathan Weare, Aaron R. Dinner
[Journal of Physical Chemistry B, 2020][PDF][Code]

The ground state and the first few excited states
determine the fundamental properties of quantum systems at low temperatures.
However, as the system size increases,
it becomes exponentially more difficult to compute these states using traditional numerical methods.
To address this curse of dimensionality, I have developed several modern methods
that harness the power of Monte Carlo sampling.

With Michael Lindsey, I introduced the Rayleigh-Gauss-Newton (RGN) method,
which uses Monte Carlo sampling to efficiently optimize a high-dimensional
neural network model for the ground-state wavefunction.

With Timothy Berkelbach,
Samuel Greene,
and Jonathan Weare, I helped develop Fast Randomized Iteration (FRI), which provides accurate stochastic estimates
for the dominant eigenvalues and eigenvectors of an extremely large matrix.
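The flavor of FRI can be conveyed by a toy sketch (an illustration in the spirit of the method, not the paper's exact algorithm; the test matrix and compression rule are assumed): power iteration in which the iterate is randomly compressed to roughly \(m\) nonzero entries at every step, with an unbiased reweighting so the eigenvalue estimate stays accurate on average:

```python
import numpy as np

rng = np.random.default_rng(4)

# Power iteration with random compression of the iterate. For genuinely
# huge problems, storing only a sparse compressed vector is what keeps
# the per-iteration cost manageable.
n, m = 1_000, 100
A = np.abs(rng.standard_normal((n, n))) / n + np.eye(n)  # assumed test matrix

def compress(v, m, rng):
    """Randomly zero out entries of a nonnegative vector, keeping about
    m of them and reweighting so that E[compress(v)] = v."""
    p = np.minimum(1.0, m * v / v.sum())
    keep = rng.uniform(size=v.size) < p
    out = np.zeros_like(v)
    out[keep] = v[keep] / p[keep]
    return out

v = np.ones(n) / n
estimates = []
for _ in range(300):
    w = A @ compress(v, m, rng)
    estimates.append(w.sum())   # stochastic eigenvalue estimate (v sums to 1)
    v = w / w.sum()

# Average over iterations after a burn-in to tame the sampling noise.
lam_hat = np.mean(estimates[50:])
lam_true = np.max(np.linalg.eigvals(A).real)
print(f"estimate {lam_hat:.3f}  vs exact {lam_true:.3f}")
```

Each iteration touches only \(\sim m\) columns of the matrix, yet averaging the noisy estimates over many iterations recovers the dominant eigenvalue.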

My collaborators and I have applied RGN to spin systems with up to 400 spins (hence \( 2^{400} \) possible spin configurations) and have applied FRI to molecules as large as nitrogen (which has 14 interacting electrons).

Approximating matrix eigenvalues by randomized subspace iteration
Samuel M. Greene, Robert J. Webber, Timothy C. Berkelbach, Jonathan Weare
[arXiv, 2021][PDF][Code]