Speaker: Andrea L. Bertozzi, UCLA

Title: Uncertainty quantification in graph-based classification of high dimensional data


Classification of high-dimensional data finds wide-ranging applications. In many of these applications, equipping the resulting classification with a measure of uncertainty may be as important as the classification itself. I discuss recent work that develops algorithms for, and investigates the properties of, a variety of Bayesian models for the task of binary classification; via the posterior distribution on the classification labels, these methods automatically give measures of uncertainty. The methods are all based on the graph formulation of semi-supervised learning. We provide a unified framework that brings together a variety of methods introduced in different communities within the mathematical sciences. We introduce efficient numerical methods, suited to large datasets, for both MCMC-based sampling and gradient-based MAP estimation. Through numerical experiments we study classification accuracy and uncertainty quantification for our models; these experiments use a suite of datasets commonly employed to evaluate graph-based semi-supervised learning algorithms. I conclude with an application involving a human in the loop, such as a security analyst, who can interact with the algorithm by hand-classifying a subset of data determined by the machine. I discuss recent results on these problems for classification of ego motion in body-worn camera data. This is joint work with Andrew Stuart, Xiyang Luo, Konstantinos Zygalakis and Hao Li.
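As a minimal sketch of the graph formulation (not the talk's actual implementation; the two-cluster toy data, kernel bandwidth, and fidelity parameter are all invented for illustration), MAP estimation under a Gaussian noise model reduces to a single symmetric positive-definite linear solve with the graph Laplacian:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two well-separated Gaussian clusters in R^2.
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
n = X.shape[0]

# Gaussian similarity weights and the unnormalized graph Laplacian.
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-D2 / 2.0)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(1)) - W

# A few hand-labeled points (+1 / -1); the rest are unlabeled.
labeled = np.array([0, 1, 20, 21])
y = np.zeros(n)
y[labeled] = np.array([1, 1, -1, -1])

# Gaussian-noise MAP estimate: minimize 0.5 u'Lu + (1/2g2) sum_{j labeled} (u_j - y_j)^2,
# which is solved exactly by one linear system.
gamma2 = 0.01
P = np.zeros((n, n))
P[labeled, labeled] = 1.0
u = np.linalg.solve(L + P / gamma2, (P @ y) / gamma2)
labels = np.sign(u)
```

The posterior here is Gaussian, so uncertainty measures (marginal variances of u) are available in closed form; the talk's probit and Ginzburg-Landau models require sampling instead.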

Speaker: Michal Branicki, The University of Edinburgh

Title: Lagrangian uncertainty quantification and information inequalities for stochastic flows


We present an information-theoretic framework for quantification and mitigation of error in probabilistic Lagrangian (trajectory-based) predictions obtained from uncertain (Eulerian) vector fields generating the underlying dynamical system. This work is motivated by the desire to improve Lagrangian predictions in multi-scale systems based on simplified, data-driven models. Here, discrepancies between probability measures μ and ν associated with the true dynamics and its approximation are defined via so-called φ-divergences (premetrics defined by a class of strictly convex functions φ). We derive general information bounds on the uncertainty in estimates of observables based on the approximate dynamics in terms of the φ-divergences. This new framework provides a systematic link between Eulerian (field-based) model error and the resulting uncertainty in Lagrangian (trajectory-based) predictions.
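For concreteness, here is a minimal numerical sketch of a φ-divergence between two discrete measures; the choice φ(t) = t log t, which recovers the Kullback-Leibler divergence, and the toy measures are invented for the example, not taken from the talk:

```python
import numpy as np

def phi_divergence(mu, nu, phi):
    """D_phi(mu || nu) = sum_i nu_i * phi(mu_i / nu_i) for discrete measures
    with matching supports; phi must be strictly convex with phi(1) = 0."""
    mu, nu = np.asarray(mu, float), np.asarray(nu, float)
    return float(np.sum(nu * phi(mu / nu)))

# phi(t) = t * log(t) recovers the Kullback-Leibler divergence.
kl_phi = lambda t: t * np.log(t)

mu = np.array([0.5, 0.3, 0.2])   # "true" statistics (toy)
nu = np.array([0.4, 0.4, 0.2])   # statistics of the approximate model (toy)

d = phi_divergence(mu, nu, kl_phi)   # nonnegative; zero iff mu == nu
```

By Jensen's inequality D_φ(μ‖ν) ≥ φ(1) = 0, with equality only when μ = ν, which is what makes these premetrics usable as error bounds.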

Speaker: Nan Chen, University of Wisconsin-Madison

Title: A Nonlinear Conditional Gaussian Framework for Extreme Events Prediction, State Estimation and Uncertainty Quantification in Complex Dynamical Systems


A nonlinear conditional Gaussian framework for extreme events prediction, state estimation (data assimilation) and uncertainty quantification in complex dynamical systems will be introduced in this talk. Despite the conditional Gaussianity, the models within this framework remain highly nonlinear and are able to capture strongly non-Gaussian features such as intermittency and extreme events. The conditional Gaussian structure allows efficient and analytically solvable conditional statistics that facilitate real-time data assimilation and prediction.

In the first part of this talk, the general framework of nonlinear conditional Gaussian systems, including a gallery of examples from geophysics, fluids, engineering, neuroscience and material science, will be presented. This is followed by its wide application in developing physics-constrained data-driven nonlinear models and stochastic mode reduction. In the second part, an efficient, statistically accurate algorithm is developed for solving the Fokker-Planck equation in high dimensions, an extremely important and challenging topic in prediction, data assimilation and uncertainty quantification. This new algorithm involves a novel hybrid strategy for different subspaces, a judicious block decomposition and statistical symmetry. Rigorous mathematical analysis shows that the method is able to overcome the curse of dimensionality. In the third part of the talk, a low-order model within the nonlinear conditional Gaussian framework is developed to predict intermittent large-scale monsoon extreme events in nature. The nonlinear low-order model shows higher prediction skill than the operational models, and it also succeeds in quantifying the uncertainty in prediction. Other applications of this nonlinear conditional Gaussian framework, such as assimilating multiscale turbulent ocean flows and parameter estimation, will be briefly mentioned at the end of the talk.
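Schematically (notation is mine, following the standard conditional Gaussian filtering literature rather than the talk itself), such systems split the state into observed and unobserved parts,

```latex
\begin{aligned}
d\mathbf{u}_{I} &= \big[\mathbf{A}_0(t,\mathbf{u}_I) + \mathbf{A}_1(t,\mathbf{u}_I)\,\mathbf{u}_{II}\big]\,dt + \boldsymbol{\Sigma}_{I}(t,\mathbf{u}_I)\,d\mathbf{W}_{I},\\
d\mathbf{u}_{II} &= \big[\mathbf{a}_0(t,\mathbf{u}_I) + \mathbf{a}_1(t,\mathbf{u}_I)\,\mathbf{u}_{II}\big]\,dt + \boldsymbol{\Sigma}_{II}(t,\mathbf{u}_I)\,d\mathbf{W}_{II},
\end{aligned}
```

so that, conditioned on a realization of u_I, the distribution of u_II is Gaussian with mean and covariance solving closed filtering equations of Liptser-Shiryaev type:

```latex
\begin{aligned}
d\bar{\mathbf{m}} &= (\mathbf{a}_0 + \mathbf{a}_1\bar{\mathbf{m}})\,dt
 + \bar{\mathbf{R}}\,\mathbf{A}_1^{*}\,(\boldsymbol{\Sigma}_I\boldsymbol{\Sigma}_I^{*})^{-1}\big[d\mathbf{u}_I - (\mathbf{A}_0 + \mathbf{A}_1\bar{\mathbf{m}})\,dt\big],\\
d\bar{\mathbf{R}} &= \big[\mathbf{a}_1\bar{\mathbf{R}} + \bar{\mathbf{R}}\,\mathbf{a}_1^{*} + \boldsymbol{\Sigma}_{II}\boldsymbol{\Sigma}_{II}^{*}
 - \bar{\mathbf{R}}\,\mathbf{A}_1^{*}\,(\boldsymbol{\Sigma}_I\boldsymbol{\Sigma}_I^{*})^{-1}\mathbf{A}_1\bar{\mathbf{R}}\big]\,dt.
\end{aligned}
```

The analytic solvability of these conditional statistics is what makes the real-time data assimilation referred to above efficient.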

Speaker: Weinan E, Princeton University

Title: Mathematical Theory of Neural Network-based Machine Learning


The task of supervised learning is to approximate a function using a given set of data. In low dimensions, its mathematical theory has been established in classical numerical analysis and approximation theory, in which the function spaces of interest (Sobolev or Besov spaces), the order of the error, and the convergence rates of gradient-based algorithms are all well understood. Direct extension of such a theory to high dimensions leads to estimates that suffer from the curse of dimensionality, as well as degeneracy in the over-parametrized regime. In this talk, we attempt to put forward a unified mathematical framework for analyzing neural network-based machine learning in high dimensions (and the over-parametrized regime). We illustrate this framework using kernel methods, shallow network models and deep network models. The talk is based mostly on joint work with Chao Ma, Lei Wu and Qingcan Wang.

Speaker: Bjorn Engquist, University of Texas, Austin

Title: Multiscale Simulation for Boundary Layer Flow


The standard no-slip boundary condition for the Navier-Stokes equations may generate small scales in the flow, which are hard to resolve computationally. We will consider cases where effective slip boundary conditions can be derived based on local high-resolution simulations. For large-eddy simulation of turbulent flow, local direct numerical simulations are used to determine appropriate effective boundary conditions computationally. In the case of creeping flow at rough boundaries, the correct form of effective boundary conditions can be rigorously derived.

Speaker: Dimitris Giannakis, NYU Courant

Title: Quantum mechanics and data assimilation


We discuss a framework for data assimilation combining aspects of operator-theoretic ergodic theory and quantum mechanics. This framework adapts the Dirac-von Neumann formalism of quantum dynamics and measurement to perform data assimilation (filtering) of a partially observed, measure-preserving dynamical system, using the Koopman operator on the L2 space associated with the invariant measure as an analog of the Heisenberg evolution operator in quantum mechanics. In addition, the state of the data assimilation system is represented by a trace-class operator analogous to the density operator in quantum mechanics, and the assimilated observables by self-adjoint multiplication operators. A quantization approach is also employed, rendering the spectrum of the assimilated observables discrete, and thus amenable to numerical approximation. We present a data-driven formulation of the quantum mechanical data assimilation approach, utilizing kernel methods from machine learning and delay-coordinate maps of dynamical systems to represent the evolution and measurement operators via matrices in a data-driven basis. Applications to periodic oscillators and the Lorenz 63 system demonstrate that the framework is able to naturally handle non-Gaussian statistics, complex state space geometries, and chaotic deterministic dynamics.
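As a loose numerical illustration (an EDMD-style least-squares matrix approximation on a toy circle rotation, not the kernel and delay-coordinate construction of the talk), one can check that the data-driven Koopman matrix behaves like a unitary operator, with eigenvalues on the unit circle as expected for a measure-preserving system:

```python
import numpy as np

# One orbit of a circle rotation (a simple measure-preserving system),
# sampled uniformly in time.
N, dt, alpha = 2000, 0.01, 0.7
theta = (alpha * dt * np.arange(N + 1)) % (2 * np.pi)

# Dictionary of observables: the first few Fourier modes, an orthogonal
# basis of the L2 space of the invariant (uniform) measure on the circle.
ks = np.arange(1, 4)
Phi = np.exp(1j * np.outer(theta, ks))      # (N+1, 3) basis evaluations

# Least-squares matrix representation of the Koopman operator mapping
# observables at time t to observables at time t + dt.
U, *_ = np.linalg.lstsq(Phi[:-1], Phi[1:], rcond=None)

# For a measure-preserving system the Koopman operator is unitary, so the
# approximated eigenvalues should sit on the unit circle, at angles k*alpha*dt.
eigvals = np.linalg.eigvals(U)
```

Here the Fourier modes are exact Koopman eigenfunctions, so the fit is essentially exact; on generic data the kernel-based bases of the talk play the role of this hand-picked dictionary.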

Speaker: Martin Hairer, Imperial College London

Title: TBA

Speaker: John Harlim, Penn State University

Title: Manifold learning based computational methods


Recent successes of machine learning have drawn tremendous interest in applied mathematics and scientific computation. In this talk, I will discuss recent efforts to use manifold learning algorithms (a branch of machine learning) to overcome the shortcomings of traditional computational methods in parameter estimation and modeling of dynamical systems. For parameter estimation, I will demonstrate how to use machine learning together with existing tools from statistics and functional analysis to perform efficient Bayesian inference. For the modeling application, I will demonstrate how to discover missing dynamics from the available data, connecting to the classical averaging theory of multiscale dynamical systems. If time permits, I will also demonstrate how to use a manifold learning technique to approximate solutions of elliptic PDEs on smooth manifolds.

Speaker: Boualem Khouider, University of Victoria

Title: Bayesian parameter inference for a stochastic model for clouds and convection using Giga-LES data


The poor representation of tropical convection is believed to be responsible for much of the uncertainty in climate and numerical weather prediction models. The stochastic multicloud model (SMCM) was recently developed by Khouider et al. (2010) to mimic the interaction of clouds and organized tropical convection and to improve its representation in GCMs. The SMCM is a stochastic lattice system in which the three key cloud types (congestus, deep and stratiform) that characterize organized convection randomly occupy the lattice sites as the states of a Markov process, whose transition probabilities are formalized in terms of seven cloud timescale parameters conditional on the large-scale state (the predictors). Here, a Bayesian inference procedure is applied to estimate these key cloud timescales from the Giga-LES dataset, a 24-hour large-eddy simulation (LES) of deep tropical convection over a domain comparable to a GCM gridbox (Khairoutdinov et al. 2009). After a successful validation of the Bayesian methodology using synthetic data, a sequential learning strategy is devised and adopted (De La Chevrotière et al. 2015). The Giga-LES domain is partitioned into a few subdomains, and atmospheric variable time series obtained on each subdomain are used to train the Bayesian procedure incrementally. Convergence of the marginal posterior densities for all seven parameters is demonstrated for two different grid partitions, and sensitivity tests to other model parameters are also presented. A single-column model simulation using the SMCM parameterization with the Giga-LES-inferred parameters reproduces many important statistical features of the Giga-LES run, without any further tuning.
In particular, it exhibits intermittent dynamical behavior in both the stochastic cloud fractions and the large-scale dynamics, with dry phases followed by a coherent sequence of congestus, deep, and stratiform convection, varying on timescales of a few hours, consistent with the Giga-LES time series (De La Chevrotière et al. 2016). The chaotic variations of the cloud area fractions are captured fairly well, both qualitatively and quantitatively, demonstrating the stochastic nature of convection in the Giga-LES simulation. When implemented in the Climate Forecast System (NCEP's atmosphere-ocean coupled model) with the parameters inferred from the Giga-LES dataset according to this procedure, the SMCM drastically improves the representation of tropical modes of variability such as the Madden-Julian Oscillation, the Monsoon Intraseasonal Oscillation, and synoptic-scale convectively coupled waves.


  • De La Chevrotière M, Khouider B, Majda AJ (2014) Calibration of the stochastic multicloud model using Bayesian inference. SIAM J Sci Comput 36(3):B538–B560
  • De La Chevrotière M, Khouider B, Majda AJ (2015) Stochasticity of convection in Giga-LES data. Climate Dyn 47:1845–1861
  • Khairoutdinov MF, Krueger SK, Moeng CH, Bogenschutz PA, Randall DA (2009) Large-eddy simulation of maritime deep tropical convection. Journal of Advances in Modeling Earth Systems 1(12)
  • Khouider B, Biello J, Majda AJ (2010) A stochastic multicloud model for tropical convection. Communications in Mathematical Sciences 8(1):187–216
  • Goswami BB, Khouider B, Phani R, Mukhopadhyay P, Majda AJ (2017) Improved tropical modes of variability in the NCEP Climate Forecast System (Version 2) via a stochastic multicloud model. Journal of the Atmospheric Sciences 74(10):3339–3366

Speaker: Peter Kramer, RPI

Title: Stochastic Spatial Modeling of Intracellular Transport from Molecular to Cellular Scale


Transport of organelles and other vital material in biological cells is conducted largely by specialized molecular motor proteins moving along actin or microtubule filaments. Their dynamics are inherently stochastic due to both the importance of thermal fluctuations on their length scale and the dependence of their motion on discrete binding events to ATP molecules. This presentation will begin by reviewing some of the prevalent methods for representing the dynamics of molecular motors on a single microtubule, and then describe ongoing and emerging research questions concerning experimental observations of how molecular motors move through networks of microtubules. As a complement to detailed computational models for addressing these questions, we formulate some relatively simple modeling scenarios in which asymptotic stochastic procedures can quantify how the parameters describing motor function relate to effective transport on larger scales.

Speaker: Yoonsang Lee, Dartmouth College

Title: Importance sampling for computationally expensive target distributions


We present an efficient and robust importance sampling approach for computationally expensive target distributions. The method uses minimal information about the target distribution, namely only on-the-fly evaluation of the target density, without specifying in advance a proposal distribution, which often requires statistical or analytic knowledge of the target. The key feature of the presented approach is to reweight incomplete Markov chain Monte Carlo (MCMC) samples, which is computationally efficient in comparison with complete MCMC sampling. This feature enables the use of a large number of walkers with a short chain length to explore a wide range of sample values effectively; the computational cost of the large number of walkers can be mitigated by parallel computing. The method is tested on a two-dimensional model problem with multiple modes and on a realistic parameter estimation problem for kinetic reaction rates in a computationally challenging combustion model with strongly non-Gaussian statistics. This is joint work with J. Bell and M. Day at LBNL.
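A stripped-down sketch of the reweighting idea: self-normalized importance sampling, with a fixed Gaussian standing in for the pooled states of the many short chains the talk actually uses; the bimodal target, proposal width, and sample count are all invented for the example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Bimodal target density, known only up to a normalizing constant
# (evaluated "on the fly", as in the talk).
def target(x):
    return np.exp(-0.5 * (x - 2) ** 2) + np.exp(-0.5 * (x + 2) ** 2)

# Stand-in for the pooled walker states: a broad Gaussian that covers both modes.
prop_sigma = 4.0
xs = rng.normal(0.0, prop_sigma, 200_000)
prop_pdf = np.exp(-0.5 * (xs / prop_sigma) ** 2) / (prop_sigma * np.sqrt(2 * np.pi))

# Self-normalized importance weights correct the proposal toward the target;
# the unknown normalizing constant of the target cancels in the ratio.
w = target(xs) / prop_pdf
w /= w.sum()

mean_est = np.sum(w * xs)            # target mean is 0 by symmetry
second_moment = np.sum(w * xs ** 2)  # target E[x^2] = 1 + 2^2 = 5
```

Because each weight needs only one target-density evaluation, the expensive model calls parallelize trivially across walkers.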

Speaker: Richard McLaughlin, University of North Carolina, Chapel Hill

Title: The interplay of geometry, diffusion, and advection in interior and exterior problems in homogeneous and stratified fluids


We examine two problems. In the first, we explore how the shape of a straight tube carrying solute can be used to adjust the delivery properties of the solute downstream, from arriving with a sharp front and tapering tail for skinny tubes to the opposite for tubes with order-one aspect ratios. The mechanism involves a subtle interplay between flow, diffusion and geometry through the imposition of the no-flux boundary condition. In the second problem, we review our extensive work on matter interacting with stratified fluids and present some quite new findings we have recently made regarding the interplay of diffusion and geometry, which produces a highly unexpected collective dynamics. Mathematical modeling focusing on single and double ellipsoids reveals the underlying physical and geometrical mechanisms responsible for this surprising and unusual behavior.

This is joint work with Roberto Camassa, Dan Harris, and many grad and undergrad students.

Speaker: Di Qi, NYU Courant

Title: Statistical reduced models and rigorous analysis for uncertainty quantification of turbulent geophysical flows


The capability of imperfect statistical reduced-order models to capture crucial statistics in turbulent flows is investigated. Much simpler and more tractable block-diagonal models are proposed to approximate the complex and high-dimensional turbulent flow equations. A systematic framework for correcting model errors with empirical information theory is introduced, and optimal model parameters under this unbiased information measure can be found in a training phase before the prediction. It is demonstrated that crucial principal statistical quantities in the most important large scales can be captured efficiently and accurately using the reduced-order model in various dynamical regimes of the flow field with distinct statistical structures.
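In the Gaussian setting, the unbiased information measure used in this style of model-error correction is the relative entropy with its standard signal-dispersion decomposition (notation mine): for truth p ~ N(ū, R) and model p^M ~ N(ū_M, R_M) in N dimensions,

```latex
\mathcal{P}\big(p, p^{M}\big)
= \underbrace{\tfrac{1}{2}\,(\bar{\mathbf{u}} - \bar{\mathbf{u}}_{M})^{\top} R_{M}^{-1}\,(\bar{\mathbf{u}} - \bar{\mathbf{u}}_{M})}_{\text{signal}}
\;+\; \underbrace{\tfrac{1}{2}\Big[\operatorname{tr}\big(R R_{M}^{-1}\big) - N - \ln\det\big(R R_{M}^{-1}\big)\Big]}_{\text{dispersion}},
```

where the signal part penalizes errors in the mean and the dispersion part penalizes errors in the covariance; tuning model parameters to minimize this measure in a training phase is what makes the calibration unbiased.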

Speaker: Themis Sapsis, MIT

Title: Extreme events in complex dynamical systems: prediction and statistical quantification


For many natural and engineering systems, extreme events, corresponding to large excursions, have significant consequences and are important to predict. Examples include extreme environmental events (rogue waves in the ocean, flooding events, climate transitions) as well as extreme events in engineering systems (unsteady flow separation, large ship motions, and dangerous structural loads). Therefore, predicting and understanding extreme events is an essential task for reliability assessment and design, characterization of operational capabilities, and control and suppression of extreme transitions, to mention just a few. Despite their importance, understanding extreme events in chaotic systems with intrinsically high-dimensional attractors has been a formidable problem, due to the stochastic, nonlinear, and essentially transient character of the underlying dynamics. Here we discuss two themes in contemporary, equation-assisted, data-driven modeling of dynamical systems related to extreme events: the prediction problem and the statistical quantification problem. For the first theme, a major challenge is the computation of low-energy patterns or signals that systematically precede the occurrence of these extreme transient responses. We develop a variational framework for probing conditions that trigger intermittent extreme events in high-dimensional nonlinear dynamical systems. The algorithms exploit, in a combined manner, physical properties of the chaotic attractor as well as finite-time stability properties of the governing equations. In the second part of the talk we develop a method for evaluating extreme event statistics of nonlinear dynamical systems from a small number of samples. From an initial dataset of design points, we formulate a sequential strategy that provides the 'next-best' data point (set of parameters) that, when evaluated, results in improved estimates of the probability density function (pdf) for any scalar quantity of interest. The approach combines machine learning and optimization methods to determine the 'next-best' design point that maximally reduces uncertainty between the estimated bounds of the pdf prediction. We assess the performance of the derived schemes through direct numerical simulations. Applications are presented for many different areas, including prediction of extreme events in turbulent fluid flows and ocean waves, and probabilistic quantification of extreme events in fluid-structure interactions and ship motions.

Speaker: Takis Souganidis, University of Chicago

Title: Stochastic Hamilton-Jacobi equations and qualitative properties, Part II


I will discuss some new results about intermittent regularity and long time dependence for stochastic viscosity solutions of Hamilton-Jacobi equations.

Speaker: Katepalli Sreenivasan, Physics, Courant and Engineering, NYU

Title: On vortex reconnection


Fundamental to classical and quantum fluids, as well as to superconductors, magnetic flux tubes, liquid crystals, cosmic strings, and DNA, is the phenomenon of reconnection of line-like objects. I will report our work on the reconnection of vortex lines in quantum fluids: unlike in classical fluids, where vorticity is a continuous field, in quantum fluids vorticity occurs in the form of isolated line-like singularities whose circulation is quantized, so the phenomenon is easier to study in both concept and practice. I will discuss the experimental visualization of these reconnections as well as the scaling laws that govern their behavior before and after reconnection in both quiet and noisy environments.

Speaker: Sam Stechmann, University of Wisconsin-Madison

Title: New Perspectives on Atmospheric Data for Subseasonal-to-Seasonal Predictions and Majda's Work on PDEs


Subseasonal-to-Seasonal (S2S) prediction is aimed at time scales of weeks and months, at the intersection of weather and climate, and it was foreseen by John von Neumann in 1955 to be the final frontier of forecasting. S2S efforts are expanding, and they motivated two projects on data analysis that will be presented here. The projects highlight the role of mathematical theory in drawing out new understanding of weather and climate, and they make connections with Andy Majda's seminal work on weather, climate, and partial differential equations (PDEs).

The first project involves a paradoxical situation in parameter estimation: a strong damping, with a short time scale of 1 to 5 days, has long been estimated and believed necessary to properly model the Walker circulation, a climate phenomenon with a long time scale of months, years, or longer. We resolve the paradox by showing that previous parameter estimates were hampered by an unnoticed singular limit and, when it is accounted for, that in fact no damping is needed at all. The second project provides the first assessment of an asymptotic limit that is relevant for S2S prediction: the equatorial long-wave approximation. We show that simple assessments suggest a narrow range of validity, but a careful treatment of eigenmode decompositions reveals a wide range of validity. Rigorous theory for the asymptotic limit was developed by Dutrifoy and Majda, and it overcomes the challenge of a singular limit with fast and spatially varying coefficients.

Speaker: Esteban Tabak, NYU Courant

Title: Conditional density estimation and simulation through optimal transport


Conditional probability estimation and simulation provides data-based answers to all kinds of critical questions, such as the expected response of specific patients to different medical treatments, weather and climate forecasts, and the effect of political measures on the economy. In the complex systems behind these examples, the outcome of a process depends on many and diverse factors and is probabilistic in nature, due in part to our ignorance of other relevant factors and to the chaotic nature of the underlying dynamics.

This talk will describe a general procedure for the estimation and simulation of conditional probabilities, based on the removal of the effect of covariates through a data-based, generalized optimal transport barycenter problem. This barycenter problem can be formulated as an adversarial game, which we solve through an implicit descent methodology for min-maximization. Prototypal analysis is used to extend the procedure to covariates more general than real or categorical variables.

Speaker: Pierre-Louis Lions, Collège de France

Title: Stochastic Hamilton-Jacobi equations and qualitative properties, Part I


I will present a review of the theory of stochastic viscosity solutions of Hamilton-Jacobi equations and discuss some new results about the domain of dependence.

Speaker: Xin Tong, National University of Singapore

Title: Ensemble Kalman filter in high dimensions


The ensemble Kalman filter (EnKF) is an algorithm designed for data assimilation of high-dimensional geophysical systems. It can produce skillful forecasts for nonlinear models with millions of dimensions using around 100 samples. We seek rigorous explanations of this surprisingly good performance. Assuming the underlying dynamics is linear, we show that the filter error can be bounded by the sample covariance, as long as there exists (1) a low effective dimension, or (2) a stable, spatially localized covariance structure. We will further develop a framework that guarantees spatially localized covariance structures and can be applied to models such as FitzHugh-Nagumo with mean-field interactions.
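For readers unfamiliar with the algorithm, here is a minimal perturbed-observation EnKF analysis step, sketched in a deliberately easy regime (ensemble size larger than the state dimension, full observations, all numbers invented); the talk concerns the much harder regime of few samples in high dimensions, where the sample covariance is rank-deficient and localization becomes essential:

```python
import numpy as np

rng = np.random.default_rng(2)

def enkf_analysis(E, y, H, R):
    """One perturbed-observation EnKF analysis step.
    E: (n, K) forecast ensemble; y: (m,) observation;
    H: (m, n) observation operator; R: (m, m) observation covariance."""
    X = E - E.mean(axis=1, keepdims=True)           # ensemble anomalies
    C = X @ X.T / (E.shape[1] - 1)                  # sample covariance
    G = C @ H.T @ np.linalg.inv(H @ C @ H.T + R)    # sample Kalman gain
    Yp = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, E.shape[1]).T
    return E + G @ (Yp - H @ E)                     # updated ensemble

# Toy setup: 10-dimensional state, fully observed, 100 ensemble members.
n, K = 10, 100
truth = np.ones(n)
E = rng.normal(0.0, 1.0, (n, K))                    # prior centered away from truth
H = np.eye(n)
R = 0.1 * np.eye(n)
y = H @ truth + rng.multivariate_normal(np.zeros(n), R)

Ea = enkf_analysis(E, y, H, R)
err_forecast = np.abs(E.mean(axis=1) - truth).mean()
err_analysis = np.abs(Ea.mean(axis=1) - truth).mean()
```

The analysis mean lands much closer to the truth than the forecast mean; with K much smaller than n the same update is corrupted by spurious long-range sample correlations, which is precisely what the localized covariance structures in the talk control.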

Speaker: Eric Vanden-Eijnden, NYU Courant

Title: Extreme event quantification in dynamical systems with random components


A central problem in uncertainty quantification is how to characterize the impact that our incomplete knowledge about models has on the predictions we make from them. This question naturally lends itself to a probabilistic formulation, by making the unknown model parameters random with given statistics. In this talk I will explain how to use this approach in concert with tools from large deviation theory (LDT) and optimal control to estimate the probability that some observables in a dynamical system go above a large threshold after some time, given prior statistical information about the system's parameters and/or its initial conditions. Specifically, I will establish under which conditions such extreme events occur in a predictable way, as the minimizer of the LDT action functional. I will also explain how this minimization can be performed numerically in an efficient way using tools from optimal control. Finally, I will show that the approach can be used to predict and analyze extreme realizations of water surface elevation observed in a wave-flume experiment: for a wide range of parameters of the energy spectrum of deep-water waves, these rogue waves are well described by the minimizers of the LDT action associated with the nonlinear Schrödinger equation (NLSE) with random initial conditions.
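Schematically (notation mine), the LDT estimate behind this program reads

```latex
P\big(F(\mathbf{x}) \ge z\big) \asymp \exp\Big(-\min_{\mathbf{x}\,:\,F(\mathbf{x}) \ge z} I(\mathbf{x})\Big),
\qquad
\mathbf{x}^{\star}(z) = \operatorname*{arg\,min}_{F(\mathbf{x}) \ge z} I(\mathbf{x}),
```

where I is the rate (action) functional encoding the statistics of the random parameters or initial data, F is the observable, and z is the large threshold. The constrained minimizer x*(z) is the most likely way the extreme event occurs; the event is "predictable" precisely when this single minimizer dominates the probability, and the constrained optimization is what the optimal-control tools in the talk compute.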

This is joint work with Giovanni Dematteis, Tobias Grafke, and Miguel Onorato.

© New York University