Stochastic models are widely accepted as an important component in the analysis of complex systems. At the same time, models motivated by real-world phenomena yield many technical challenges, such as systems with sparse data, memory, or high-dimensional components. As an applied mathematician, my research focuses on interesting problems at the intersection of these two themes. I use tools from applied probability, uncertainty quantification, numerical analysis, and scientific computing to advance our understanding of stochastic models that are rooted in real-world phenomena, with applications in fields as diverse as materials science, biophysics, systems control, and geophysics. The natural progression of this work is to explore links to data science, such as the potential for data-informed uncertainty quantification tools to complement and improve machine learning techniques and hierarchical Bayesian approaches to modeling.
Causality and Bayesian network PDEs for multiscale representations of porous media
Microscopic (pore-scale) properties of porous media affect and often determine their macroscopic (continuum- or Darcy-scale) counterparts. Understanding the relationship between processes on these two scales is essential to both the derivation of macroscopic models of transport phenomena in natural porous media and the design of novel materials for energy storage. Most microscopic properties exhibit complex statistical correlations and geometric constraints, which present challenges for the estimation of macroscopic quantities of interest (QoIs), for example in the context of global sensitivity analysis (GSA) of macroscopic QoIs with respect to microscopic material properties. We present a systematic way of building correlations into stochastic multiscale models through Bayesian networks. This allows us to construct the joint probability density function (PDF) of model parameters through causal relationships that emulate engineering processes in the design of hierarchical nanoporous materials. Such PDFs also serve as input for the forward propagation of parametric uncertainty; our findings indicate that the inclusion of causal relationships impacts predictions of macroscopic QoIs. To assess the impact of correlations and causal relationships between microscopic parameters on macroscopic material properties, we use a moment-independent GSA based on the differential mutual information. Our GSA accounts for the correlated inputs and complex non-Gaussian QoIs. The global sensitivity indices are used to rank the effect of uncertainty in microscopic parameters on macroscopic QoIs, to quantify the impact of causality on the multiscale model’s predictions, and to provide physical interpretations of these results for hierarchical nanoporous materials.
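As a toy illustration of how a causal structure changes forward uncertainty propagation, the sketch below draws two parameters by ancestral sampling from a two-node network, p(r, phi) = p(r) p(phi | r), and compares the result with sampling the same marginals independently. The parameter names, the distributions, and the QoI phi * r^2 are all hypothetical choices made for this example; they are not the model used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# Root node (hypothetical): pore radius r with a lognormal prior p(r).
radius = rng.lognormal(mean=0.0, sigma=0.3, size=n)

# Child node (hypothetical): porosity phi drawn from p(phi | r),
# so the joint PDF factorizes as p(r) * p(phi | r).
porosity = np.clip(
    0.3 + 0.1 * np.log(radius) + 0.02 * rng.standard_normal(n), 0.01, 0.99
)

# Same marginals, but with the causal link destroyed by a random permutation.
porosity_indep = rng.permutation(porosity)

def toy_qoi(phi, r):
    """Hypothetical macroscopic QoI; a stand-in, not a physical closure."""
    return phi * r**2

corr_causal = np.corrcoef(radius, porosity)[0, 1]
corr_indep = np.corrcoef(radius, porosity_indep)[0, 1]
var_causal = toy_qoi(porosity, radius).var()
var_indep = toy_qoi(porosity_indep, radius).var()

print(f"corr (causal)      = {corr_causal:.3f}")
print(f"corr (independent) = {corr_indep:.3f}")
print(f"Var[QoI] causal / independent = {var_causal / var_indep:.2f}")
```

In this toy setting the marginals are identical in both cases, so any difference in the spread of the QoI comes only from the dependence structure encoded by the network.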
- Causality and Bayesian Network PDEs for multiscale representations of porous media, with K. Um, M. Katsoulakis, and D. Tartakovsky, Journal of Computational Physics, 394 (2019), p. 658–678.
Robust information divergences for model-form uncertainty in random PDE
Steady-state subsurface flow is a complex system described by a random PDE of elliptic type where the diffusion coefficient is given by a geostatistical model. The geostatistical model, with properties inferred from relevant data, represents a source of model-form uncertainty that poses a challenge for making predictions of quantities of interest. In this work, we describe a novel application of hybrid information divergences to a steady-state flow model in the context of a decision support framework. Information divergences, based on the Donsker-Varadhan variational principle from large deviations theory, have a form that balances observable and data-dependent quantities. Although information divergences have been used for sensitivity analysis in other contexts, the presence of model-form uncertainty necessitates distinguishing various uncertain aspects of the system to capture essential features. The hybrid nature of the information divergences presented in this work allows us to represent, aggregate, and distinguish sources of uncertainty and in particular to quantify the propagation of model-form uncertainty. We derive tight and robust bounds for modeling errors and apply these bounds to important uncertainty quantification tasks including parametric sensitivity analysis and gauging model misspecification due to sparse data. We also explore data-informed predictions, such as quantifying the impact of incomplete data on a quantity of interest. Further, we leverage the connection between the hybrid information divergences and certain concentration inequalities for efficient computing.
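For background, the Donsker-Varadhan variational principle and the standard bound it yields can be written as follows. The notation is assumed for this note only: P is a nominal model, Q an alternative model, f a quantity of interest, and R(Q || P) the relative entropy. The hybrid divergences developed in the paper build on this structure, and their specific form is not reproduced here.

```latex
% Donsker-Varadhan variational formula for the relative entropy,
% with the supremum taken over bounded measurable functions g:
R(Q \,\|\, P) = \sup_{g} \left\{ \mathbb{E}_Q[g] - \log \mathbb{E}_P\!\left[ e^{g} \right] \right\}.

% Choosing g = c\,(f - \mathbb{E}_P[f]) with c > 0 and rearranging gives
\mathbb{E}_Q[f] - \mathbb{E}_P[f]
  \le \inf_{c > 0} \left\{
    \frac{1}{c} \log \mathbb{E}_P\!\left[ e^{c\,(f - \mathbb{E}_P[f])} \right]
    + \frac{1}{c}\, R(Q \,\|\, P)
  \right\}.
```

The first term on the right involves only the observable f under the nominal model P, while the second involves only the divergence between the two models. This is the sense in which such bounds balance observable and data-dependent quantities.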
- Robust information divergences for model-form uncertainty arising from sparse data in random PDE, with M. Katsoulakis, SIAM/ASA Journal on Uncertainty Quantification, 6 (2018), p. 1364–1394.
Uncertainty quantification for the generalized Langevin equation
The generalized Langevin equation (GLE) is a non-Markovian stochastic dynamical system (a system with "memory") that models anomalous diffusion arising in the context of viscoelastic media such as complex biofluids. Under physically reasonable assumptions, a Markovian approximation to the GLE exists and this Markovian extended variable formulation contains many parameters that must be tuned. Although uncertainties in drift, vibration, and tracking measurements may be present in the microrheology data used to tune the GLE model, a central problem here is one of epistemic or model-form uncertainty. There is a wealth of data, but few methods that allow one to compare and evaluate the models that are suggested by this data. Therefore, a key challenge is analyzing the sensitivity of the dynamics to local and global perturbations. However, well-known sensitivity analysis techniques such as likelihood ratio and pathwise methods are not applicable to key parameters of interest in this context. In particular, it is relevant to understand discrete changes in the model such as perturbations to the degrees of freedom of the extended variable system.
In this work we present efficient finite difference estimators for goal-oriented sensitivity indices. These easily implemented estimators are formed by coupling the nominal and perturbed dynamics appearing in the finite difference through a common driving noise, or common random path. After developing a general framework for variance reduction via coupling, we demonstrate the optimality of the common random path coupling in the sense that it produces a minimal variance surrogate for the difference estimator relative to sampling dynamics driven by independent paths. In order to build intuition for the common random path coupling, we evaluate the efficiency of the proposed estimators for a comprehensive set of examples of interest in particle dynamics. These reduced variance difference estimators are also a useful tool for performing global sensitivity analysis and for investigating non-local perturbations of parameters, such as discrete changes to the degrees of freedom in the extended variable system.
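The variance reduction from the common random path coupling can be seen in a minimal sketch. As a stand-in for the extended variable GLE system, the example assumes a scalar Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW discretized by Euler-Maruyama, with the observable X_T^2 and a forward difference in theta. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_ou(theta, noise, x0=1.0, sigma=0.5, dt=0.01):
    """Euler-Maruyama for dX = -theta*X dt + sigma dW.

    `noise` has shape (n_samples, n_steps) and holds standard normal
    increments; returns the terminal value X_T of each sample path.
    """
    x = np.full(noise.shape[0], x0)
    for k in range(noise.shape[1]):
        x = x - theta * x * dt + sigma * np.sqrt(dt) * noise[:, k]
    return x

def fd_samples(theta, eps, n_samples, n_steps, coupled, rng):
    """Samples of the forward difference (f(theta + eps) - f(theta)) / eps."""
    noise = rng.standard_normal((n_samples, n_steps))
    # Coupled: perturbed dynamics reuse the same driving noise.
    # Uncoupled: perturbed dynamics get an independent noise path.
    noise_pert = noise if coupled else rng.standard_normal((n_samples, n_steps))
    f_nominal = simulate_ou(theta, noise) ** 2
    f_perturbed = simulate_ou(theta + eps, noise_pert) ** 2
    return (f_perturbed - f_nominal) / eps

rng = np.random.default_rng(0)
coupled = fd_samples(1.0, 0.05, 2000, 100, True, rng)
independent = fd_samples(1.0, 0.05, 2000, 100, False, rng)

var_coupled = coupled.var()
var_independent = independent.var()
print(f"variance, common random path: {var_coupled:.4f}")
print(f"variance, independent paths:  {var_independent:.4f}")
```

Both estimators target the same finite difference, but with independent paths the variance of the difference scales like 1/eps^2, whereas the coupled paths stay close to each other and the variance remains bounded as eps shrinks.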
- Uncertainty quantification for generalized Langevin dynamics, with M. Katsoulakis and L. Rey-Bellet, Journal of Chemical Physics, 145 (2016), p. 224108.
Computable estimates for random PDE with rough stochastic coefficients
In the geophysics literature, elliptic PDE with uncertain data arise in the study of time-independent groundwater flow on the local scale (on the order of hundreds of meters). In applications, prescribing the conductivity requires more information than is reasonably possible to acquire; a common feature of groundwater flow at the local scale is the spatial heterogeneity of the medium. Such uncertainty in the problem data is incorporated by modeling the conductivity as a random field. In applications to subsurface flow, the logarithm of the hydraulic conductivity is assumed to be a normal field with Lipschitz covariance, so that the conductivity itself is lognormal.
In this work we give a computable estimate for observables of the Galerkin error committed in standard piecewise linear finite element approximations. For this model, the standard a posteriori analysis for the dual weighted residual does not give a reliable estimate. In contrast to the case of a smooth conductivity, we see that for a rough lognormal conductivity the residual contains non-negligible high frequency content.
Analyzing the frequency content for the one-dimensional problem suggests that this high frequency contribution can be approximated by low frequency content. A related assumption on scales yields a computable estimator of the Galerkin error observable based on local error indicators. We also obtain estimators for the expected quadrature error committed in the finite element approximations as, in contrast to the case of a smooth conductivity, the quadrature error is observed to be on the same order as the Galerkin error in this setting. These estimates, derived using easily validated assumptions, can be computed at a relatively low cost and fill a gap by providing a novel computational tool for PDE with rough stochastic coefficients.
- Computable error estimates for finite element approximations of elliptic partial differential equations with rough stochastic data, with H. Hoel, M. Sandberg, A. Szepessy, R. Tempone, SIAM Journal on Scientific Computing, 38 (2016), p. A3773–A3807.
Accelerated finite difference schemes for SPDE
Filtering problems pertain to a stochastic dynamical system, possibly nonlinear, modeled by a signal process that cannot be sampled directly and an observation process that yields partial (noisy) information about the signal. The goal of the filtering problem is to estimate the density of a functional of the signal process at a given time conditioned on knowledge of the observation process. The evolution of such a density can be described by the solution to the Cauchy problem for the Zakai equation, a second-order linear SPDE of parabolic type. In applications, such as satellite tracking and guidance, solutions are desired in real time and information about the error is required in a stochastically strong, or pathwise, manner. Since analytic solutions are unavailable, there is a demand for accurate and effective numerical schemes for approximating the solutions to these problems. My PhD research focuses on the analysis of finite difference approximations for a class of parabolic SPDEs that includes the Zakai equation. In these works, I give sufficient conditions under which Richardson's method can be used to obtain higher-order spatial approximations. I also extend these results to the case of degenerate parabolic SPDE, a significant contribution as one is not guaranteed the strong parabolicity condition in physically relevant applications.
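The idea behind Richardson's method can be sketched in a purely deterministic setting, which is a simplification of the SPDE case treated in these works. Assuming the centered second difference has an error expansion in even powers of the mesh size h, combining the approximations at h and h/2 cancels the leading h^2 term. The test function and evaluation point below are arbitrary choices for illustration.

```python
import numpy as np

def second_difference(u, x, h):
    """Centered finite difference approximation of u''(x); error is O(h^2)."""
    return (u(x + h) - 2.0 * u(x) + u(x - h)) / h**2

def richardson(u, x, h):
    """Combine mesh sizes h and h/2 so that the h^2 error term cancels."""
    return (4.0 * second_difference(u, x, h / 2) - second_difference(u, x, h)) / 3.0

x0, h = 0.7, 0.1
exact = -np.sin(x0)  # second derivative of sin at x0

err_plain = abs(second_difference(np.sin, x0, h) - exact)
err_rich = abs(richardson(np.sin, x0, h) - exact)
print(f"error, centered difference:     {err_plain:.2e}")
print(f"error, Richardson extrapolated: {err_rich:.2e}")
```

The extrapolated value costs one additional evaluation on the finer mesh and raises the order of accuracy from two to four for a smooth function, provided the error expansion in powers of h holds.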
- Higher order spatial approximations for degenerate parabolic stochastic partial differential equations, SIAM Journal on Mathematical Analysis, 45 (2013), p. 2071–2098.
- Accelerated spatial approximations for time discretized stochastic partial differential equations, SIAM Journal on Mathematical Analysis, 44 (2012), p. 3162–3185.
- Accelerated Numerical Schemes for Deterministic and Stochastic Partial Differential Equations of Parabolic Type, Ph.D. thesis, University of Edinburgh, 2013.