
Dec 19 Tuesday

Schedule

7:00: Breakfast

8:45: Opening

9:00: Ilias Diakonikolas UW Madison Algorithmic Robust Statistics

The field of Robust Statistics studies the problem of designing estimators that perform well even when the data significantly deviates from the idealized modeling assumptions. The classical statistical theory, going back to the pioneering works by Tukey and Huber in the 1960s, characterizes the information-theoretic limits of robust estimation for a number of statistical tasks. On the other hand, until fairly recently, the computational aspects of this field were poorly understood. Specifically, no scalable robust estimation methods were known in high dimensions, even for the most basic task of mean estimation.

A recent line of work in computer science developed the first computationally efficient robust estimators in high dimensions for a range of learning tasks. This tutorial will provide an overview of these algorithmic developments and discuss some open problems in the area.
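As a concrete illustration of the flavor of these algorithmic developments, here is a minimal numpy sketch of the spectral filtering idea for high-dimensional robust mean estimation. The function name, the variance threshold and the removal rule below are simplified placeholders for illustration, not the exact algorithm from the tutorial:

```python
import numpy as np

def filtered_mean(X, eps, var_threshold=10.0, max_iter=50):
    """Sketch of spectral filtering: while the empirical covariance has a
    suspiciously large top eigenvalue, remove the points that are most
    extreme along the corresponding eigenvector."""
    X = np.asarray(X, dtype=float)
    for _ in range(max_iter):
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)
        top_val, top_dir = eigvals[-1], eigvecs[:, -1]
        if top_val <= var_threshold:        # covariance looks benign: stop
            return mu
        scores = ((X - mu) @ top_dir) ** 2  # outlyingness along worst direction
        X = X[scores < np.quantile(scores, 1.0 - eps)]  # drop top eps fraction
    return X.mean(axis=0)

# Toy check: a Gaussian sample in d = 50 with 10% of the points moved far away.
rng = np.random.default_rng(0)
n, d = 2000, 50
X = rng.normal(size=(n, d))
X[: n // 10] += 20.0                              # adversarial contamination
print(np.linalg.norm(X.mean(axis=0)))             # naive mean: badly corrupted
print(np.linalg.norm(filtered_mean(X, eps=0.1)))  # filtered mean: close to 0
```

The toy check shows the basic phenomenon: a constant fraction of outliers pulls the empirical mean arbitrarily far, while the spectral filter detects the contaminated direction and recovers an estimate close to the true mean.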

Back to day schedule


10:00: Coffee break

Session: The Marks’ session

Chair: Natalia Bochkina, University of Edinburgh

10:20: Markus Reiß HU Berlin TBA - Theory Based on Applications

We develop a general lower bound framework for parameter estimation in stochastic evolution equations under noisy observations. This already gives new structural insights for Ornstein-Uhlenbeck processes and, in particular, clarifies which parameters in a stochastic partial differential equation (SPDE) can be estimated well or badly. The framework extends to nonparametric problems involving some non-trivial analysis, but fundamental open problems remain, connected to a certain elbow effect. The theory is inspired by applications to cell motility experiments, where the estimators enjoy a very desirable robustness property.

Based on joint work with Randolf Altmeyer and Gregor Pasemann
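For orientation only (the talk itself concerns the harder setting of noisy observations and SPDEs): in the simplest scalar Ornstein-Uhlenbeck model observed continuously and without noise, the model and its classical maximum likelihood drift estimator read

\[
dX_t = -\theta X_t\,dt + \sigma\,dW_t, \qquad
\hat\theta_T = -\frac{\int_0^T X_t\,dX_t}{\int_0^T X_t^2\,dt},
\]

with \(\sqrt{T}\,(\hat\theta_T - \theta) \Rightarrow \mathcal N(0, 2\theta)\) as \(T \to \infty\). Lower bound frameworks of the type described above ask how such benchmarks deteriorate once the parameter enters an SPDE and the observations are corrupted by noise.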

Back to day schedule

In this presentation we review some recent results on non-parametric estimation of the interaction function in high-dimensional particle systems (McKean-Vlasov SDEs). Such particle systems were originally introduced as models for gas particles in physics, but have since found numerous applications in, e.g., biology and finance. In particular, we consider a particle system whose drift is the convolution of an unknown interaction function with the empirical measure of the system. Our main goal is to construct a non-parametric estimator of the interaction function and to study its rate of convergence. This statistical inverse problem turns out to be quite non-standard, and we will comment on various mathematical issues.
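In one common formulation (sign and diffusion conventions vary from paper to paper), such an interacting particle system with unknown interaction function \(\varphi\) reads

\[
dX_t^i = -\big(\varphi * \mu_t^N\big)(X_t^i)\,dt + dW_t^i
       = -\frac{1}{N}\sum_{j=1}^N \varphi\big(X_t^i - X_t^j\big)\,dt + dW_t^i,
\qquad \mu_t^N = \frac{1}{N}\sum_{j=1}^N \delta_{X_t^j},
\quad i = 1,\dots,N,
\]

and the statistical task described above is to recover \(\varphi\) non-parametrically from an observed trajectory of the system.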

Back to day schedule

11:40: Marc Hoffmann Paris-Dauphine University On estimating multidimensional diffusions from discrete data

We revisit an old statistical problem: estimate non-parametrically the drift vector field and the diffusion matrix of a diffusion process from discrete data \((X_0, X_D, X_{2D}, \ldots, X_{ND})\). The novelties are: (i) the multivariate case: only a few results have been obtained in this setting from discrete data (and, to the best of our knowledge, no results for the diffusion matrix); (ii) the sampling scheme has high frequency but is arbitrarily slow: \(D = D_N \rightarrow 0\) and \(ND_N^q\) bounded for some possibly arbitrarily large \(q\) (à la Kessler); (iii) the process lies in a (not necessarily convex, nor necessarily bounded) domain of \(\mathbb R^d\) with reflection at the boundary. (In particular we recover the case of a bounded domain or the whole Euclidean space \(\mathbb R^d\).) We establish a relatively standard minimax (adaptive) program for integrated squared error loss over bounded domains (and further losses in the simpler case of the drift) over standard smoothness classes. When \(ND_N^2 \rightarrow 0\), and in the special case of the conductivity equation over a bounded domain, we actually obtain contraction rates in squared error loss in a nonparametric Bayes setting. The main difficulty here lies in controlling small ball probabilities for the likelihood ratios; we develop small-time expansions of the heat kernel, with a bit of Riemannian geometry, to control adequate perturbations in KL divergence, using old ideas of Azencott and others. That last part is joint work with K. Ray.

Although this problem could have been addressed methodologically almost two decades ago, we rely heavily on the substantial progress made in the area since then, which clarifies and quantifies the stability of ergodic averages via concentration and chaining techniques together with explicit mixing bounds (Dirksen, Nickl, Paulin, Ray, Reiß, Strauch and many others).
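To fix ideas (this is the textbook Nadaraya-Watson-type construction based on Euler increments, not necessarily the estimator analysed in the talk), a pointwise estimator of the drift at \(x \in \mathbb R^d\) can be built from the discrete sample as

\[
\hat b_h(x) \;=\; \frac{\sum_{k=0}^{N-1} K_h\big(X_{kD}-x\big)\,\dfrac{X_{(k+1)D}-X_{kD}}{D}}
                      {\sum_{k=0}^{N-1} K_h\big(X_{kD}-x\big)},
\qquad K_h(u) = h^{-d}K(u/h),
\]

with a kernel \(K\) and a bandwidth \(h\) to be chosen adaptively; analogous quotient estimators based on outer products of increments target the diffusion matrix.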

Back to day schedule


12:30: Lunch


16:00: Round Table: Fairness and robustness in AI

17:00: Coffee break

Session: Generative modeling

Chair: Johannes Schmidt-Hieber, University of Twente

17:40: Arnak Dalalyan ENSAE Generative Modeling with Maximum Deviation from the Empirical Distribution

Generative modeling is a widely used machine learning method with various applications in scientific and industrial fields. Its primary objective is to simulate new examples drawn from an unknown distribution, given training data, while ensuring diversity and avoiding replication of examples from the training data.

In this talk, we present theoretical insights into training a generative model with two properties: (i) the error of replacing the true data-generating distribution with the trained data-generating distribution should optimally converge to zero as the sample size approaches infinity, and (ii) the trained data-generating distribution should be far enough from any distribution replicating examples in the training data.

We provide non-asymptotic results in the form of finite sample risk bounds that quantify these properties and depend on relevant parameters such as sample size, the dimension of the ambient space, and the dimension of the latent space. Our results are applicable to general integral probability metrics used to quantify errors in probability distribution spaces, with the Wasserstein-1 distance being the central example.
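For reference, the integral probability metric associated with a class \(\mathcal F\) of test functions, and its Wasserstein-1 special case, are

\[
d_{\mathcal F}(\mu,\nu) = \sup_{f\in\mathcal F}\Big|\int f\,d\mu-\int f\,d\nu\Big|,
\qquad
W_1(\mu,\nu) = \sup_{\mathrm{Lip}(f)\le 1}\Big|\int f\,d\mu-\int f\,d\nu\Big|,
\]

so the two requirements above can be read as: the trained distribution should be close to the true data-generating distribution and far from any distribution that merely replicates the training examples, both measured in \(d_{\mathcal F}\).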

Back to day schedule

18:20: Andrej Risteski Carnegie Mellon University Representational power of generative models - a view from conditioning

Over the past decade there has been a proliferation of different types of generative models, e.g. GANs, normalizing flows, neural ODEs and diffusion models. Very little is known about the relative strengths and weaknesses of these models at a foundational level, which makes it difficult to tell whether the performance of a particular family is due to a feat of engineering or to a fundamental advantage.

In this talk, I will focus on Lipschitzness and robust invertibility as a lens to reason about the power of different types of generative models—in particular normalizing flows with affine couplings, score-based diffusion models, and general pushforward models. In many of these families, non-Lipschitz or close-to-singular maps cause numerical instabilities during training, make evaluating the model brittle on points far from the data manifold, and can require deeper neural networks to fit them. Technically, we uncover interesting interplays between the existence of Lipschitz maps and geometric properties of the data distribution, as well as the analytic behavior of certain SDEs used to sample from it.

Based on https://arxiv.org/abs/2010.01155, https://arxiv.org/abs/2107.02951, and some work currently in progress.
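As a small illustration of the kind of object the talk is about, here is a minimal numpy sketch of one affine coupling layer, the RealNVP-style building block of the flows mentioned above. The linear scale/shift maps and the tanh clamp are placeholder choices for this sketch (a real flow would use small neural networks), the clamp being one simple way to keep the layer bi-Lipschitz:

```python
import numpy as np

class AffineCoupling:
    """One affine-coupling layer: the first half of the coordinates passes
    through unchanged and parameterizes an invertible affine map (scale and
    shift) of the second half."""

    def __init__(self, dim, rng, clamp=2.0):
        self.half = dim // 2
        # Placeholders for the scale/shift networks s(.) and t(.).
        self.Ws = rng.normal(scale=0.1, size=(self.half, dim - self.half))
        self.Wt = rng.normal(scale=0.1, size=(self.half, dim - self.half))
        self.clamp = clamp  # bounding the log-scale keeps the layer bi-Lipschitz

    def _scale_shift(self, x1):
        s = self.clamp * np.tanh(x1 @ self.Ws)  # bounded log-scale
        t = x1 @ self.Wt
        return s, t

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self._scale_shift(x1)
        return np.concatenate([x1, x2 * np.exp(s) + t], axis=1)

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self._scale_shift(y1)
        return np.concatenate([y1, (y2 - t) * np.exp(-s)], axis=1)

rng = np.random.default_rng(0)
layer = AffineCoupling(dim=6, rng=rng)
x = rng.normal(size=(4, 6))
print(np.allclose(layer.inverse(layer.forward(x)), x))  # True: exact inversion
```

Exact invertibility comes for free from the triangular structure; whether the map and its inverse can also be kept Lipschitz while fitting a given data distribution is the kind of question discussed in the talk.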

Back to day schedule

19:30: Dinner