Dec 20 Wednesday
Schedule
- 7:00: Breakfast
- 8:45: Opening
- 9:00: John Duchi (Stanford University): Privacy - Definitions, Procedures, Open Problems
- 10:00: Coffee break
- 10:20: Guillaume Lecué (ESSEC Business School): Learning with a linear loss function. Excess risk and estimation bounds for ERM, minmax MOM and their regularized versions.
- 11:00: Marco Avella Medina (Columbia University): M-estimation, noisy optimization and user-level (local) privacy.
- 11:40: Aleksei Kroshnin (WIAS Berlin): Robust k-means clustering in metric spaces
- 12:30: Lunch
- Free Afternoon
- 19:30: Dinner
7:00: Breakfast
8:45: Opening
9:00: John Duchi (Stanford University): Privacy - Definitions, Procedures, Open Problems
I will provide a broad overview of differential privacy, which gives guarantees that a data analysis protects the privacy of data contributors. The main focus will be on the private computation and release of statistics, both classical (low-dimensional) and high-dimensional. In addition to giving a high-level program for the development of optimal private estimators, I will likely discuss a few open questions as well.
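As a hedged illustration of the kind of guarantee discussed above (not material from the talk), the sketch below applies the standard Laplace mechanism to release a clipped mean privately; the function name `private_mean`, the clipping range and the privacy budget are illustrative choices.

```python
# Illustrative sketch: the Laplace mechanism for releasing the mean of
# bounded data under epsilon-differential privacy.
import numpy as np

def private_mean(x, epsilon, lower=0.0, upper=1.0):
    """Epsilon-DP mean of values clipped to [lower, upper] (illustrative)."""
    x = np.clip(np.asarray(x, dtype=float), lower, upper)
    n = len(x)
    # Changing one record moves the clipped mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return x.mean() + noise

# Example: release a mean with privacy budget epsilon = 0.5.
rng = np.random.default_rng(0)
data = rng.uniform(size=1000)
print(private_mean(data, epsilon=0.5))
```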
10:00: Coffee break
Session: Robust statistics
Chair: Anru Zhang (Duke University)
10:20: Guillaume Lecué (ESSEC Business School): Learning with a linear loss function. Excess risk and estimation bounds for ERM, minmax MOM and their regularized versions.
Motivated by several examples, we consider a general framework of learning with linear loss functions. In this context, we provide excess risk and estimation bounds that hold with large probability for four estimators: ERM, minmax MOM and their regularized versions. These general bounds are applied to the problem of robustness in sparse PCA. In particular, we improve the state-of-the-art result for this problem and obtain results under weak moment assumptions as well as for adversarially contaminated data. We also apply our general bounds to community detection, signed clustering, and angular synchronization.
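For readers unfamiliar with MOM, the sketch below shows only the basic median-of-means building block that minmax MOM estimators are constructed around: split the sample into blocks and take the median of the block means. It is an illustrative toy (the function name and block count are arbitrary choices), not the estimators or bounds of the talk.

```python
# Illustrative median-of-means (MOM) mean estimator: robust to heavy tails
# and to a small number of contaminated blocks.
import numpy as np

def median_of_means(x, n_blocks, seed=0):
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))                 # shuffle before splitting into blocks
    blocks = np.array_split(idx, n_blocks)
    block_means = np.array([x[b].mean() for b in blocks])
    return np.median(block_means)

# Heavy-tailed sample with a few gross outliers: MOM stays near 0,
# while the empirical mean is dragged away.
rng = np.random.default_rng(1)
sample = rng.standard_t(df=2.5, size=10_000)
sample[:20] = 1e4
print(sample.mean(), median_of_means(sample, n_blocks=50))
```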
11:00: Marco Avella Medina (Columbia University): M-estimation, noisy optimization and user-level (local) privacy.
We propose a general optimization-based framework for computing differentially private M-estimators. We first show that bounded-influence M-estimators can be used in conjunction with noisy gradient descent or noisy Newton methods in order to obtain optimal private estimators with global linear and quadratic convergence, respectively. We establish global convergence guarantees showing that our private estimators converge with high probability to a nearly optimal neighborhood of the non-private M-estimators. We then consider the same problem under local differential privacy (LDP) constraints, where a group of users, each possessing multiple data points, communicates with an untrusted central server that learns the parameters of an underlying model. Contrary to most works that aim to protect a single data point, here we focus on user-level privacy, which aims to protect the entire set of data points belonging to a user. Our main algorithm is a noisy version of the standard gradient descent algorithm, combined with a user-level LDP mean estimation procedure to privately compute the average gradient across users at each step. We will highlight the challenges associated with guaranteeing user-level privacy, and then present performance guarantees for our algorithm in the form of finite sample bounds on the parameter estimation error.
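The following is a minimal sketch of the noisy gradient descent idea described above, assuming a squared loss with per-example gradient clipping as a stand-in for bounded influence; the noise scale `sigma` is left as a free parameter rather than being calibrated to a formal privacy budget, and none of the names come from the authors' code.

```python
# Illustrative noisy gradient descent: clip per-example gradients, average,
# add Gaussian noise, take a step.
import numpy as np

def noisy_gradient_descent(X, y, sigma, clip=1.0, lr=0.1, n_steps=200):
    n, d = X.shape
    theta = np.zeros(d)
    rng = np.random.default_rng(0)
    for _ in range(n_steps):
        residuals = X @ theta - y
        per_example_grads = residuals[:, None] * X           # squared-loss gradients
        norms = np.linalg.norm(per_example_grads, axis=1)
        scale = np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        clipped = per_example_grads * scale[:, None]          # bounded "influence"
        grad = clipped.mean(axis=0)
        noise = rng.normal(scale=sigma * clip / n, size=d)    # privacy noise (uncalibrated)
        theta -= lr * (grad + noise)
    return theta

# Toy linear model: the noisy iterates land near the least-squares solution.
rng = np.random.default_rng(2)
X = rng.normal(size=(2000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=2000)
print(noisy_gradient_descent(X, y, sigma=0.5))
```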
11:40: Aleksei Kroshnin (WIAS Berlin): Robust k-means clustering in metric spaces
In this talk, we consider robust algorithms for the k-means clustering (quantization) problem, where a quantizer is constructed based on an i.i.d. sample. While the well-known asymptotic result by Pollard shows that the existence of two moments is sufficient for strong consistency of an empirically optimal quantizer in a Euclidean space, non-asymptotic bounds are usually obtained under the assumption of bounded support. We discuss a robust k-means procedure in metric spaces based on trimming (similar to the one proposed by Cuesta-Albertos et al. in 1997), and prove novel non-asymptotic bounds on the excess distortion depending on the optimal distortion rather than on the second moment of the distribution. In particular, this bound narrows the gap to the lower bound by Bartlett et al. (1998). The talk is based on joint work with Y. Klochkov and N. Zhivotovskiy and on an ongoing project with A. Suvorikova and N. Zhivotovskiy.
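As a rough illustration of trimming in the spirit of Cuesta-Albertos et al. (1997), the sketch below runs Lloyd-style iterations in Euclidean space and discards the alpha-fraction of points farthest from their nearest center before updating the centers; it is an assumption-laden toy (function name, initialization and parameters are illustrative), not the algorithm analyzed in the talk.

```python
# Illustrative trimmed k-means: at each Lloyd step, drop the alpha-fraction of
# points farthest from their nearest center, then recompute centers.
import numpy as np

def trimmed_kmeans(X, k, alpha=0.1, n_iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # simple random init
    n_keep = int((1 - alpha) * len(X))
    for _ in range(n_iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        nearest_dist = dists[np.arange(len(X)), nearest]
        keep = np.argsort(nearest_dist)[:n_keep]              # trim the farthest points
        for j in range(k):
            members = keep[nearest[keep] == j]
            if len(members) > 0:
                centers[j] = X[members].mean(axis=0)
    return centers

# Two well-separated clusters plus gross outliers.
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (200, 2)),
               rng.normal(8, 1, (200, 2)),
               rng.uniform(-50, 50, (20, 2))])
print(trimmed_kmeans(X, k=2, alpha=0.1))
```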