Oral Communications

 

OC1

Marcel Klatt

Institute for Mathematical Stochastics University of Goettingen (Germany)
marcel.klatt@stud.uni-goettingen.de

Title: Distributional limits for optimal transport on finite spaces
Abstract:
We present recently derived distributional limit results for empirical transport distances of probability measures supported on finite and discrete spaces. For the finite setting we extend these results to regularized empirical transport distances and show further consistency of the (naive) bootstrap which in contrast fails for the empirical transport distance. In particular, we prove that the regularized transport plan itself asymptotically follows a high dimensional Gaussian law. The theory includes the Boltzmann-Shannon entropy regularization and hence a limit law for the widely applied dual Sinkhorn divergence. For the nonregularized transport distance the proof strategy is based on a sensitivity analysis for the optimal value of the transport problem in conjunction with a refined delta method. However, for regularized transport distances we require a sensitivity analysis for the regularized transport plan, i.e., the optimal solution rather than the optimal value. This is accomplished by an application of the implicit function theorem to necessary and sufficient optimality conditions. We illustrate the results with synthetic data.(This is joint work with Max Sommerfeld, Carla Tameling and Axel Munk).

OC2

Paula Gordaliza

Université de Toulouse (France)
pgordaliza94@gmail.com

Title: Obtaining fairness using optimal transport theory
Abstract:
Statistical algorithms are usually helping in making decisions in many aspects of our lives. But, how do we know if these algorithms are biased, involve ilegal discrimination or are unfair? Fairness is generally studied in a probabilistic framework where it is assumed that there exists a protected variable, whose use as an input of the algorithm may imply discrimination. There are different definitions of Fairness in the literature. In this paper we focus on two of them which are called Disparate Impact (DI) and Balanced Error Rate (BER). Both are based on the outcome of the algorithm across the different groups determined by the protected variable. The relationship between these two notions is also studied. The goals of this paper are to detect when a binary classification rule lacks of fairness and to fight against the potential discrimination committed by it. This can be done by modifying either the classifiers or the data itself. Our work falls into the second category and modifies the input data using optimal transport theory.

OC3

Hristo Inouzhe Valdes

Universidad de Valladolid (Spain)
hristo.inouzhe@uva.es

Title: Clustering meta-analysis based on Wasserstein barycenters
Abstract:
Cluster analysis addresses the detection of data grouping in data sets. Within this, too vague, description, model-based clustering aims to find particularly shaped groupings -clusters- according to specified distributions. In this setting, the clusters provided by the method are described by probability (often Gaussian) distributions, that can be considered as elements of an abstract space. Particular interest has been deserved by the L_2 Wasserstein distance, leading to a rich set-up for developing statistical concepts in a parallel way to those known on Euclidean spaces. This is the case of the k-barycenters, the abstract version of k-means, by large the widest used method in clustering problems, recently introduced in the Wasserstein space even in a robust version. We focus on the application of the (trimmed) Wasserstein k-barycenters to some of the fundamental problems present in cluster analysis. This includes parallelization or stabilization of procedures and even improvement of initial solutions for the algorithms involved in the methods, but we will also pay special attention to the meta-analysis tools arising from this robust aggregation procedure: Stability (or coherence) criteria, applied to the provided aggregation, will give descriptive signs on the number of clusters or on the adequacy of the clustering procedure. We present illustrative examples of the previously mentioned concepts.

OC4

Aleksei Kroshnin

Institute for Information Transmission Problems (Rusia)
kroshnin@phystech.edu

Title: On new complexity bounds for Sinkhorn’s and Iterative Bregman Projections algorithms
Abstract:
Sinkhorn’s algorithm is a popular method of solving Monge-Kantorovich problem and Iterative Bregman Projections (Cuturi et al., 2014) is its generalization to other related problems e.g. computing of Wasserstein barycenters. In this talk we present a new complexity bound \(O(n^2 / \epsilon^2)\) for this family of algorithms what improves the best known bound \(O(n^2 / \epsilon^3)\) for Sinkhorn’s algorithm.