The Probability and Statistics seminar is a meeting of the research teams of Prof. Baraud, Prof. Nourdin, Prof. Peccati, Prof. Podolskij, Prof. Ley and Prof. Thalmaier. Its aim is to present both research works and surveys of mathematical areas of common interest. An archive of talks before 2020 can be seen here.

# Probability & Statistics Seminar

- Thursday 12.05.2022, 1pm, TBA
**Federico Camia (NYU Abu Dhabi)**, *Conformal Probability: A Personal Perspective* **Abstract:** The last two decades have seen the emergence of a new area of probability theory concerned with random fractal structures characterized by a certain invariance under conformal (angle-preserving) transformations. These structures often emerge when taking the continuum scaling limit of two-dimensional models of statistical mechanics with parameters chosen at or near those values where a continuous phase transition occurs. The study of such structures has had deep repercussions on both mathematics and physics, generating tremendous progress in areas ranging from probability theory to statistical mechanics to conformal field theory. In this talk, I will give a personal perspective on some aspects of this new area of probability theory, focusing for concreteness on two specific examples, the Ising model and percolation.

- Thursday 05.05.2022, 1pm, MNO 1.050
**Isao Sauzedde (Oxford University)**, *A random rough path extension for the Brownian motion* **Abstract:** We will talk about some pathwise properties of the planar Brownian motion, related to its winding around deterministic and random points. This will allow us to state a Green formula, which describes Stratonovich integrals as areas enclosed by the path. Then, we will consider the average winding around the points of a Poisson process with large intensity. In this model, motivated by the physics of an electron moving inside a crystal, the law of large numbers fails and must be replaced with a 1-stable CLT. By endowing the Brownian motion with a random non-continuous rough path extension, we will describe the physical effects of the Poisson impurities in the crystal as rough path integrals along the path.

#### Past sessions

- Thursday 28.04.2022, 1pm, TBA
**Charles-Philippe Diez**, *Multidimensional semicircular approximations via free Malliavin-Stein method* **Abstract:** In this talk, focused on free probability, we will define the semicircular distribution and Wigner chaos, which are respectively the free analogues of the Gaussian random variable and Wiener chaos. Firstly, we will discuss how non-commutative stochastic calculus and free Malliavin calculus, invented by Biane and Speicher in 1998, have allowed several authors to contribute to the univariate free Malliavin-Stein method. Secondly, we will use the notion of free Stein kernel introduced by M. Fathi and B. Nelson in 2017, as well as free functional inequalities, to obtain multivariate extensions of these results. In particular, we will provide quantitative bounds for the (non-commutative) Wasserstein distance between self-adjoint vector-valued Wigner integrals and semicircular families with positive definite covariance matrix. This last result can be seen as a (weaker) non-commutative counterpart of the famous theorem of I. Nourdin, G. Peccati and A. Réveillac, which estimates the Wasserstein distance between a vector of Malliavin differentiable random variables and a Gaussian vector with positive definite covariance matrix. Finally, we will provide several examples of applications, such as the rate of convergence in the multivariate free Breuer-Major CLT for the free fractional Brownian motion.

- Thursday 21.04.2022, 1pm, MSA 2.380
**Antoine Ayache (Université de Lille)**, *Lower bounds for local oscillations of Hermite processes* **Abstract:** The best-known example of a class of non-Gaussian stochastic processes belonging to the homogeneous Wiener chaos of an arbitrary order N > 1 is probably that of Hermite processes of rank N. They generalize fractional Brownian motion (fBm) and the Rosenblatt process in a natural way. They were introduced several decades ago. Yet, in contrast with fBm and many other Gaussian and stable stochastic processes and fields related to it, few results on the path behavior of Hermite processes are available in the literature. The goal of our talk is to derive a quasi-optimal lower bound for the asymptotic behavior of local oscillations of paths of Hermite processes of any rank N.

- Thursday 07.04.2022, 1pm, MNO 1.050
**Christophe Ley**, *Statistics meets Sports – when figures are more than numbers* **Abstract:** In this talk I will provide a gentle introduction to the growing world of sport analytics. After talking about its genesis and giving some striking examples from professional sports, I will describe how one can use probability distributions to model the outcomes of football matches, and how this can be combined with machine learning procedures to predict big tournaments and thereby even outperform bookmakers. I will conclude with an outlook on how these findings can be translated to sports medicine and, in particular, the estimation of injury risks.

- Thursday 31.03.2022, 1pm, MNO 1.050
**Bartlomiej Polaczyk (University of Warsaw)**, *From modified log-Sobolev inequalities to Beckner inequalities and moment estimates* **Abstract:** We show the equivalence between the modified log-Sobolev inequality and a family of Beckner inequalities in the context of general Markov processes. As a consequence, we deduce that moment estimates implied by the modified log-Sobolev inequality are of the same form as those implied by the usual log-Sobolev inequality. We illustrate our findings with applications to the Poisson space.

- Thursday 24.03.2022, 1pm, MNO 1.050
**Thomas Verdebout (Université libre de Bruxelles)**, *Asymptotic power of Sobolev tests for uniformity on hyper-spheres* **Abstract:** One of the most classical problems in multivariate statistics is considered, namely, the problem of testing isotropy, or equivalently, the problem of testing uniformity on the unit hypersphere. Rather than restricting to tests that can detect specific types of alternatives only, we consider the broad class of Sobolev tests. While these tests are known to allow for omnibus testing of uniformity, their non-null behavior and consistency rates, unexpectedly, remain largely unexplored. To improve on this, we thoroughly study the local asymptotic powers of Sobolev tests under the most classical alternatives to uniformity, namely, under rotationally symmetric alternatives. We show in particular that the consistency rate of Sobolev tests depends not only on the coefficients defining these tests but also on the derivatives of the underlying angular function at zero.

- Thursday 17.03.2022, 1pm, MNO 1.030
**Matthieu Lerasle (ENSAE)**, *Some phase transition phenomena in graphical data analysis* **Abstract:** I’ll present two problems where data naturally present a graphical structure: the analysis of champions in a tournament and the problem of matching. For each problem, I’ll present intuitive results in toy models and discuss the various mathematical tools involved in proving them. I’ll also present many open problems, hopefully convincing people to jump into this growing area.

- Thursday 03.03.2022, 1pm, MNO 1.050
**Nicolas Chopin (ENSAE)**, *Waste-free sequential Monte-Carlo* **Abstract:** A standard way to move particles in an SMC sampler is to apply several steps of an MCMC (Markov chain Monte Carlo) kernel. Unfortunately, it is not clear how many steps need to be performed for optimal performance. In addition, the outputs of the intermediate steps are discarded and thus wasted. We propose a new, waste-free SMC algorithm which uses the outputs of all these intermediate MCMC steps as particles. We establish that its output is consistent and asymptotically normal. We use the expression of the asymptotic variance to develop various insights on how to implement the algorithm in practice. We develop in particular a method to estimate, from a single run of the algorithm, the asymptotic variance of any particle estimate. We show empirically, through a range of numerical examples, that waste-free SMC tends to outperform standard SMC samplers, and especially so in situations where the mixing of the considered MCMC kernels decreases across iterations (as in tempering or rare event problems).

- Thursday 16.12.2021, 1pm, TBA
**Davy Paindaveine (Université Libre de Bruxelles)**, *Hypothesis testing on high-dimensional spheres: the Le Cam approach* **Abstract:** Hypothesis testing in high dimensions has been one of the most active research topics in the last decade. Both theoretical and practical considerations make it natural to restrict to sign tests, that is, to tests that use observations only through their directions from a given center. This obviously maps the original Euclidean problem to a spherical one, still in high dimensions. With this motivation in mind, we tackle two testing problems on high-dimensional spheres, both under a symmetry assumption that specifies that the distribution at hand is invariant under rotations with respect to a given axis. More precisely, we consider the problem of testing the null hypothesis of uniformity (“detecting the signal”) and the problem of testing the null hypothesis that the symmetry axis coincides with a given direction (“learning the signal direction”). We solve both problems by exploiting Le Cam’s asymptotic theory of statistical experiments, in a double- or triple-asymptotic framework. Interestingly, contiguity rates depend in a subtle way on how well the parameters involved are identified as well as on a possible further antipodally-symmetric nature of the distribution. In many cases, strong optimality results are obtained from local asymptotic normality. When this cannot be achieved, it is still possible to establish minimax rate optimality.

- Thursday 02.12.2021, 1pm, webex
**Béatrice Laurent-Bonneau (INSA-Toulouse)**, *Aggregated tests of independence based on HSIC measures* **Abstract:** Independence measures based on Reproducing Kernel Hilbert Spaces, also known as Hilbert-Schmidt Independence Criterion and denoted HSIC, are widely used to statistically decide whether or not two random vectors are dependent since the seminal work by [Gretton et al., 2005]. Non-parametric HSIC-based statistical tests of independence have been performed, see [Gretton et al., 2008]. However, these tests lead to the question of the choice of the kernels associated to the HSIC. In particular, there is as yet no method to objectively select specific kernels with theoretical guarantees in terms of first and second kind errors. One of the main contributions of this work is to develop a new HSIC-based aggregated procedure which avoids such a kernel choice, and to provide theoretical guarantees for this procedure. To achieve this, we first introduce non-asymptotic single tests based on Gaussian kernels with a given bandwidth, which are of prescribed level α ∈ (0, 1). From a theoretical point of view, we upper-bound their uniform separation rate of testing over Sobolev and Nikol’skii balls. The key tools to obtain the theoretical performances of the test are exponential inequalities for U-statistics due to [Arcones and Giné, 1993] and [Giné et al., 2000]. Then, we aggregate several single tests, and obtain similar upper-bounds for the uniform separation rate of the aggregated procedure over the same regularity spaces. Another main contribution is that we provide a lower-bound for the non-asymptotic minimax separation rate of testing over Sobolev balls, and deduce that the aggregated procedure is adaptive in the minimax sense over such regularity spaces. The non-asymptotic lower bound is based on the work by [Baraud, 2002].
Finally, from a practical point of view, we perform numerical studies in order to assess the efficiency of our aggregated procedure and compare it to existing independence tests in the literature, in particular to the statistical test of independence based on the kernel mutual information recently studied by [Berrett and Samworth, 2017]. The paper is available on Hal [Albert et al., 2020].

- Thursday 18.11.2021, 1pm, webex
**Sumit Mukherjee (Columbia University)**, *Asymptotic distribution of quadratic form* **Abstract:** In this talk we will give an exact characterization for the asymptotic distribution of quadratic forms in IID random variables with finite second moment, where the underlying matrix is the adjacency matrix of a graph. In particular we will show that the limit distribution of such a quadratic form can always be expressed as the sum of three independent components: a Gaussian, a (possibly) infinite sum of centered chi-squares, and a Gaussian with a random variance. As a consequence, we derive necessary and sufficient conditions for asymptotic normality, and universality of the limiting distribution.

- Thursday 11.11.2021, 1pm, TBA
**Denis Belomestny (Duisburg-Essen University)**, *Rates of convergence for density estimation with generative adversarial networks* **Abstract:** In this work we undertake a thorough study of the non-asymptotic properties of the vanilla generative adversarial networks (GANs). We derive theoretical guarantees for the density estimation with GANs under a proper choice of the deep neural networks classes representing generators and discriminators. In particular, we prove that the resulting estimate converges to the true density in terms of Jensen-Shannon (JS) divergence at the rate where is the sample size and determines the smoothness of . To the best of our knowledge, this is the first result in the literature on density estimation using vanilla GANs with JS convergence rates faster than in the regime . Moreover, we show that the obtained rate is minimax optimal for the considered class of densities.

- Thursday 04.11.2021, 1pm, MSA 3.540
**Anatoli Juditsky (Université Grenoble-Alpes)**, *Adaptive estimation from indirect observations* **Abstract:** We discuss an approach to estimate aggregation and adaptive estimation based upon (nearly optimal) testing of convex hypotheses. We show that in the situation where the observations stem from *simple observation schemes* (i.e., have Gaussian, discrete or Poisson distributions) and where the set of unknown signals is a finite union of convex and compact sets, the proposed approach leads to aggregation and adaptation routines with nearly optimal performance. As an illustration, we consider application of the proposed estimates to the problem of recovery of an unknown signal known to belong to a union of (sic) in the Gaussian observation scheme. The corresponding numerical routines can be implemented efficiently when the number of sets in the union is “not very large”. We illustrate the “practical performance” of the method in a numerical example of estimation in the single index model.

- Thursday 14.10.2021, 1pm, MSA auditorium 3.530
**Elisabeth Gassiat (Université Paris-Saclay)**, *Deconvolution with unknown noise distribution* **Abstract:** I consider the deconvolution problem in the case where no information is known about the noise distribution. More precisely, no assumption is made on the noise distribution and no samples are available to estimate it: the deconvolution problem is solved based only on observations of the corrupted signal. I will prove the identifiability of the model up to translation when the signal has a Laplace transform with an exponential growth smaller than 2 and when it can be decomposed into two dependent components, so that the identifiability theorem can be used for sequences of dependent data or for sequences of iid multidimensional data. In the case of iid multidimensional data, I will propose an adaptive estimator of the density of the signal and provide rates of convergence. This rate of convergence is known to be minimax when ρ = 1.

- Thursday 01.07.2021, 1pm, Webex
**Christoph Thäle (Ruhr Universität Bochum)**, *Poisson hyperplanes in hyperbolic space* **Abstract:** The focus of this talk is on random tessellations in hyperbolic space induced by Poisson point processes on the space of hyperbolic hyperplanes (totally geodesic subspaces of co-dimension 1). In the first part of the talk we consider the so-called k-skeleton of such tessellations and prove that, when observed in a sequence of increasing observation windows, the k-volume of the k-skeleton satisfies a central limit theorem only for dimensions 2 and 3 and that asymptotic normality fails in all higher dimensions. We indicate possible generalizations to Poisson processes of lower-dimensional random subspaces as well. If time permits we also describe a way to address the combinatorial structure of the zero cell of a hyperbolic hyperplane tessellation. In particular we present a fully explicit formula for the number of facets of this cell.

- Thursday 24.06.2021, 1pm, Webex
**Maurizia Rossi (University of Milano-Bicocca)**, *Non-universal fluctuations of the empirical measure for sphere-cross-time random fields* **Abstract:** In this talk we consider isotropic and stationary real Gaussian sphere-cross-time random fields and we investigate the large time asymptotic behavior of the empirical measure at any threshold, covering both cases when the field exhibits short and long memory, i.e. integrable and non-integrable temporal covariance. It turns out that the limiting distribution is not universal, depending both on the memory parameters and the threshold. In particular, in the long memory case a form of Berry’s cancellation phenomenon occurs at zero-level, inducing phase transitions for both variance rates and limiting laws. (This talk is based on a joint work with D. Marinucci and A. Vidotto.)

- Thursday 10.06.2021, 1pm, Webex
**Yvik Swan (Université libre de Bruxelles)**, *Stein’s density method for multivariate continuous distributions* Slides, Video (Password: hXPJCnD5)

**Abstract:** We will discuss a general framework for Stein’s density method for multivariate continuous distributions. The approach associates to any probability density function a canonical operator and Stein class, as well as an infinite collection of operators and classes which we call standardizations. These in turn spawn an entire family of Stein identities and characterizations for any continuous distribution on R^d, among which we highlight those based on the score function and the Stein kernel. A feature of these operators is that they do not depend on normalizing constants. A new definition of Stein kernel is introduced and examined; integral formulas are obtained through a connection with mass transport, as well as ready-to-use explicit formulas for elliptical distributions. The flexibility of the kernels is used to compare in Stein discrepancy (and therefore 2-Wasserstein distance) between two normal distributions, Student and normal distributions, as well as two normal-gamma distributions. Upper and lower bounds on the 1-Wasserstein distance between continuous distributions are provided, and computed for a variety of examples: comparison between different normal distributions (improving on existing bounds in some regimes), posterior distributions with different priors in a Bayesian setting (including logistic regression), centred Azzalini-Dalla Valle distributions. Finally the notion of weak Stein equation and weak Stein factors is introduced, and new bounds are obtained for Lipschitz test functions if the distribution admits a Poincaré constant, which we use to compare in 1-Wasserstein distance between different copulas on the unit square. (arXiv: https://arxiv.org/abs/2101.05079)

- Thursday 27.05.2021, 1pm, Webex
**Louis Gass (IRMAR)**, *A Salem-Zygmund approach to almost sure asymptotics concerning the nodal measure of Riemannian random waves* Video (Password: TknKfmQ4)

**Abstract:** Let be a compact, boundaryless Riemannian manifold, and the sequences of (ordered) Laplace eigenvalues and eigenfunctions, satisfying . We consider the model of Riemannian random waves defined by , where is an iid sequence of Gaussian random variables. With probability one with respect to the Gaussian coefficients, we establish that the process , properly rescaled and evaluated at an independently and uniformly chosen point X on the manifold, converges in distribution towards a universal Gaussian field as grows to infinity. Using the continuity of the nodal measure with respect to the topology, we deduce that almost surely with respect to the Gaussian coefficients, the nodal measure of weakly converges towards the Riemannian volume.

- Thursday 20.05.2021, 1pm, Webex
**Anna Gusakova (Ruhr-Universität Bochum)**, *Random simplicial tessellations: high-dimensional probabilistic behaviour of the typical cells* Video (Password: 3iDKQ6fn)

**Abstract:** A tessellation in is a locally finite collection of convex polytopes, which cover the space and have disjoint interiors. In this talk we consider a few models of random simplicial tessellations, the so-called Delaunay tessellations, whose construction is based on a Poisson point process. Among them is the classical Poisson-Delaunay tessellation.

The main object we are interested in is the typical cell of a random tessellation . Intuitively, one can think of it as a randomly chosen polytope from the collection , assuming that each polytope has the same chance to be chosen. Considering the volume of the typical cells of our models, we derive explicit formulas for the moments as well as a probabilistic representation in terms of independent gamma- and beta-distributed random variables. Moreover, we investigate the limiting probabilistic behaviour of the logarithmic volume of the typical cell when the dimension tends to infinity. In particular we establish a central limit theorem and a large deviation principle.

- Thursday 06.05.2021, 1pm, Webex
**Hugo Vanneuville (Université Grenoble-Alpes)**, *The percolation phase transition of the random plane wave* Video (Password: 7Bj6bgc6)

**Abstract:** Consider the random plane wave , which is a random eigenfunction of the Laplacian in . Given a real number , we study the connectivity properties of the set , and we show that the model undergoes a percolation phase transition at : if then a.s. there is no unbounded connected component in while this is a.s. the case if . As I will explain in the talk, the main difficulty is that the field is not positively correlated. In the talk, I will present the strategy of proof, based on some superconcentration considerations that have enabled us to revisit the following general idea from (Russo, 1982; Talagrand, 1994…): “an event satisfies a phase transition if it depends little on any given point”. This is joint work with Stephen Muirhead and Alejandro Rivera.

- Thursday 22.04.2021, 1pm, Webex
**Vytauté Pilipauskaite (University of Luxembourg)**, *Local scaling limits of Lévy driven fractional random fields* **Abstract:** We obtain all local scaling limits for a class of Lévy driven fractional random fields on . More specifically, the random field is defined as the integral of a non-random function with respect to an infinitely divisible random measure. The scaling procedure involves increments of X over points between which the distance in the horizontal and vertical directions shrinks respectively as and as for a given . We consider two types of increments of : usual increment and rectangular increment. We call their local scaling limits respectively -tangent and -rectangent random fields. We show that for above both types of local scaling limits exist for any and undergo a transition at some . We also discuss properties of these limits. This is joint work with Donatas Surgailis (Vilnius University, Lithuania).

- Thursday 18.03.2021, 1pm, Webex
**Etienne Roquain (LPSM)**, *False Discovery Rate control with unknown null distribution* **Abstract:** Classical multiple testing theory prescribes the null distribution, which is often too stringent an assumption for today's large-scale experiments. This paper presents theoretical foundations to understand the limitations caused by ignoring the null distribution, and how it can be properly learned from the (same) data-set, when possible. While an oracle procedure in that case is the Benjamini-Hochberg procedure applied with the true (unknown) null distribution, we pursue the aim of building a procedure that asymptotically mimics the performance of the oracle (AMO in short). For a Gaussian null, our main result states that an AMO procedure exists if and only if the sparsity parameter k (number of false nulls) is of order less than , where n is the total number of tests.

This is joint work with Nicolas Verzelen, https://arxiv.org/abs/1912.03109.

- Thursday 11.03.2021, 1pm, Webex
**Chiara Amorino (University of Luxembourg)**, *Rate of estimation for the stationary distribution of jump-processes over anisotropic Hölder classes* **Abstract:** We consider the solution of a multivariate stochastic differential equation with Lévy-type jumps and with unique invariant probability measure with density . We assume that a continuous record of observations is available.

In the case without jumps, under respectively isotropic and anisotropic Hölder smoothness constraints, Dalalyan and Reiss [1] and Strauch [2] have found convergence rates of invariant density estimators which are considerably faster than those known from standard multivariate density estimation.

We extend the previous works by obtaining, in the presence of jumps and in an anisotropic context, some estimators which achieve the continuous convergence rates in mono and bi-dimensional cases and a convergence rate faster than the ones found by Strauch [2] for .

Moreover, we obtain a minimax lower bound on the risk for pointwise estimation, with the same rate up to a term. It implies that, on a class of diffusions whose invariant density belongs to the anisotropic Hölder class we are considering, it is impossible to find an estimator with a rate of estimation faster than the one we propose.

**References**

[1] Dalalyan, A. and Reiss, M. (2007). Asymptotic statistical equivalence for ergodic diffusions: the multidimensional case. Probab. Theory Relat. Fields, 137(1), 25-47.

[2] Strauch, C. (2018). Adaptive invariant density estimation for ergodic diffusions over anisotropic classes. The Annals of Statistics, 46(6B), 3451-3480.

- Thursday 25.02.2021, 2pm, TBA
**Richard Samworth (University of Cambridge)**, *USP: an independence test that improves on Pearson’s chi-squared and the G-test* **Abstract:** We introduce the U-Statistic Permutation (USP) test of independence in the context of discrete data displayed in a contingency table. Either Pearson’s chi-squared test of independence or the Generalised Likelihood Ratio test (G-test) is typically used for this task, but we argue that these tests have serious deficiencies, both in terms of their inability to control the size of the test, and their power properties. By contrast, the USP test is guaranteed to control the size of the test at the nominal level for all sample sizes, has no issues with small (or zero) cell counts, and is able to detect distributions that violate independence in only a minimal way. The test statistic is derived from a U-statistic estimator of a natural population measure of dependence, and we prove that this is the unique minimum variance unbiased estimator of this population quantity.

In the last one-third of the talk, I will show how this is a special case of a much more general methodology and theory for independence testing.