# Probability & Statistics Seminar

- Thursday 05.10.2023, 1pm, MNO 1.020

**Paul Doukhan (Cergy Paris Université)**,*Discrete-time trawls*

**Abstract**: In a collaborative work with Adam Jakubowski, Silvia Lopes and Surgailis (SPA 2019), we introduce a possibly integer-valued stationary time series model with original properties. On the one hand, these models may have moments of all orders together with a long-range dependence property. On the other hand, they specialize the models introduced by Barndorff-Nielsen, Lunde, Shephard and Veraart (Scandinavian Journal of Statistics 2014) to the case of discrete time; their renormalized partial sums may have a stable limit, contrary to what was announced by these authors.

With François Roueff and Joseph Rynkiewicz (EJS 2020) we prove the consistency of the parametric estimation of these models and show a central limit theorem which also seems contradictory for these popular Ambit-type models.

- Thursday 19.10.2023, 1pm, MNO 1.020

**Jean Jacod (University Paris VI)**,*Systematic Jump Risk*

**Abstract**: (joint with Huidi Lin and Viktor Todorov) In a factor model for a large panel of N asset prices, a random time S is called a “systematic jump time” if it is not a jump time of any of the factors, but nevertheless is a jump time for a significant number of prices: one might for example think that those S’s are jump times of some hidden or unspecified factors. Our aim is to test whether such systematic jumps exist and, if they do, to estimate a suitably defined “aggregated measure” of their sizes. The setting is the usual high frequency setting with a finite time horizon T and observations of all prices and factors at the times iT/n for i = 0, …, n. We suppose that both n and N are large, and the asymptotic results (including feasible estimation of the above aggregate measure) are given when both go to ∞, without imposing restrictions on their relative size.

- Thursday 26.10.2023, 1pm, MNO 1.020

**Francesco Lagona (University of Roma III)**,*Integrating directional statistics and survival analysis: hidden semi-Markov models for toroidal data*

**Abstract**: A nonhomogeneous hidden semi-Markov model is proposed to segment toroidal time series according to a finite number of latent regimes and, simultaneously, estimate the influence of time-varying covariates on the process’ survival under each regime. The model is a mixture of toroidal densities, whose parameters depend on the evolution of a semi-Markov chain, which is in turn modulated by time-varying covariates through a proportional hazards assumption. Parameter estimates are obtained using an EM algorithm that relies on an efficient augmentation of the latent process. The proposal is illustrated on an environmental time series of wind and wave directions recorded during winter.

- Thursday 9.11.2023, 1pm, MNO 1.020

**Raphaël Mignot (Université de Lorraine)**,*Analyzing time series, a new approach with the signature method.*

**Abstract**: In order to analyze multivariate time series (or any kind of ordered data), we can encode them with integrals of various moment orders, constituting their signature. Those features can be used in various Machine Learning tasks, as a compressed substitute of the raw time series, extracting only essential characteristics in the data.

In September in Metz, some of you might have seen my presentation at the Lorraine-Luxembourg workshop: it dealt with barycenters in manifolds and in particular in the Lie group on which the signature features lie. Today, I will take a more general and practical point of view on the signature. I will give some insights into why this method has been used in very different applications and why people have taken an interest in it.

- Thursday 16.11.2023, 1pm, MNO 1.020

**Bruno Ebner (Karlsruhe Institute of Technology)**,*On goodness-of-fit tests for families of distributions based on Stein's method*

**Abstract**: At the heart of Stein's method lie characterizations of target distributions based on some so-called Stein operators and Stein classes of functions. We present a general method to construct new types of goodness-of-fit tests for (multivariate) families of distributions based on Stein characterizations and different choices of classes of functions. These new methods provide accessibility to testing problems that are considered intractable by classical methods. Properties of the new tests such as limit null distributions, consistency statements and the behavior under contiguous and fixed alternatives are derived for several special cases including testing (multivariate) normality, testing for the gamma-, Weibull-, Gompertz-, Cauchy- as well as Dickman- or compound Poisson-distributions. Monte Carlo simulations show that the new procedures are competitive with classical procedures when comparable.

- Thursday 23.11.2023, 1pm, MNO 1.020

**Önder Askin (University of Bochum)**,*Quantifying Differential Privacy in Black-Box Settings*

**Abstract**: Differential Privacy (DP) has emerged as a popular notion to assess and mitigate the privacy leakage of algorithms that release data. Traditionally, the development of privacy preserving algorithms within the DP framework relies on formal proofs prior to implementation. Yet, the adoption of DP in recent years has fostered interest in validation methods that can check the privacy claims of a given algorithm (retrospectively). In this talk, we discuss approaches that aim to assess the privacy guarantees of algorithms in black-box settings and we outline how statistical methods can help us infer the level of DP afforded by these algorithms.

- Thursday 30.11.2023, 1pm, MNO 1.020

**Johanna Ziegel (University of Bern)**,*Isotonic distributional regression and CRPS decompositions*

**Abstract**: Isotonic distributional regression (IDR) is a nonparametric distributional regression approach under a monotonicity constraint. It has found application as a generic method for uncertainty quantification, in statistical postprocessing of weather forecasts, and in distributional single index models. IDR has favorable calibration and optimality properties in finite samples. Furthermore, it has an interesting population counterpart called isotonic conditional laws that generalize conditional distributions with respect to $\sigma$-algebras to conditional distributions with respect to $\sigma$-lattices. In this talk, an overview of the theory is presented. Furthermore, it is shown how IDR can be used to decompose the mean CRPS for assessing the predictive performance of models with regard to their calibration and discrimination ability.

- Thursday 7.12.2023, 1pm, MNO 1.020

**Joseph Yukich (Lehigh University)**,*Gaussian fluctuations for dynamic spatial random models*

**Abstract**: We establish Gaussian fluctuations for statistics of spatial random models evolving over a time domain and which are asymptotically de-correlated over spatial domains. The three sources of model randomness are given by the collection of random particle locations, their random initial states, and the system evolution given by the collection of time-evolving marks at the particle locations. When the spatial domain increases up to R^d, we establish the limit theory for statistics of these models under spatial mixing conditions on both the particle locations and their random initial states, together with a spatial localization criterion on the marks. This gives asymptotic normality for continuum versions of interacting diffusion models, interacting particle systems, and some spin models. The talk is based on joint work with B. Blaszczyszyn and D. Yogeshwaran.

- Thursday 14.12.2023, 1pm, MNO 1.040

**Robert Baumgarth (University of Leipzig)**,*Exponential integrability, concentration inequalities and exit times of diffusions on evolving manifolds*

**Abstract**: We derive moment estimates, exponential integrability, concentration inequalities and exit times estimates for (possibly non-symmetric) diffusions on evolving Riemannian manifolds, more precisely, diffusion processes endowed with a family of time-dependent Riemannian metrics on a smooth (not necessarily compact) Riemannian manifold.

- Thursday 21.12.2023, 1pm, MNO 1.020

**Lihu Xu (University of Macau)**,*Comparison of stochastic algorithms with stochastic differential equations*

**Abstract**: Many stochastic algorithms in machine learning such as Langevin sampling can be approximated by stochastic differential equations (SDEs). In this talk, we will review our recent work in this direction, which includes (i) estimating the error between unadjusted Langevin sampling and an SDE and (ii) estimating the error between unadjusted Hamilton sampling and an SDE.

- Thursday 11.1.2024, 1pm, MNO 1.020

**Léo Mathis (University of Frankfurt)**,*The zonoid algebra and random determinants*

**Abstract**: Zonoids are a particular family of convex bodies (convex compact subsets of R^n) and, as such, come with a natural additive structure: the Minkowski sum. In a recent joint work with Paul Breiding, Peter Bürgisser and Antonio Lerario we uncovered a recipe (the fundamental theorem of zonoid calculus) to build a multiplicative structure and construct the zonoid algebra. In my talk I will introduce all the objects mentioned above and will then explain how this applies to the computation of expected absolute random determinants, generalizing a theorem by Richard A. Vitale from the 90s. If time allows I will show how this further applies to the study of zeroes of random fields in a joint work with Michele Stecconi.

- Thursday 18.1.2024, 1pm, MNO 1.020

**Anna Paola Todino (Università del Piemonte Orientale)**,*Laguerre Expansion for Nodal Volumes*

**Abstract**: We investigate the nodal volume of random hyperspherical harmonics on the d-dimensional unit sphere. We exploit an orthogonal expansion in terms of Laguerre polynomials; this representation entails a drastic reduction in the computational complexity and allows us to prove isotropy for chaotic components, an issue which was left open in the previous literature. As a further application, we obtain an upper bound (that we conjecture to be sharp) for the asymptotic variance, in the high-frequency limit, of the nodal volume for d>2. This result shows that the so-called Berry’s cancellation phenomenon holds in any dimension: namely, the nodal variance is one order of magnitude smaller than the variance of the volume of level sets at any non-zero threshold, in the high-energy limit. Joint work with Domenico Marinucci and Maurizia Rossi.

- Thursday 1.2.2024, 1pm, MNO 1.040

**Antoine Jego (EPFL Lausanne)**,*Thick points of 4d critical branching Brownian motion*

**Abstract**: I will describe a recent work in which we prove that branching Brownian motion in dimension four is governed by a nontrivial multifractal geometry and compute the associated exponents. As a part of this, we establish very precise estimates on the probability that a ball is hit by an unusually large number of particles, sharpening earlier works by Angel, Hutchcroft, and Jarai (2020) and Asselah and Schapira (2022) and allowing us to compute the Hausdorff dimension of the set of “a-thick” points for each a > 0. Surprisingly, we find that the exponent for the probability of a unit ball to be “a-thick” has a phase transition where it is differentiable but not twice differentiable at a = 2, while the dimension of the set of thick points is positive until a = 4. If time permits, I will also discuss a new strong coupling theorem for branching random walk that allows us to prove analogues of some of our results in the discrete case. Joint work with Nathanael Berestycki and Tom Hutchcroft.

- Thursday 8.2.2024, 1pm, MNO 1.020

**Gérard Biau (Sorbonne Université)**,*Deep residual networks and differential equations*

**Abstract**: Deep learning has become a prominent approach for many applications, such as computer vision or natural language processing. However, the mathematical understanding of these methods is still incomplete. A recent approach is to consider neural networks as discretized versions of differential equations. I will first give an overview of this emerging field and then discuss new results on residual neural networks, which are state-of-the-art deep learning models.

- Thursday 15.2.2024, 1pm, MNO 1.020

**Angelika Rohde (University of Freiburg)**,*Bootstrapping high-dimensional sample covariance matrices*

**Abstract**: Bootstrapping is the classical approach for distributional approximation of estimators and test statistics when an asymptotic distribution contains unknown quantities or provides a poor approximation quality. For the analysis of massive data, however, the bootstrap is computationally intractable in its basic sampling-with-replacement version. Moreover, it is not even valid in some important high-dimensional applications. Combining subsampling of observations with suitable selection of their coordinates, we introduce a new “$(m,mp/n)$ out of $(n,p)$”-sampling with replacement bootstrap for eigenvalue statistics of high-dimensional sample covariance matrices based on $n$ independent $p$-dimensional random vectors. In the high-dimensional scenario $p/n\rightarrow c\in [0,\infty)$, this fully nonparametric bootstrap is shown to consistently reproduce the underlying spectral measure if $m/n\rightarrow 0$. If $m^2/n\rightarrow 0$, it approximates correctly the distribution of linear spectral statistics. The crucial component is a suitably defined representative subpopulation condition which is shown to be verified in a large variety of situations. The proofs incorporate several delicate technical results which may be of independent interest.

- Monday 19.2.2024, 1pm, MNO 1.020

**Valentin Garino (Uppsala University)**,*Approximation of stochastic integrals driven by fractional Brownian motion, with discontinuous integrands*

**Abstract**: We are concerned with the approximation error of a class of stochastic integrals driven by a fractional Brownian motion with Hurst index $H>\frac{1}{2}$. In the case where the integrand verifies some adequate regularity properties, the scaling and limit behavior of the error are already relatively well understood. However, when the integrand is a discontinuous function of the fractional Brownian motion, classic tools from Young theory and Malliavin calculus no longer apply.

In this talk, we will address this issue thanks to a fine analysis of the covariance function of the increments of the fractional Brownian motion, thus obtaining first and second order rates of convergence, as well as a limit for the error involving the local time of the fractional Brownian motion.

- Thursday 22.2.2024, 1pm, MNO 1.020

**Stéphane Robin (Sorbonne Université)**,*Change-point detection in a Poisson process*

**Abstract**: Change-point detection aims at discovering behavior changes lying behind time-sequence data. In this paper, we investigate the case where the data come from an inhomogeneous Poisson process or a marked Poisson process. We present an offline multiple change-point detection methodology based on a minimum contrast estimator. In particular we explain how to deal with the continuous nature of the process together with the discrete available observations. Besides, we select the appropriate number of regimes through a cross-validation procedure which is really convenient here due to the nature of the Poisson process. Through experiments on simulated and real-world datasets, we show the interest of the proposed method, which is implemented in the CptPointProcess R package.

- Thursday 29.2.2024, 8:50, MSA 3.160

**Workshop**: *Time series and Directional Statistics*

**Organizers: Masanobu Taniguchi, Christophe Ley**

- Thursday 7.3.2024, 1pm, MNO 1.020

**Vladimir Spokoiny (Humboldt University of Berlin)**,*Inference for nonlinear inverse problems*

**Abstract**: Assume that a solution to a nonlinear inverse problem given, e.g., by a PDE is observed with noise. The target of analysis is typically a set of model parameters describing the corresponding forward operator and the corresponding denoised solution. The classical least squares approach faces several challenges and obstacles for theoretical study and numerically efficient implementation, especially if the parameter space is large and the observation noise is not negligible. We propose a new approach that provides rather precise finite sample results about the accuracy of estimation and quantification of uncertainty and allows us to avoid any stability analysis of the inverse operator and advanced results from empirical processes theory. The approach is based on extending the parameter space by introducing a set of «observables» and careful treatment of the arising semiparametric problem.

- Monday 11.3.2024, 1pm, MNO 1.020

**Masahisa Ebina (Kyoto University)**,*Spatial average of stochastic wave equations*

**Abstract**: This talk considers a spatial average of the solution to stochastic wave equations. We will deal with the average over a Euclidean ball and focus on several limit theorems when letting the radius of the ball go to infinity, including the law of large numbers, central limit theorems, and some large deviation results. We will discuss how the tools of the Malliavin calculus can be applied to show these results.

- Monday 18.3.2024, 1pm, MNO 1.040

**Kumar Venayagamoorthy (Clemson University)**,*Intelligent Data Analytics and Decision-Making using Artificial Intelligence for Smart Grid Operations and Control*

**Abstract**: Data is one of the most valuable assets for the electricity generation and delivery industry. It is known today that businesses increasingly prefer data-driven decision-making to intuition-based decision-making, which probably accounts for why the data analytics market is growing at a compound annual rate of nearly 30%. It is challenging to analyze oceans of unstructured data. Artificial intelligence (AI) and machine learning (ML, a subset of AI) technologies will allow businesses to analyze these unstructured data in a smarter and faster way. These technologies can also discover patterns and trends in structured data that are not easily observable. Furthermore, the volume of this data is so vast it causes a major strain on traditional (including AI/ML) models of computing where everything is controlled and analyzed centrally. New frameworks and methodologies are needed to turn data into insights, technologies into strategy, and opportunities into value and responsibility, and bring micro-analytics closer to the end-customer. In addition, predictive and prescriptive analytics should be adaptive, catering for decision-making based on real-time data with an extremely high degree of accuracy. In short, intelligent data analytics and decision-making is the new oil, but one needs a powerful engine to extract, refine and harness it efficiently. This seminar will present a distributed computational framework (/engine) for intelligent data analytics and decision-making, known as the cellular computational network (CCN). Several case studies of predictive and/or prescriptive analytics with CCNs in smart grid operations and management will be presented.

- Thursday 21.3.2024, 1pm, MNO 1.020

**Oliver Feng (University of Bath)**,*Convex loss selection via score matching*

**Abstract**: In the context of linear regression, we construct a data-driven convex loss function with respect to which empirical risk minimisation yields optimal asymptotic variance in the downstream estimation of the regression coefficients. Our semiparametric approach targets the best decreasing approximation of the derivative of the log-density of the noise distribution. At the population level, this fitting process is a nonparametric extension of score matching, corresponding to a log-concave projection of the noise distribution with respect to the Fisher divergence. The procedure is computationally efficient, and we prove guarantees on its asymptotic relative efficiency compared with an oracle procedure that has knowledge of the error distribution. As an example of a highly non-log-concave setting, for Cauchy errors, the optimal convex loss function is Huber-like, and yields an asymptotic relative efficiency greater than 0.87; in this sense, we obtain robustness without sacrificing (much) efficiency. Numerical experiments on simulated and real data confirm the practical merits of our proposal.

- Thursday 28.3.2024, 1pm, MNO 1.020

**Masanobu Taniguchi (Waseda University)**,*Statistical Estimation of Optimal Portfolios for Dependent Returns*

**Abstract**: The field of financial engineering has developed over recent decades as a huge integration of economics, probability theory, statistics, etc. The composition of a portfolio is one of the most fundamental and important methods of financial engineering for controlling the risk of investments. This talk provides a comprehensive development of statistical inference for portfolios and its applications. Historically, Markowitz contributed to the advancement of modern portfolio theory, laying the foundation for the diversification of investment portfolios. His approach is called the mean-variance portfolio, which maximizes the mean of the portfolio return while reducing its variance (the risk of the portfolio). The mean-variance portfolio coefficients are expressed as a function of the mean and variance matrix of the return process. Optimal portfolio coefficients based on the mean and variance matrix have been derived under various criteria. Assuming that the return process is i.i.d. Gaussian, Jobson and Korkie (1980) proposed an estimator of the optimal portfolio coefficients by taking the sample version of the mean-variance portfolio. However, empirical studies show that observed stretches of financial returns are often non-Gaussian and dependent. In this situation, it is shown that portfolio estimators of the mean-variance type are generally not asymptotically efficient even if the process is Gaussian, which gives a strong warning against the use of the usual portfolio estimators. We also provide a necessary and sufficient condition for the estimators to be asymptotically efficient in terms of the spectral density matrix of the return. This motivates the fundamental issue of the talk. Hence we will provide modern statistical techniques for the problems of portfolio estimation, treating them as optimal statistical inference for various return processes.

We will introduce a variety of stochastic processes, e.g., non-Gaussian stationary processes, non-linear processes, non-stationary processes, etc. For them we will develop modern statistical inference by use of local asymptotic normality (LAN), which is due to Le Cam. The approach is a unified and very general one. Based on this we address many important problems for portfolios. Joint work with H. Shiraishi.

- Thursday 11.4.2024, 1pm, MNO 1.020 (Online)

**Ulrike Genschel (Iowa State University)**,*A Modified t-test for Treatment Means in Unreplicated Classroom Comparisons*

**Abstract**: Discipline-based education research (DBER), with a focus on evidence-based teaching, has grown immensely over the last decades. A common interest in DBER studies is identifying superior pedagogical approaches using rigorous and scientific methodology. Researchers may have few classrooms available when comparing classroom-level treatments or conditions so that one classroom per treatment is not uncommon in many DBER studies. Because data and analysis options are then limited, an approach often seen in the DBER literature is to compare treatment means with a two-sample t-test applied to student-level responses from each classroom. This strategy, however, carries particular risks for statistical inference, where p-values can be misleading to an extent that is often under-appreciated and also much worse than possibly overstating practical significance. We demonstrate that, even in the absence of any treatment difference, a mathematical guarantee exists that the p-value from a standard two-sample t-test applied to student-level responses in this setting can be made arbitrarily close to zero with probability 1, simply as an artifact of sufficient student enrollment. Existing options to remedy the t-test, as we review, are typically intractable. As a more reasonable assessment of evidence, we propose a modified two-sample t-test for comparing treatment means, which involves a smoothing step to account for classroom-level experimental error rather than ignoring this and possible correlations among student responses. Our numerical studies show that the modified t-test performs better than the standard t-test in controlling false rejection rates. The method is also illustrated with applications to several real data sets from educational studies.

- Thursday 18.4.2024, 1pm, MNO 1.020

**Roland Speicher (Saarland University)**,*TBA*

**Abstract**: TBA

- Monday 22.4.2024, 1pm, MNO 1.040

**Ronan Herry (Université Rennes 1)**,*TBA*

**Abstract**: TBA

- Thursday 25.4.2024

**Special lunch event to promote our master programs**

- Monday 29.4.2024, 1pm, MNO 1.020

**Johannes Schmidt-Hieber (University of Twente)**,*TBA*

**Abstract**: TBA

- Thursday 02.5.2024, 1pm, MNO 1.020

**Raphaël Lachièze-Rey (Université Paris Cité)**,*TBA*

**Abstract**: TBA

- Thursday 9.5.2024

**Public holiday**

- Monday 13.05.2024, 1pm, MNO 1.020

**Michele Ancona (Université Côte d’Azur)**,*TBA*

**Abstract**: TBA

- Thursday 16.05.2024, 1pm, MNO 1.020

**Maximilian Nitzschner (Hong Kong University of Science and Technology)**,*TBA*

**Abstract**: TBA

- Thursday 23.5.2024, 1pm, MNO 1.020

**Andrea Meilan (Universidad Carlos III de Madrid)**,*TBA*

**Abstract**: TBA

- Thursday 30.5.2024, 1pm, MNO 1.020

**Johan Segers (UCLouvain)**,*TBA*

**Abstract**: TBA

- Thursday 6.6.2024, 1pm, MNO 1.020

**Morine Delhelle (Université Catholique de Louvain-la-Neuve)**,*TBA*

**Abstract**: TBA