Statistics Seminar

organized by the Statistics team

Julien Gibaud

Identifiability of stochastic state-space models

16 January 2026 - 11:00, IRMA seminar room

State-Space Models (SSMs) are deterministic or stochastic dynamical systems defined by two processes. The state process, which is not observed directly, models the transformation of the system states over time, while the observation process produces the observables on which model fitting and prediction are based. Ecology frequently uses stochastic SSMs to represent the imperfectly observed dynamics of population sizes or animal movement. However, several simulation-based evaluations of model performance suggest broad identifiability issues in ecological SSMs. Formal SSM identifiability is typically investigated using exhaustive summaries, which are simplified representations of the model. The theory of exhaustive summaries is largely based on continuous-time deterministic modelling, and the theory for discrete-time stochastic SSMs has been developed by analogy. While the discreteness of time does not constitute a challenge, finding a good exhaustive summary for a stochastic SSM is more difficult. The strategy adopted so far has been to create exhaustive summaries based on a transfer function of the expectations of the stochastic process. However, this evaluation of identifiability cannot account for the possible dependency between the variance parameters and the process parameters. We show that the output spectral density plays a key role in stochastic SSM identifiability assessment. This allows us to define a new suitable exhaustive summary. Using several ecological examples, we show that usual ecological models are often theoretically identifiable, suggesting that most SSM estimation problems are due to practical rather than theoretical identifiability issues.
Marina Gomtsyan

Variable selection methods in sparse GLARMA models

23 January 2026 - 11:00, IRMA seminar room

We propose novel variable selection methods for sparse GLARMA (Generalised Linear Autoregressive Moving Average) models, which can be used for modelling discrete-valued time series. These models allow us to introduce some dependence in a Generalised Linear Model (GLM). The key idea behind our estimation procedure is first to estimate the coefficients of the ARMA part of the GLARMA model and then to use a regularised approach, namely the Lasso, to estimate the regression coefficients of the GLM part of the model. Furthermore, we establish a sign-consistency result for the estimator of the regression coefficients in a sparse Poisson model without time dependence. The performance of our proposed methods was assessed in simulation studies in different frameworks and on several datasets in the field of molecular biology. Our approaches exhibit very good statistical performance, surpassing other methods in identifying non-null regression coefficients. Moreover, their low computational burden enables their application to relatively large datasets. Our proposed methods are implemented in R packages, which are publicly available on the Comprehensive R Archive Network (CRAN).
Jean-Armel Bra Kouadio

Autoregressive models modulated by a hidden Markov chain with dependent innovations

6 February 2026 - 11:00, IRMA seminar room

This work concerns the estimation and statistical inference of ARHMC (Autoregressive Hidden Markov Chain) time series models with Markov regime switching and dependent innovations (i.e., weak ARHMC(p) models). We develop estimation procedures based on the method of moments and then establish the main asymptotic properties of the proposed estimators. We also pay particular attention to estimating the sandwich-type asymptotic variance matrix. For the weak ARHMC(0) model, we construct portmanteau tests adapted to dependent innovations, which make it possible to test the adequacy of the model and to select the number of regimes. We also address forecasting and the decoding of the hidden chain.
Yiye Jiang

New sampling approaches for Shrinkage Inverse-Wishart distribution

13 February 2026 - 11:00, IRMA seminar room

Covariance estimation has many applications, such as brain connectivity and portfolio allocation, to name a few. Following a Bayesian approach, a typical covariance prior is the Inverse-Wishart distribution. However, a well-known issue of this prior is that it concentrates too little mass over covariance matrices with small eigengaps. To rebalance the mass, Berger et al. (2020, Annals of Statistics) proposed a more generic family, Shrinkage Inverse-Wishart, which thus offers a more flexible prior choice for covariance matrices. However, sampling from it remains challenging. The existing algorithm relies on a nested Gibbs sampler, which is slow and lacks rigorous theoretical convergence analysis. We propose a new algorithm based on the Sampling Importance Resampling method, which is significantly faster and comes with theoretical convergence guarantees. In this talk, we first derive the new sampling algorithm for the Shrinkage Inverse-Wishart distribution. Then we apply it to inference in a Bayesian model of covariance estimation. We show inference results on a real data set of fMRI signals of rats.
Stéphane Lhaut

Statistical Learning of Multivariate Extremes: finite sample analysis of tail risks

20 February 2026 - 11:00, IRMA seminar room

Many problems in statistics (regression, classification, …) can be cast as specific instances of the general problem of minimizing a risk over a given class of functions. In practice, this risk is unknown and has to be estimated based on historical data. Statistical learning theory provides the tools to study the consistency properties of the solution to this empirical risk minimization procedure. In some problems, the tail of the covariate random vector plays a specific role in predicting the outcome. As such, it is of interest to study the tail risk arising from conditioning the risk on the covariate being larger than a threshold and letting this threshold tend to infinity. Making use of standard regular variation assumptions from extreme value analysis leads to the definition of tail risks based on the well-known angular measure for multivariate extremes. Studying the properties of empirical minimizers of such risks deserves special attention, and specific concentration tools have to be developed. We will explore these considerations mainly in the case of (tail) binary classification and propose new results for the general case.
Hugo Henneuse

Multiple Mode Estimation and Persistent Homology

6 March 2026 - 11:00, IRMA seminar room

Detecting and locating the modes of a probability density (i.e., the points where the density attains a local maximum) is a classical problem in nonparametric statistics. Estimation of the global mode, when it is unique, in particular for unimodal densities, has long been the focus of attention, leading both to the design of efficient algorithms and to a precise characterization of minimax rates under various assumptions on the underlying density. The more general problem of estimating the full set of modes is harder. Several approaches have been proposed, notably mean-shift-type methods, which give satisfactory results in practice but whose performance remains poorly understood theoretically. In this talk, we propose an alternative based on a central tool of topological data analysis (TDA): persistent homology and its practical representation via persistence diagrams. We present several results on the consistency of this approach for broad classes of densities that may admit discontinuities (including at the modes), as well as on its optimality in the minimax sense. Beyond mode estimation, we also discuss the problem of estimating persistence diagrams for such densities.
Komlan Noukpoape

To be announced

13 March 2026 - 11:00, IRMA seminar room
Modou Wade

A general framework for deep learning

20 March 2026 - 14:00, IRMA seminar room

This paper develops a general approach to deep learning in a setting that includes nonparametric regression and classification. We develop the framework for data that satisfy a generalized Bernstein-type inequality, which covers independent, ϕ-mixing, strongly mixing, and C-mixing observations. Two estimators are proposed: a non-penalized deep neural network estimator (NPDNN) and a sparse-penalized deep neural network estimator (SPDNN). For each of these estimators, bounds on the expected excess risk over the class of Hölder smooth functions and composition Hölder functions are established. Applications to independent data, as well as to ϕ-mixing, strongly mixing, and C-mixing processes, are considered. For each of these examples, upper bounds on the expected excess risk of the proposed NPDNN and SPDNN predictors are derived. It is shown that both the NPDNN and SPDNN estimators are minimax optimal (up to a logarithmic factor) in many classical settings.
Christelle Agonkoui

Principal Component Analysis of Multivariate Spatial Functional Data

26 March 2026 - 11:00

This work is devoted to the study of dimension reduction techniques for multivariate, spatially indexed functional data defined on different domains. We present a method called Spatial Multivariate Functional Principal Component Analysis (SMFPCA), which performs principal component analysis for multivariate spatial functional data. In contrast to the multivariate Karhunen-Loève approach for independent data, SMFPCA captures spatial dependencies among multiple functions. SMFPCA applies spectral functional component analysis to multivariate functional spatial data, focusing on data points arranged on a regular grid. The methodological framework and algorithm of SMFPCA have been developed to tackle the challenges arising from the lack of appropriate methods for managing this type of data. The finite-sample performance of the proposed method has been assessed on simulated datasets and a sea-surface temperature dataset. Additionally, we conducted comparative studies of SMFPCA against some existing methods, providing valuable insights into the properties of multivariate spatial functional data within a finite sample.
Orlane Rossini

From Impulse Control of PDMPs to Bayesian Adaptive POMDPs: A Reinforcement Learning Approach

27 March 2026 - 11:00, IRMA seminar room

Piecewise Deterministic Markov Processes (PDMPs) constitute a family of Markov processes characterized by deterministic motion interspersed with random jumps. When controlled through discrete-time interventions, this leads to an impulse control problem. In the fully observed setting with known dynamics, we develop a numerical method to compute an optimal strategy. In real-world applications, however, full observability is rarely available. Under partial observation, the impulse control of a PDMP can be reformulated as a Partially Observed Markov Decision Process (POMDP), which we address using deep reinforcement learning techniques. A major limitation of existing approaches is the assumption that the underlying PDMP dynamics are known or can be accurately simulated. This assumption is unrealistic in applications such as patient monitoring, where data may be scarce and disease dynamics may vary across individuals. To address this issue, we introduce a Bayesian Adaptive POMDP (BAPOMDP) framework, in which the unknown PDMP parameters are modeled probabilistically and updated through Bayesian inference. The resulting continuous-state BAPOMDP is solved using deep reinforcement learning methods adapted to high-dimensional belief spaces. This work thus combines stochastic control theory, Bayesian modeling, and deep reinforcement learning to provide a unified framework for decision-making under partial observability and model uncertainty. The proposed methodology is illustrated and validated on a medical application: the adaptive follow-up and monitoring of patients diagnosed with multiple myeloma.
Antoine Heranval

Analyzing temporal dependence between extreme events using point processes

10 April 2026 - 11:00, IRMA seminar room

Extreme meteorological events often occur in complex temporal configurations, where the impacts of one hazard may depend on the prior occurrence of others. Characterising such temporal dependencies is essential for understanding compound climate risks, yet remains challenging due to the discrete, heterogeneous, and clustered nature of extreme events. In this study, we apply temporal point process methods to characterise dependencies among extreme meteorological events occurring within appropriately defined spatial regions across Europe, focusing exclusively on their temporal structure.
We introduce an event-based framework in which extreme events are represented as marked temporal point processes, with marks describing key characteristics such as intensity or duration. Global first- and second-order temporal statistics are used to quantify clustering, co-occurrence, and directional dependencies between different types of extremes. In particular, we rely on directional cross-$K$ functions to assess whether the occurrence of one type of extreme event systematically modifies the short-term probability of subsequent events of another type.
Two complementary applications illustrate different facets of compound event analysis. First, we demonstrate the relevance of the framework for preconditioned compound events through a temporal analysis of wildfire-related meteorological extremes. Second, we examine temporal dependence between extreme precipitation, extreme wind, and extreme atmospheric instability across all European NUTS-2 regions.
Building on these second-order statistics, we develop formal tests of temporal independence to assess the significance of observed directional interactions between different types of extreme events. Overall, this temporal point process framework provides a rigorous and interpretable approach to the analysis of compound and preconditioned climate extremes, with direct applications to climate risk assessment and early-warning systems.
