Speakers

Sofia Olhede

EPFL

Graph Limit Models and Beyond

Probabilistic distributional invariances are key to characterising stochastic processes of which we observe only a single realisation. I will discuss how permutation invariance of random arrays lets us understand the nature of such objects. In that setting, any estimation procedure is naturally linked to grouping the rows and columns of such arrays for variance reduction, and to estimating an underlying limit object (a graph limit) nonparametrically. Most real-world objects have additional features beyond a distributional permutation invariance, which requires us to extend any grouping to incorporate either additional features in the model specification or forms of repeated observations. Moreover, estimation based solely on grouping columns or rows often does not fully exploit the underlying patterns of observations. I will describe how further variance reduction can be achieved by grouping entries of the array, and not just its columns and rows, a task fundamentally challenged by the inherent noisiness of Bernoulli random variables and the lack of repeated observations.
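
To make the row/column-grouping idea concrete, here is a minimal sketch of a blockmodel-style histogram estimator of a graph limit from a single observed adjacency matrix. The degree-sorting heuristic, block count, and all names below are illustrative assumptions, not the specific procedure of the talk.

```python
import numpy as np

def blockmodel_estimate(A, k):
    """Estimate a graph limit from one adjacency matrix A by grouping
    rows/columns into k blocks and averaging the Bernoulli entries
    within each block pair (variance reduction through grouping)."""
    # Heuristic grouping: sort vertices by degree, then cut into k blocks.
    order = np.argsort(A.sum(axis=1))
    blocks = np.array_split(order, k)
    P_hat = np.zeros((k, k))
    for a, Ba in enumerate(blocks):
        for b, Bb in enumerate(blocks):
            sub = A[np.ix_(Ba, Bb)]
            if a == b:
                # Exclude the diagonal (no self-loops) when averaging.
                m = len(Ba)
                P_hat[a, b] = (sub.sum() - np.trace(sub)) / max(m * (m - 1), 1)
            else:
                P_hat[a, b] = sub.mean()
    return P_hat, blocks

# Usage: simulate a 2-block stochastic block model from one realisation.
rng = np.random.default_rng(0)
n, P = 200, np.array([[0.6, 0.1], [0.1, 0.4]])
z = rng.integers(0, 2, n)
A = (rng.random((n, n)) < P[np.ix_(z, z)]).astype(int)
A = np.triu(A, 1)
A = A + A.T                     # symmetrise, no self-loops
P_hat, _ = blockmodel_estimate(A, 2)
print(np.round(P_hat, 2))       # approximately recovers P (up to block order)
```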

Philippe Rigollet

MIT

Statistics and artificial intelligence

Recent advances in artificial intelligence, particularly in computer vision and natural language processing, have propelled AI to the forefront of scientific and technological demands. The field of statistics is poised to meet these demands, yet the classical paradigms of statistical analysis often fall short in addressing the complexities introduced by modern AI systems. In this talk, we will explore how the landscape of nonparametric statistics is evolving in response to these challenges, focusing in particular on the replacement of traditional structures such as smoothness and sparsity.
A key part of the discussion will focus on transformers, a groundbreaking architecture that has revolutionized many AI applications. We will examine the novel statistical questions and considerations that arise from the use of transformers, and how statisticians can contribute to the ongoing development and refinement of these AI technologies.
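
As a concrete reference point for the architecture discussed here, the following is a minimal sketch of a single self-attention head, the core operation of a transformer, in plain NumPy. The dimensions and weights are illustrative assumptions, not those of any particular model.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention: each of the n tokens in X (n x d)
    attends to every token, producing a weighted average of values."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # (n, n) pairwise interactions
    weights = softmax(scores, axis=-1)        # each row sums to one
    return weights @ V                        # (n, d_v) contextualised tokens

# Toy usage: 5 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(1)
n, d, d_k = 5, 8, 4
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d_k)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # (5, 4)
```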

Jelena Bradic

UCSD

Dynamic treatment effects: high-dimensional inference under model misspecification

Estimating dynamic treatment effects is essential across various disciplines, offering nuanced insights into the time-dependent causal impact of interventions. However, this estimation presents challenges due to the “curse of dimensionality” and time-varying confounding, which can lead to biased estimates. Additionally, correctly specifying the growing number of treatment-assignment and outcome models with multiple exposures is overly complex. Given these challenges, the concept of double robustness, under which model misspecification is permitted, is extremely valuable, yet it remains unachieved in practical applications. This paper introduces a new approach, proposing novel, robust estimators for both the treatment assignments and the outcome models. We present a “sequential model double robust” solution, demonstrating that double robustness over multiple time points can be achieved when each time exposure is doubly robust. This approach improves the robustness and reliability of dynamic treatment effect estimation, addressing a significant gap in the field.
This is joint work with Yuqian Zhang and Weijie Ji.
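
To make the double-robustness idea concrete, here is a minimal single-time-point AIPW (augmented inverse propensity weighting) sketch: the classical one-exposure building block, not the sequential estimator of the paper. The model choices and data below are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def aipw_ate(X, T, Y):
    """Doubly robust (AIPW) estimate of the average treatment effect:
    consistent if EITHER the outcome models OR the propensity model
    is correctly specified."""
    # Propensity model: P(T = 1 | X).
    e = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]
    # Outcome models: E[Y | X, T = t], fit separately on each arm.
    mu1 = LinearRegression().fit(X[T == 1], Y[T == 1]).predict(X)
    mu0 = LinearRegression().fit(X[T == 0], Y[T == 0]).predict(X)
    # AIPW score: regression predictions plus propensity-weighted residuals.
    psi = (mu1 - mu0
           + T * (Y - mu1) / e
           - (1 - T) * (Y - mu0) / (1 - e))
    return psi.mean()

# Toy usage: true effect is 2.0, treatment confounded by X.
rng = np.random.default_rng(2)
n = 5000
X = rng.standard_normal((n, 3))
T = (rng.random(n) < 1 / (1 + np.exp(-X[:, 0]))).astype(int)
Y = 2.0 * T + X @ np.array([1.0, -0.5, 0.3]) + rng.standard_normal(n)
print(round(aipw_ate(X, T, Y), 2))   # close to 2.0
```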

Ming Yuan

Columbia University

Tensors in High Dimensional Data Analysis: Opportunities and Challenges

Large amounts of multidimensional data, represented by multiway arrays or tensors, are prevalent in modern applications across various fields such as chemometrics, genomics, physics, psychology, and signal processing. The structural complexity of such data provides vast new opportunities for modeling and analysis, but efficiently extracting their information content, both statistically and computationally, presents unique and fundamental challenges. Addressing these challenges requires an interdisciplinary approach that brings together tools and insights from statistics, optimization, and numerical linear algebra, among other fields. Despite these hurdles, significant progress has been made in the last decade. In this talk, I will review some of the key advancements and identify common threads among them, under several common statistical settings.
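
As a small illustration of the kind of low-rank tensor structure at play, here is a truncated higher-order SVD (HOSVD) sketch in NumPy, one standard decomposition in this area. The ranks and data are illustrative assumptions, not drawn from the talk.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: rearrange tensor T into a matrix whose rows
    are indexed by the chosen mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated HOSVD (Tucker decomposition): one factor matrix per
    mode plus a small core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])            # top-r left singular vectors
    core = T
    for mode, U in enumerate(factors):
        # Mode-n product with U^T compresses the tensor along each mode.
        core = np.moveaxis(
            np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

# Usage: a rank-(2,2,2) tensor plus noise, compressed back to (2,2,2).
rng = np.random.default_rng(3)
A, B, C = (rng.standard_normal((20, 2)) for _ in range(3))
T = np.einsum('ir,jr,kr->ijk', A, B, C) + 0.01 * rng.standard_normal((20, 20, 20))
core, factors = hosvd(T, (2, 2, 2))
print(core.shape)   # (2, 2, 2)
```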

Boaz Nadler

Weizmann

Completing large low-rank matrices with only a few observed entries: A one-line algorithm with provable guarantees

Suppose we observe very few entries of a large matrix. Can we predict the missing entries, assuming, say, that the matrix is (approximately) low rank? We describe a very simple method to solve this matrix completion problem. We show that our method is able to recover matrices from very few entries and/or matrices that are ill conditioned, settings where many other popular methods fail. Furthermore, due to its simplicity, the method is easy to extend to incorporate additional knowledge about the underlying matrix, for example to solve the inductive matrix completion problem. On the theoretical front, we prove that our method enjoys some of the strongest available recovery guarantees. Finally, for inductive matrix completion, we prove that under suitable conditions the problem has a benign optimization landscape with no bad local minima.
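
The abstract does not spell out the one-line algorithm, so the sketch below shows a classical baseline for the same problem instead: iterative SVD truncation ("hard impute") for low-rank matrix completion. It illustrates the problem setup only, not the speaker's method; the rank, sampling rate, and iteration count are illustrative.

```python
import numpy as np

def hard_impute(M_obs, mask, rank, n_iter=200):
    """Baseline matrix completion via iterative rank-r projection:
    fill missing entries with the current estimate, truncate the SVD
    to rank r, and repeat."""
    X = np.where(mask, M_obs, 0.0)            # start with zeros when missing
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # rank-r projection
        # Keep observed entries fixed; update only the missing ones.
        X = np.where(mask, M_obs, X_low)
    return X_low

# Usage: a 100 x 100 rank-2 matrix with 30% of entries observed.
rng = np.random.default_rng(4)
A = rng.standard_normal((100, 2)) @ rng.standard_normal((2, 100))
mask = rng.random(A.shape) < 0.3
A_hat = hard_impute(A, mask, rank=2)
print(f"relative error: {np.linalg.norm(A_hat - A) / np.linalg.norm(A):.3f}")
```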