This summer school is dedicated to geometrical tools for data analysis, with a special focus on applications in computer science and astrophysics.

The summer school is composed of five lectures and three talks. Each lecture (2 × 1h30) will be complemented by 2 hours of tutorials. The school is primarily intended for students of the graduate program "Mathematics and interactions: research and interactions" of the University of Strasbourg, but it is also open to any interested PhD student or researcher. As space is limited, priority will be given to PhD students. This event is supported by the Interdisciplinary Thematic Institute IRMIA++.

It will take place from 28 August to 1 September in the IRMA conference room at the University of Strasbourg.

### Lectures

#### Optimal transport for data analysis, Laetitia Chapel (Université de Bretagne Sud)

#### Shape analysis, Joan Glaunes (Université Paris Cité)

In these lectures, we will review some classical and more recent techniques for data analysis on populations of shapes. Here the term "shape" is very general and may refer to several types of data arising from real applications: images, curves, surfaces, point clouds, etc. The common assumption is that such objects can be "deformed", and that geometric deformations can partly or completely characterize differences between shapes in the population under study. I will first present some classical tools: notions of shape space theory, the diffeomorphic framework for shape analysis, and kernel metrics over shapes. I will also present deep-learning approaches to shapes, such as residual networks, which have strong links with the diffeomorphic framework, and techniques from geometric deep learning, which are also linked to the classical kernel metrics.

#### Topological methods for astrophysical data, Katarina Kraljic (Observatoire de Strasbourg)

Like many other fields, astronomy has become very data-rich, driven by advances in telescopes, detectors, and computer technology. Astronomical data is extremely complex, heterogeneous, and multidimensional, in both content and format, and includes imaging data, spectra, derived catalogs, and numerical simulations. In these lectures, I will present two mathematical approaches for extracting the morphological information of the underlying field. The first one is Minkowski functionals, a complete family of morphological measures describing the content, shape, and connectivity of the field. The second one relies on discrete Morse theory and enables a coherent multi-scale identification of different types of astrophysical structures. The main purpose of this algorithm and its implementation, known as DisPerSE (Discrete Persistent Structures Extractor), is to identify persistent topological features such as peaks, voids, walls, and filaments within sampled distributions of matter in two or three dimensions. Visualization is an equally important aspect of data manipulation. The tutorial session will be dedicated to ParaView, an open-source post-processing visualization engine, and to the visualization of the data and outcomes related to the above-mentioned methods.
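As a rough illustration of the Minkowski functionals mentioned above, the following sketch computes the three functionals of a 2D binary image (area, perimeter, and Euler characteristic) by treating each pixel as a closed unit square. The function name `minkowski_2d` and the pixel-set representation are assumptions made for this example; it is not the method or code used in the lectures, and conventions for normalisation vary.

```python
from collections import Counter

def minkowski_2d(pixels):
    """Return (area, perimeter, Euler characteristic) of a set of unit pixels.

    Each pixel (x, y) is the closed square [x, x+1] x [y, y+1], so pixels
    that touch at a corner count as connected.
    """
    pixels = set(pixels)
    edge_count = Counter()  # how many pixels share each unit edge
    vertices = set()        # distinct corner points of the pixel complex
    for x, y in pixels:
        corners = [(x, y), (x + 1, y), (x + 1, y + 1), (x, y + 1)]
        vertices.update(corners)
        for i in range(4):
            edge = frozenset((corners[i], corners[(i + 1) % 4]))
            edge_count[edge] += 1
    area = len(pixels)
    # Boundary edges are those that belong to exactly one pixel.
    perimeter = sum(1 for n in edge_count.values() if n == 1)
    # Euler characteristic of the complex: V - E + F.
    euler = len(vertices) - len(edge_count) + area
    return area, perimeter, euler

# A 3 x 3 block with its central pixel removed: one component, one hole.
ring = {(x, y) for x in range(3) for y in range(3)} - {(1, 1)}
area, perimeter, euler = minkowski_2d(ring)
print(area, perimeter, euler)
```

The Euler characteristic counts connected components minus holes, which is why the ring and a filled block differ in that value even though both form a single connected object.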

#### Nonlinear, Geometric Reduced Models for Forward and Inverse Problems, Olga Mula (TU Eindhoven)

This course will discuss the role of nonlinear approximation and geometry in addressing bottlenecks in reduced-order modelling when applied to forward and inverse problems.

#### Geometric transformations on digital images, Phuc Ngo (Université de Nancy)

Many applications in image processing, computer vision, and data augmentation for deep learning require applying geometric transformations to images. Despite the discrete nature of digital images, these transformations are often applied as continuous transformations, followed by discretization (usually involving sampling and interpolation). However, interpolation paradigms can alter the nature of the image and lead to unwanted visual biases. As for sampling paradigms, they can cause geometric approximation errors. Other issues concern the topological properties of observed objects, whose preservation is often desirable in image analysis. The objective of this course is first to present the different issues that can arise when applying geometric transformations to digital images. Then, several solutions proposed in the field of digital geometry will be presented, dealing mainly with the problems of topological and geometric preservation in transformed images. For this, different notions and tools will be introduced, such as digital convexity, topological invariants, regularity, etc. Examples in 2D and 3D will be given during the course to illustrate these problems as well as the proposed solutions in the context of digital image processing and analysis.
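One of the sampling issues described above can be seen in a minimal sketch: rotating integer pixel coordinates as a continuous transformation and then rounding back to the grid generally does not give a one-to-one map. The helper `digitized_rotation` and the 21 × 21 test block are assumptions made for this illustration, not material from the course.

```python
import math

def digitized_rotation(points, angle_deg):
    """Rotate integer points about the origin, then round to the nearest grid point."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [(round(c * x - s * y), round(s * x + c * y)) for x, y in points]

# A 21 x 21 block of pixels centred on the origin.
square = [(x, y) for x in range(-10, 11) for y in range(-10, 11)]
rotated = digitized_rotation(square, 30.0)

n_source = len(square)
n_distinct = len(set(rotated))
# Source pixels that landed on an already occupied target pixel.
n_collisions = n_source - n_distinct
print(f"{n_source} source pixels -> {n_distinct} distinct target pixels "
      f"({n_collisions} collisions)")
```

For example, the source pixels (3, 4) and (4, 4) both land on (1, 5) at 30 degrees, so information is lost; by contrast, a rotation by 90 degrees maps the grid onto itself without collisions.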

### Talks

#### Coverage of astronomical datasets, Sébastien Derriere (Observatoire de Strasbourg)

Each astronomical observation represents a sampling of a multi-dimensional space. For example, an astronomical image corresponds to photons detected in a field of view that covers only a fraction of the full sky: this is a sampling in a 2-dimensional spherical space. But an observation is also limited in time (the duration of the observation), along the photon wavelength or frequency axis, etc. Representing the coverage of astronomical datasets becomes quite challenging when dealing with surveys made of thousands or millions of individual images. Yet astronomers need to describe and combine different datasets, for example to find intersections between different projects. The Hierarchical Progressive Survey (HiPS) is a Virtual Observatory (VO) standard used to represent astronomical datasets, relying on a spatial partitioning of the sphere named HEALPix. A by-product of HiPS is the Multi-Order Coverage map (MOC), which has also been standardized by the VO in order to provide a simple description of the spatial coverage of datasets. I will present how spatial MOCs can be used to describe the coverage of image surveys and catalogues of point sources in astronomy, how the MOC concept is being generalized to describe the temporal and spectral coverage of datasets, and how tools like Aladin can leverage these standards to solve problems that would otherwise be very hard to address.
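The idea of describing coverage with hierarchical cells can be sketched with a toy model. It relies only on two properties of the HEALPix nested numbering: order k has 12 · 4^k cells, and cell i at order k has children 4i to 4i+3 at order k+1. The function names, the `(order, ipix)` set representation, and the two example surveys are hypothetical; this is not the MOC serialization nor the API of any existing library.

```python
def cells_at_order(moc, order):
    """Expand a set of (order, ipix) cells into the set of ipix at a finer common order."""
    flat = set()
    for k, ipix in moc:
        if k > order:
            raise ValueError("target order must be at least the deepest cell order")
        shift = 2 * (order - k)  # each extra order splits a cell into 4 children
        first = ipix << shift
        flat.update(range(first, first + (1 << shift)))
    return flat

def intersection(moc_a, moc_b):
    """Return (common cells, order) covered by both coverage maps."""
    order = max(k for k, _ in moc_a | moc_b)
    return cells_at_order(moc_a, order) & cells_at_order(moc_b, order), order

def sky_fraction(cells, order):
    """Fraction of the sphere covered by equal-area cells at the given order."""
    return len(cells) / (12 * 4 ** order)

# Two made-up surveys, each described by cells of mixed orders.
survey_a = {(0, 3), (1, 20)}
survey_b = {(1, 13), (1, 14), (2, 80)}

common, order = intersection(survey_a, survey_b)
print(sorted(common), order, sky_fraction(common, order))
```

Because all cells at a given order have equal area, the overlap between two surveys reduces to a set intersection on integers, which is what makes this kind of description cheap to combine across many datasets.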

#### Multi-scale modeling of natural images with stochastic geometry, Sixin Zhang (Université de Toulouse)