Joachim Bona-Pellissier : Local Identifiability of Deep ReLU Neural Networks: the Theory
Neural networks admit parameters in the form of weights and biases. Given a parameter choice theta, a classical feedforward network implements a function f_theta. A natural question is that of identifiability: if f_theta = f_theta', does it follow that theta = theta'? The presence or absence of identifiability has diverse theoretical and practical implications, including guarantees on the optimization process, bounds on the complexity of the function implemented by the network and its generalization properties, privacy and protection against attacks, and interpretability of the network.
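To make the question concrete, here is a minimal numerical illustration (not taken from the paper, but a standard observation): ReLU networks always carry a positive rescaling symmetry, so strict identifiability can only hold up to such symmetries.

# Scaling a hidden neuron's incoming weights and bias by c > 0 and its
# outgoing weights by 1/c leaves f_theta unchanged, since relu(c*x) = c*relu(x).
import numpy as np

rng = np.random.default_rng(0)
relu = lambda x: np.maximum(x, 0.0)

# One hidden layer: f_theta(x) = W2 @ relu(W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)

# Rescale hidden neuron 0 by c: theta' != theta, yet f_theta' == f_theta.
c = 3.7
W1s, b1s, W2s = W1.copy(), b1.copy(), W2.copy()
W1s[0] *= c
b1s[0] *= c
W2s[:, 0] /= c

x = rng.normal(size=3)
f  = W2  @ relu(W1  @ x + b1 ) + b2
fs = W2s @ relu(W1s @ x + b1s) + b2
print(np.allclose(f, fs))  # True: identical function, different parameters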
We consider a deep fully-connected feedforward ReLU network, and we study a version of identifiability in which we do not assume full knowledge of f_theta, but only knowledge of f_theta(X) for a finite list X of inputs. In other words, is a sample X rich enough to determine, at least locally, the parameters of the network? To answer this question, we define local lifting operators whose inverses are charts of a smooth manifold in a high-dimensional space. The function implemented by the deep ReLU network is the composition of the local lifting with a linear operator which depends on the sample. From this convenient representation we derive a geometric necessary and sufficient condition for local identifiability. Looking at tangent spaces, the geometric condition yields: (1) a sharp and testable necessary condition of identifiability and (2) a sharp and testable sufficient condition of local identifiability. The validity of the conditions can be tested numerically using backpropagation and matrix rank computations.
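As a rough sketch of the kind of numerical test alluded to above (the exact conditions are those derived in the paper; the code below only shows the generic machinery, on a hypothetical toy network and sample), one can assemble via backpropagation the Jacobian of the outputs at the sample X with respect to the parameters and compute its rank:

import torch

torch.manual_seed(0)
# Hypothetical toy network; the paper treats deep fully-connected ReLU networks.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 4), torch.nn.ReLU(),
    torch.nn.Linear(4, 1),
)
X = torch.randn(100, 2)  # the finite sample of inputs

params = dict(net.named_parameters())

def outputs(*flat_params):
    # Re-attach the parameters so autograd can differentiate through them.
    p = dict(zip(params.keys(), flat_params))
    return torch.func.functional_call(net, p, (X,)).flatten()

# Backpropagation-based Jacobian of f_theta(X) with respect to theta.
jac = torch.autograd.functional.jacobian(outputs, tuple(params.values()))
J = torch.cat([j.reshape(X.shape[0], -1) for j in jac], dim=1)

n_params = J.shape[1]
rank = torch.linalg.matrix_rank(J).item()
print(f"Jacobian rank: {rank} / {n_params} parameters")
# Each hidden neuron carries a one-dimensional rescaling symmetry, so the rank
# is generically at most n_params minus the number of hidden neurons; a larger
# deficiency signals that X is too poor to locally identify theta.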
Armand Foucault : A general approximation lower bound in Lp norm, with applications to feed-forward neural networks
Feed-forward neural networks are known to be powerful function approximators. A famous result by Cybenko (1989) even states that any continuous function on a compact set can be approximated in sup norm to arbitrary accuracy by a one-layer feed-forward neural network with enough parameters. Quantifying the capacity of a set of neural networks to approximate a given function set, called its expressivity, in terms of its structural properties, is a question that has been widely addressed in recent years.
We study the fundamental limits of the expressive power of neural networks. Given two sets F, G of real-valued functions, we first prove a general lower bound on how well functions in F can be approximated in Lp(μ) norm by functions in G, for any p ≥ 1 and any probability measure μ. The lower bound depends on the packing number of F, the range of F, and the fat-shattering dimension of G. We then instantiate this bound in the case where G corresponds to a piecewise-polynomial feed-forward neural network, and describe in detail the application to two sets F: Hölder balls and multivariate monotonic functions.
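For intuition on one of the quantities entering the bound, here is an illustrative sketch (the function class and all numerical choices below are hypothetical, and the precise form of the lower bound is the one in the paper): a greedy estimate of an eps-packing number in empirical Lp(μ) norm.

import numpy as np

def lp_dist(f, g, xs, p=2):
    # Empirical Lp(mu) distance, with mu approximated by the sample xs.
    return np.mean(np.abs(f(xs) - g(xs)) ** p) ** (1.0 / p)

def greedy_packing_number(fs, xs, eps, p=2):
    # Greedily keep functions pairwise more than eps apart; the count
    # lower-bounds the eps-packing number of {fs} in Lp(mu).
    packing = []
    for f in fs:
        if all(lp_dist(f, g, xs, p) > eps for g in packing):
            packing.append(f)
    return len(packing)

rng = np.random.default_rng(0)
xs = rng.uniform(0.0, 1.0, size=2000)  # sample from mu = Uniform[0,1]
# A toy family of 1-Lipschitz "bump" functions, a stand-in for a Holder ball.
fs = [(lambda t: lambda x: np.maximum(0.0, 0.25 - np.abs(x - t)))(t)
      for t in np.linspace(0.1, 0.9, 200)]
print(greedy_packing_number(fs, xs, eps=0.05))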
William Todo : Counterfactual explanation for multivariate time series
We propose a novel approach to understanding abnormal class features in multivariate time series by dividing the latent space generated by a variational autoencoder (VAE) into general and class-based features using contrastive learning. The resulting Contrastive VAE provides a well-organized latent space that enables us to modify only the class-based features and generate counterfactual examples. Our method produces plausible counterfactual observations that highlight the differences between pathological and non-pathological data. We demonstrate the superiority of our approach over other counterfactual methods through a thorough evaluation that shows significant improvements in both validity and performance.
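As a schematic sketch of the architectural idea (module names and sizes are hypothetical, and the contrastive training objective of the actual Contrastive VAE is omitted), a VAE whose latent code is split into general features z_g and class-based features z_c yields counterfactuals by re-decoding with only z_c modified:

import torch
import torch.nn as nn

class SplitLatentVAE(nn.Module):
    def __init__(self, n_channels=8, seq_len=128, d_general=16, d_class=4):
        super().__init__()
        d_in = n_channels * seq_len
        self.d_general, self.d_class = d_general, d_class
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(d_in, 256), nn.ReLU())
        self.to_mu     = nn.Linear(256, d_general + d_class)
        self.to_logvar = nn.Linear(256, d_general + d_class)
        self.decoder = nn.Sequential(
            nn.Linear(d_general + d_class, 256), nn.ReLU(),
            nn.Linear(256, d_in), nn.Unflatten(1, (n_channels, seq_len)),
        )

    def encode(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return z[:, :self.d_general], z[:, self.d_general:]   # (z_g, z_c)

    def counterfactual(self, x, z_c_target):
        # Keep the general features of x, swap in class features from the
        # target class (e.g. the mean z_c of non-pathological samples).
        z_g, _ = self.encode(x)
        return self.decoder(torch.cat([z_g, z_c_target], dim=1))

vae = SplitLatentVAE()
x = torch.randn(1, 8, 128)                 # one multivariate time series
z_c_healthy = torch.zeros(1, vae.d_class)  # hypothetical target class code
x_cf = vae.counterfactual(x, z_c_healthy)
print(x_cf.shape)  # torch.Size([1, 8, 128])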