To fit sparse linear associations, the LASSO sparsity-inducing penalty provably recovers the important features (needles) with high probability in certain regimes, even when the sample size is smaller than the dimension of the input vector (haystack). We investigate whether a similar phase transition exists when fitting sparse nonlinear associations, namely artificial neural networks. With certain activation functions, proper selection of the penalty parameter λ, and a sparsity-inducing optimization algorithm, we observe identification of good needles on simulated and real data.
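As an illustration of the linear needles-in-a-haystack setting described above, the following sketch solves the LASSO by proximal gradient descent (ISTA) on simulated data with fewer samples than features; the dimensions, signal strength, and penalty level are illustrative choices, not values from the paper.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of the l1 norm: shrink each entry toward zero by t
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=3000):
    """Minimize 0.5*||y - Xb||^2 + lam*||b||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the smooth part
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        b = soft_threshold(b + X.T @ (y - X @ b) / L, lam / L)
    return b

rng = np.random.default_rng(0)
n, p, k = 50, 200, 3                       # fewer samples (n) than features (p)
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = 5.0                             # the k "needles" in the haystack
y = X @ beta + 0.1 * rng.standard_normal(n)

b_hat = lasso_ista(X, y, lam=3.0)
support = np.flatnonzero(np.abs(b_hat) > 1.0)   # estimated needle positions
```

In this well-conditioned regime the large entries of `b_hat` coincide with the true needles; as λ (here `lam`) is varied, support recovery degrades sharply, which is the kind of phase-transition behavior the paper probes in the nonlinear setting.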