\begin{abstract}
   We consider a general non-linear model where the signal is a finite
   mixture of an unknown, possibly increasing, number of features drawn
   from a continuous dictionary parameterized by a real non-linear
   parameter. The signal is observed with Gaussian (possibly
   correlated) noise in either a continuous or a discrete setup. We
   propose an off-the-grid optimization method, that is, a method which
   does not use any discretization scheme on the parameter space, to
   estimate both the non-linear parameters of the features and the
   linear parameters of the mixture.

   We use recent results on the geometry of off-the-grid methods
   %, see Poon, Keriven and  Peyr\'e (2020), 
   to give a minimal separation condition on the true underlying
   non-linear parameters under which interpolating certificate functions
   can be constructed. Combining this with tail bounds for suprema of
   Gaussian processes, we bound the prediction error with high
   probability. Assuming that the certificate functions can be
   constructed, our prediction error bound is, up to $\log$-factors,
   similar to the rates attained by the Lasso predictor in the linear
   regression model. We also establish convergence rates that quantify,
   with high probability, the quality of estimation for both the linear
   and the non-linear parameters.
\end{abstract}
