
\begin{abstract}

% $n$ input-output samples under quadratic loss

%Results apply without assuming a ground truth parameter that relates input and output

We study the problem of finding the linear model that minimizes the least-squares loss on a given dataset. While this problem is trivial in the low-dimensional regime, it becomes more interesting in high dimensions, where the population minimizer is assumed to lie on a manifold such as the set of sparse vectors. We propose a projected gradient descent~(PGD) algorithm to estimate the population minimizer in the finite-sample regime. We establish a linear convergence rate and data-dependent estimation error bounds for PGD. Our contributions include: 1) the results hold for heavier-tailed sub-exponential distributions in addition to sub-gaussian ones; 2) we directly analyze empirical risk minimization and do not require a realizable model that connects the input data and labels; 3) our PGD algorithm is augmented to learn the bias terms, which boosts performance. Numerical experiments validate our theoretical results.

%directly study the regularized empirical risk and

\end{abstract}
