
1: \begin{abstract}

2: Coordinate descent methods typically minimize a cost function by updating one randomly chosen decision variable (corresponding to one coordinate) at a time.

3: %

4: Ideally, we would update the decision variable that yields the largest decrease in the cost function.

5: %

6: However, finding this coordinate would require checking all of them, which would effectively negate the improvement in computational tractability that coordinate descent is intended to afford.

7: %

8: To address this, we propose a new adaptive method for selecting a coordinate.

9: %

10: First, we find a lower bound on the amount the cost function decreases when a coordinate is updated.

11: %

12: We then use a multi-armed bandit algorithm to learn which coordinates yield the largest lower bound, interleaving this learning with conventional coordinate descent updates in which the coordinate is selected with probability proportional to the expected decrease.

13: %

14: We show, both theoretically and experimentally, that our approach improves the convergence of coordinate descent methods.

15: \end{abstract}

16: