\begin{abstract}
  In many applications, such as classification of images or videos, it
  is of interest to develop a framework for tensor data rather than
  transforming the data to vectors in an ad hoc way, because
  vectorization raises computational and under-sampling issues. In this
  paper, we study convergence and statistical properties of
  two-dimensional canonical correlation analysis \citep{Lee2007Two}
  under the assumption that the data come from a probabilistic model. We
  show that the power method, when carefully initialized, converges to
  the optimum, and we provide a finite-sample bound. We then extend this
  framework to tensor-valued data and propose the higher-order power
  method, which is commonly used in tensor decomposition, to extract
  the canonical directions. Our method can be used effectively in a
  large-scale data setting by solving the inner least squares problem
  with stochastic gradient descent, and we justify its convergence via
  the theory of {\L}ojasiewicz's inequalities without any assumption on
  the data-generating process or the initialization. For practical
  applications, we further develop (a) an inexact updating scheme
  that allows us to use state-of-the-art stochastic gradient descent
  algorithms, (b) an effective initialization scheme that alleviates
  the problem of local optima in non-convex optimization, and (c) a
  deflation procedure for extracting several canonical components.
  Empirical analyses on challenging data, including gene expression
  data and air pollution indexes in Taiwan, show the effectiveness and
  efficiency of the proposed methodology. Our results fill a missing,
  but crucial, part in the literature on tensor data.
\end{abstract}
