\begin{proof}
  We highlight the key steps of the proof here and defer the details to
  Section~\ref{sec:proofs-phase-i-1}.

  % Note that given \pk{$W_k>\epsilon_0 e^{-k}$} and conditional on the high probability event that
  % $\ol W_k \le 2e^{-e^k d_0}$, we have $Nd_k^{max} = O(N \log M / M)$

  In Figure~\ref{fig:block-decomp-Bk}, the rows and columns of $B_{\wh{\mc I}_k, \wh{\mc I}_k}$
  are sorted in ascending order of the exact marginal probabilities of the words, and the
  rows and columns set to $0$ by regularization are shaded.
  %
  Consider the block decomposition according to the good words $\wh{\mc L}_k$ and the
  spillover words $\wh{\mc J}_k$.
  %
  We bound the spectral distance of each of the four blocks ($A_1, A_2, A_3, A_4$) separately. The
  bound for the entire matrix $\wt B_k$ then follows immediately from the triangle inequality.

  For block $A_1$, whose rows and columns all correspond to the ``good words'' with roughly
  uniform marginals, we show concentration by applying the result of
  \cite{le2015concentration}.
  %
  For blocks $A_2$ and $A_3$, we show that after regularization their spectral norms are small.
  Intuitively, the expected row sums of block $A_2$ are bounded by $2d_k^{\max}$ and the
  expected column sums are bounded by $2d_k^{\max}{\ol W_k \over W_k}= O(1/N)$, as a result of the
  bound on $\ol W_k$ in Lemma~\ref{lem:small-spillover}. Thus the spectral norm of block $A_2$
  is likely to be bounded by $O(\sqrt{d_k^{\max}/N})$. We show this rigorously with
  high-probability arguments.
  %
  Lastly, for block $A_4$, whose rows and columns all correspond to the spillover words, we show
  that the spectral norm of this block is very small, as a result of the small spillover marginal
  $\ol W_k$.
\end{proof}