cs0310036/intro.tex
1: \section{Introduction}
2: 
3: %%
4: %% changed D to S as appropriate
5: %%
6: 
7: Sparse linear systems are ubiquitous in scientific computing 
8:   and optimization.
9: In this work, we develop fast algorithms for solving
10:   some of the best-behaved linear systems: those specified
11:   by symmetric, diagonally dominant matrices
12:   with positive diagonals.
13: We call such matrices PSDDD as they are positive semi-definite
14:   and diagonally dominant.
15: Such systems arise in the solution
16:   of certain elliptic differential equations via
17:   the finite element method, 
18:   the modeling of resistive networks,
19:   and in the
20:   solution of certain network optimization 
21:   problems~\cite{StrangFix,Multigrid,Iterative,Iterative2,Iterative3}.
22: 
23: While one is often taught to solve a linear system $A \xx = \bb $
24:   by computing $A^{-1}$ and then multiplying $A^{-1}$ by $\bb$,
25:   this approach is quite inefficient for
26:   sparse linear systems---the best known bound on 
27:   the time required to compute $A^{-1}$ 
28:   is $O (n^{2.376})$~\cite{CoppersmithWinograd} and
29:   the representation of $A^{-1}$ typically requires
30:   $\Omega (n^{2})$ space.
31: In contrast, if $A$ is symmetric and has $m$ non-zero entries, then
32:   one can use the Conjugate Gradient method, as a direct method,
33:    to solve for
34:   $A^{-1} \bb$ in $O (nm)$ time and $O (n)$ space!
35: Until Vaidya's revolutionary introduction of
36:   combinatorial preconditioners~\cite{Vaidya},
37:   this was the best complexity bound for the solution
38:   of general PSDDD systems.
39: 
40: The two most popular families of methods for solving
41:   linear systems are the direct methods and the
42:   iterative methods.
43: Direct methods, such as Gaussian elimination,
44:   perform arithmetic operations that produce $\xx$
45:   treating the entries of $A$ and $\bb$ symbolically.
46: As discussed in Section~\ref{sec:direct}, direct methods
47:   can be used to quickly compute $\xx$ if the matrix
48:   $A$ has special topological structure.
49: 
50: Iterative methods, which are discussed
51:   in Section~\ref{sec:iterative},
52:   compute successively better approximations
53:   to $\xx$.
54: The Chebyshev and Conjugate Gradient methods take
55:   time proportional to $m \sqrt{\kappa_{f} (A)} \log (\kappa_{f} (A) / \epsilon)$
56:   to produce approximations to $\xx$ with relative error $\epsilon$,
57:  where $\kappa_{f} (A)$ is the ratio of the largest to the smallest
58:   non-zero eigenvalue of $A$.
59: These algorithms are improved by preconditioning---essentially solving
60:   $B^{-1} A \xx = B^{-1} \bb $
61:   for a \textit{preconditioner} $B$ that is carefully chosen
62:   so that $\kappa_{f} (A, B)$ is small and
63:   so that it is easy to solve
64:   linear systems in $B$.
65: These systems in $B$ may be solved using direct methods,
66:   or by again applying iterative methods.
67: 
68: Vaidya~\cite{Vaidya} discovered that for PSDDD matrices $A$
69:   one could use combinatorial techniques to construct matrices
70:   $B$ that provably satisfy both criteria.
71: In his seminal work, Vaidya shows that 
72:   when $B$ corresponds to a subgraph of the graph 
73:   of $A$,  one can bound
74:   $\kappa_{f} (A, B)$ by bounding the dilation and congestion
75:   of the best embedding of the graph of $A$ into the
76:   graph of $B$.
77: By using preconditioners derived by
78:   adding a few edges to maximum spanning trees, Vaidya's algorithm
79:   finds $\epsilon$-approximate solutions to 
80:   PSDDD linear systems of maximum valence $d$ in
81:   time $O ((d n)^{1.75} \log (\kappa_{f} (A) / \epsilon ))$.
82: \footnote{For the reader unaccustomed to condition numbers,
  we note that for a PSDDD matrix $A$ in which each entry is
84:   specified using $b$ bits of precision, 
85:   $\log (\kappa_{f} (A)) = O (b \log n)$.}
86: When these systems have special structure, such as having a
87:   sparsity graph of bounded genus or avoiding certain minors,
88:   he obtains even faster algorithms.
89: For example, his algorithm solves planar linear systems
90:   in time $O ((d n)^{1.2} \log (\kappa_{f} (A) / \epsilon ))$.
91: This paper follows the outline established by Vaidya:
92:   our contributions are improvements in the techniques
93:   for bounding $\kappa_{f} (A,B)$, a construction of better
94:   preconditioners, a construction that depends upon average
95:   degree rather than maximum degree, and an analysis
96:   of the recursive application of our algorithm.
97: 
98: As Vaidya's paper was never published%
99: \footnote{
100: Vaidya founded the company
101:   Computational Applications and System Integration
102:   (http://www.casicorp.com)
103:  to market his linear system solvers.},
104: and his manuscript lacked many
105:   proofs, the task of formally working out his results fell to others.
106: Much of its
107:   content appears in the thesis of his student, Anil Joshi~\cite{Joshi}.
Gremban, Miller, and Zagha~\cite{Gremban,GrembanMillerZagha} 
  explain parts of Vaidya's paper and
  extend Vaidya's techniques. 
111: Among other results, they found ways of constructing preconditioners by
112:   \textit{adding} vertices to the graphs and using separator trees.
113: 
114: 
115: Much of the theory behind the application of Vaidya's techniques
116:   to matrices with non-positive off-diagonals
  is developed in~\cite{SupportGraph}.
118: The machinery needed to apply Vaidya's techniques directly
119:   to matrices with positive off-diagonal elements is developed
120:   in~\cite{MWB}.
121: The present work builds upon an algebraic extension of the
122:   tools used to prove bounds on $\kappa_{f} (A, B)$
123:   by Boman and Hendrickson~\cite{SupportTheory}.
124: Boman and Hendrickson~\cite{BomanHendricksonAKPW}
125:   have pointed out that by applying one of their bounds on 
126:   support to 
127:   the tree constructed by Alon, Karp, Peleg, and West \cite{AKPW}
128:   for the $k$-server problem, one obtains
129:   a spanning tree preconditioner $B$ with 
130:   $\kappa_{f} (A, B) = m 2^{\bigO{\sqrt{\log n\log\log n}}}$.
131: They thereby obtain a solver for 
132:   PSDDD systems that produces $\epsilon $-approximate solutions in
133:   time $m^{1.5 + o (1)} \log (\kappa_{f} (A) / \epsilon )$.
134: In their manuscript, they asked whether one could possibly augment
135:   this tree to obtain a better preconditioner. 
136: We answer this question in the affirmative.
137: An algorithm running in time $O (m n^{1/2} \log^{2} (n))$
138:   has also recently been obtained by Maggs,
  \textit{et al.}~\cite{MaggsEtAl}.
140: 
141: The present paper is the first to push past the $O (n^{1.5})$ barrier.
142: It is interesting to observe that this is exactly the point
143:   at which one obtains sub-cubic time algorithms for solving
144:   dense PSDDD linear systems.
145: 
146: Reif~\cite{Reif} proved that by applying Vaidya's techniques
147:   recursively, one can solve bounded-degree planar
148:   positive definite diagonally dominant linear systems
149:   to relative accuracy $\epsilon$ in time
150:   $O (m^{1 + o (1)} \log (\kappa (A) / \epsilon ))$.
151: We extend this result to general planar PSDDD linear systems.
152: 
153: Due to space limitations in the FOCS proceedings, some proofs have
154:   been omitted.
155: These are being gradually included in the on-line version of the paper.
156: 
157: \subsection{Background and Notation}
A symmetric matrix $A$ is positive semi-definite 
159:   if $x^{T} A x \geq 0$ for all vectors $x$.
160: This is equivalent to having all eigenvalues of $A$
161:   non-negative.
162: 
163: %We recall that 
164: %  vector $\uu \in\Reals{n}$ is an {\em eigenvector} of $A$ with
165: %  eigenvalue $\lambda$ if $A\uu = \lambda \uu $.
166: %When $A$ is symmetric, all of its eigenvalues are
167: %  real and one can form an orthonormal basis from its eigenvectors.
168: %A symmetric matrix $A$ is semi-positive definite, written $A \succeq 0$,
169: %  if $x^{T} A x \geq 0$ for all vectors $x$.
170: %This is equivalent to saying that all eigenvalues of $A$
171: %  are non-negative.
172: 
173: In most of the paper, we will focus on Laplacian matrices:
174:   symmetric 
175:   matrices with non-negative diagonals and non-positive off-diagonals
176:   such that for all $i$,  $\sum_{j} A_{i,j} = 0$.
177: However, our results will apply to the more general family
178:   of positive semidefinite, diagonally dominant (PSDDD) matrices,
179:   where a matrix is diagonally dominant if
  $\abs{A_{i,i}} \geq \sum_{j \neq i} \abs{A_{i,j}}$ for all $i$.
181: We remark that a symmetric matrix is PSDDD if and only if 
182:   it is diagonally dominant and all of its diagonals are
183:   non-negative.
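
As a small illustration (the matrix below is our own example, not one taken
  from the applications mentioned above), consider
\[
  A = \left[
  \begin{array}{rrr}
   2 & -1 & 0\\
  -1 &  3 & 1\\
   0 &  1 & 1
  \end{array}
  \right].
\]
It is symmetric with non-negative diagonals, and in each row the diagonal
  entry is at least the sum of the absolute values of the off-diagonal
  entries ($2 \geq 1$, $3 \geq 2$ and $1 \geq 1$), so it is PSDDD.
It is not a Laplacian, as it has a positive off-diagonal entry and its
  rows do not sum to zero.
Positive semidefiniteness can also be checked directly, since
\[
  x^{T} A x
  = (x_{1} - x_{2})^{2} + (x_{2} + x_{3})^{2} + x_{1}^{2} + x_{2}^{2}
  \geq 0 .
\]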
184: 
185: In this paper, we will restrict our attention to the solution
186:   of linear systems of the form $A \xx = \bb$
187:   where $A$ is a PSDDD matrix.
188: When $A$ is non-singular, that is when $A^{-1}$ exists,
  there exists a unique solution $\xx = A^{-1}\bb $ to the linear
190:   system.
191: When $A$ is singular and symmetric, 
192:   for every $\bb  \in \Span{A}$ there exists a unique
193:   $\xx \in \Span{A}$ such that $A \xx = \bb$.
194: If $A$ is the Laplacian of a connected graph,
195:   then the null space of $A$ is spanned by $\bvec{1}$.
196: 
197: There are two natural ways to formulate the problem of finding
198:   an approximate solution to a system $A \xx = \bb$.
199: A vector $\xxt$ has \textit{relative residual error} $\epsilon$
200:   if $\norm{A \xxt - \bb} \leq \epsilon \norm{\bb }$.
201: We say that a solution $\xxt$ is an $\epsilon$-approximate
202:   solution if it is at relative
203:   distance at most $\epsilon$ from the actual
204:   solution---that is, if 
205:   $\norm{\xx - \xxt  } \leq \epsilon \norm{\xx }$.
One can relate these two notions of approximation by observing that
  the relative distance of $\xxt$ to the solution and
208:   the relative residual error differ by a multiplicative
209:   factor of at most $\kappa_{f} (A)$.
210: We will focus our attention on the problem
211:   of finding $\epsilon$-approximate solutions.
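
The relation between the two notions can be made explicit by a short
  calculation.
The following is a sketch under the assumptions that $A$ is symmetric and
  positive semidefinite, that $\bb$, $\xx$ and $\xxt$ all lie in $\Span{A}$,
  and that $\lambda_{\min}$ and $\lambda_{\max}$ (our notation) denote the
  smallest and largest non-zero eigenvalues of $A$, so that
  $\kappa_{f} (A) = \lambda_{\max} / \lambda_{\min}$.
If $\norm{A \xxt - \bb} \leq \epsilon \norm{\bb}$, then
\[
  \norm{\xx - \xxt}
  \leq \frac{1}{\lambda_{\min}} \norm{A (\xx - \xxt)}
  = \frac{1}{\lambda_{\min}} \norm{\bb - A \xxt}
  \leq \frac{\epsilon}{\lambda_{\min}} \norm{A \xx}
  \leq \epsilon \, \kappa_{f} (A) \norm{\xx} .
\]
Conversely, if $\norm{\xx - \xxt} \leq \epsilon \norm{\xx}$, then
\[
  \norm{A \xxt - \bb}
  = \norm{A (\xxt - \xx)}
  \leq \lambda_{\max} \norm{\xxt - \xx}
  \leq \epsilon \, \lambda_{\max} \norm{\xx}
  \leq \epsilon \, \kappa_{f} (A) \norm{\bb} ,
\]
where the last step uses
  $\norm{\bb} = \norm{A \xx} \geq \lambda_{\min} \norm{\xx}$.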
212:  
213: The ratio $\kappa_{f} (A)$ is the finite condition number of $A$.
The $l_{2}$ norm of a matrix, $\norm{A}$, is the maximum of
  $\norm{ A x} / \norm{x}$ over non-zero vectors $x$, and equals the largest
  absolute value of an eigenvalue of $A$ if $A$ is symmetric.
217: For non-symmetric matrices,
218:   $\lambda_{max} (A)$ and $\norm{A}$ are typically different.
219: We let $|A|$ denote the number of non-zero entries in $A$, and
220:   $\min (A)$ and $\max (A)$ denote the smallest and largest
221:   non-zero elements of $A$ in absolute value, respectively.
222: 
223: The condition number plays a prominent role in the analysis
224:   of iterative linear system solvers.
225: When $A$ is PSD, it is known that, after
226:   $\sqrt{\kappa_{f} (A)} \log (1/\epsilon )$ iterations,
227:   the Chebyshev iterative method and the Conjugate Gradient method
228:   produce solutions with relative residual error at most $\epsilon$.
229: To obtain an $\epsilon$-approximate solution, one need merely
230:   run $\log (\kappa_{f} (A))$ times as many iterations.
231: If $A$ has $m$ non-zero entries, each of these iterations takes
232:   time $O (m)$.
233: When applying the preconditioned versions of these algorithms
234:   to solve systems of the form $B^{-1} A \xx = B^{-1} \bb $,
235:   the number of iterations required by these algorithms 
  to produce an $\epsilon$-approximate solution is bounded
237:   by 
238:   $\sqrt{\kappa_{f} (A, B)} \log (\kappa_{f} (A) /\epsilon ) $
239:   where 
240: \[
241:   \kappa_{f} (A, B) 
242: = 
243: \left(\max_{\xx : A\xx \neq \zzero} \frac{ \xx^{T} A \xx}{\xx^T B \xx}
244:  \right)
245: \left(\max_{\xx : A\xx \neq \zzero} \frac{ \xx^{T} B \xx}{\xx^T A \xx}
246:  \right),
247: \]
248: for symmetric $A$ and $B$ with $\Span{A} = \Span{B}$.
249: However, each iteration of these methods takes time
250:   $O (m)$ plus the time required to solve linear
251:   systems in $B$.
252: In our initial algorithm, we will use direct methods to
253:   solve these systems, and so will not have to worry about
254:   approximate solutions.
255: For the recursive application of our algorithms, we will
256:   use our algorithm again to solve these systems, and so will
257:   have to determine how well we need to approximate the solution.
258: For this reason, we will analyze the Chebyshev iteration instead
259:   of the Conjugate Gradient, as it is easier to analyze the impact
260:   of approximation in the Chebyshev iterations. 
261: However, we expect that similar results could be obtained for
262:   the preconditioned Conjugate Gradient.
263: For more information on these methods, we refer the reader
264:   to \cite{GolubVanLoanBook} or \cite{Bruaset}.
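
To illustrate how $\kappa_{f} (A,B)$ is typically bounded, here is a sketch
  under the assumption (ours, made for illustration) that $A$ and $B$ are
  symmetric positive semidefinite matrices with $\Span{A} = \Span{B}$ and
  that, for some $t \geq 1$,
\[
  \xx^{T} B \xx \leq \xx^{T} A \xx \leq t \, \xx^{T} B \xx
  \quad \mbox{for all $\xx$.}
\]
Then the first maximum in the definition of $\kappa_{f} (A,B)$ is at most $t$
  and the second is at most $1$, so
\[
  \kappa_{f} (A,B) \leq t ,
\]
and the iteration bound stated above is at most
  $\sqrt{t} \log (\kappa_{f} (A) / \epsilon)$.
The left-hand inequality holds, for example, when $A$ and $B$ are Laplacians
  and $A - B$ is itself a Laplacian matrix, and hence positive semidefinite.
Note also that $\kappa_{f} (A, c B) = \kappa_{f} (A,B)$ for every $c > 0$,
  as the two maxima scale by $1/c$ and $c$, respectively.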
265: 
266: \subsection{Laplacians and Weighted Graphs}
267: All weighted graphs in this paper have
268:   positive weights.
269: There is a natural isomorphism between weighted
270:   graphs and Laplacian matrices:
271:   given a weighted graph $G = (V, E, w)$, we can
272:   form the Laplacian matrix in which
273:   $A_{i,j} = -w (i,j)$ for $(i,j) \in E$,
274:   and with diagonals determined by the condition
275:   $A \bvec{1} = \bvec{0}$.
276: Conversely, a weighted graph is naturally associated
277:   to each Laplacian matrix.
278: Each vertex of the graph corresponds to both a row and
279:   column of the matrix, and we will often
280:   abuse notation by identifying this row/column pair
281:   with the associated vertex.
282: 
283: We note that if $G_{1}$ and $G_{2}$ are weighted 
284:   graphs on the same vertex set with disjoint sets
285:   of edges, then the Laplacian of the union of
286:   $G_{1}$ and $G_{2}$ is the sum of their
287:   Laplacians.
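
For concreteness, consider the following small example of our own:
  the path on vertices $1,2,3$ with $w (1,2) = 2$ and $w (2,3) = 3$.
Its Laplacian is
\[
  \left[
  \begin{array}{rrr}
   2 & -2 &  0\\
  -2 &  5 & -3\\
   0 & -3 &  3
  \end{array}
  \right]
  =
  \left[
  \begin{array}{rrr}
   2 & -2 & 0\\
  -2 &  2 & 0\\
   0 &  0 & 0
  \end{array}
  \right]
  +
  \left[
  \begin{array}{rrr}
  0 &  0 &  0\\
  0 &  3 & -3\\
  0 & -3 &  3
  \end{array}
  \right],
\]
where the two matrices on the right are the Laplacians of the graphs
  on the same vertex set containing only the edge $(1,2)$ and only the
  edge $(2,3)$, respectively.
The corresponding quadratic form is
  $x^{T} A x = 2 (x_{1} - x_{2})^{2} + 3 (x_{2} - x_{3})^{2}$,
  which makes the positive semidefiniteness of the Laplacian evident.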
288: 
289: \subsection{Reductions}\label{sec:reductions}
290: 
291: In most of this paper we just consider
292:   Laplacian matrices of connected graphs.
293: This simplification is enabled by two reductions.
294: 
295: First, we note that it suffices to construct preconditioners
296:   for matrices satisfying
  $A_{i,i} = \sum_{j \neq i}\abs{A_{i,j}}$, for all $i$.
This follows from the observation in~\cite{SupportGraph}
  that if $\tilde{A} = A + D$, where $D$ is a non-negative diagonal matrix and
  $A$ satisfies the above condition, then
301:   $\kappa _{f} (\tilde{A}, B + D) \leq \kappa _{f} (A,B)$.
302: So, it suffices to find a preconditioner after 
303:   subtracting off the maximal diagonal matrix that maintains
304:   positive diagonal dominance.
305: 
306: We then use an idea of Gremban~\cite{Gremban} for handling
307:   positive off-diagonal entries.
308: If $A$ is a symmetric matrix such that for all $i$,
  $A_{i,i} \geq  \sum _{j \neq i} \abs{A_{i,j}}$,
310:   then Gremban decomposes 
311:   $A$ into $D + A_{n} + A_{p}$, where
312:   $D$ is the diagonal of $A$,
313:   $A_{n}$ is the matrix containing all 
314:   negative off-diagonal entires of $A$,
315:   and $A_{p}$ contains all the positive off-diagonals.
316: Gremban then considers the linear system
317: \[
318:   \left[
319: \begin{array}{ll}
320:   D + A_{n} & -A_{p}\\
321:   -A_{p} & D + A_{n}
322: \end{array}
323:  \right]
324: \left[
325: \begin{array}{l}
326: \xx\\
327: \xx'
328: \end{array}
329:  \right]
330: =
331: \left[
332: \begin{array}{l}
333: \bb\\
334: -\bb
335: \end{array}
336:  \right],
337: \]
338: and observes that its solution will have
339:   $\xx' = -\xx$ and that 
340:   $\xx$ will be the solution to 
341:   $A \xx = \bb $.
Thus, by making this transformation,
  we can convert any PSDDD linear
  system into one with 
  non-positive off-diagonals.
346: One can understand this transformation as
347:   making two copies of every vertex in the graph,
348:   and two copies of every edge.
349: The edges corresponding to negative off-diagonals
350:   connect nodes in the same copy of the graph,
351:   while the others cross copies.
352: To capture the resulting family of graphs, we
353:   define a weighted graph $G$ to be a
354:   \textit{Gremban cover}
355:   if it has $2n$ vertices and
356: \begin{itemize}
357: \item for $i,j \leq n$, 
358:   $(i,j) \in  E$ if and only if
359:   $(i+n, j+n) \in E$, and
360:   $w (i,j) = w (i+n, j+n)$, 
361: \item for $i,j \leq n$, 
362:   $(i,j+n) \in  E$ if and only if
363:   $(i+n, j) \in E$, and
364:   $w (i,j+n) = w (i+n, j)$, and
365: \item the graph contains no edge
366:   of the form $(i, i+n)$.
367: \end{itemize}
368: When necessary,
369:   we will explain how to modify our arguments
370:   to handle Laplacians that are Gremban covers.
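
As an illustration of Gremban's reduction, take the following example of our
  own with $n = 2$:
\[
  A = \left[
  \begin{array}{rr}
  2 & 1\\
  1 & 2
  \end{array}
  \right],
  \qquad
  \bb = \left[
  \begin{array}{r}
  3\\
  3
  \end{array}
  \right],
\]
so that $D = 2 I$, $A_{n} = 0$, and $A_{p}$ has ones off the diagonal and
  zeros on it.
The system considered by Gremban is then
\[
  \left[
  \begin{array}{rrrr}
   2 &  0 &  0 & -1\\
   0 &  2 & -1 &  0\\
   0 & -1 &  2 &  0\\
  -1 &  0 &  0 &  2
  \end{array}
  \right]
  \left[
  \begin{array}{l}
  \xx\\
  \xx'
  \end{array}
  \right]
  =
  \left[
  \begin{array}{r}
   3\\
   3\\
  -3\\
  -3
  \end{array}
  \right],
\]
whose solution is $\xx = (1,1)^{T}$ and $\xx' = (-1,-1)^{T}$,
  and indeed $A (1,1)^{T} = (3,3)^{T}$.
All off-diagonal entries of the $4$-by-$4$ matrix are non-positive, and its
  non-zero off-diagonal entries correspond to the edges $(1,4)$ and $(3,2)$,
  that is, to $(i,j+n)$ and $(i+n,j)$ for $i = 1$ and $j = 2$, each of weight $1$.
The diagonal entries exceed the off-diagonal row sums in this example because
  $A$ is strictly diagonally dominant; this excess is what the first
  reduction subtracts off.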
371: 
Finally, if $A$ is the Laplacian of a disconnected
373:   graph, then the blocks corresponding to the connected
374:   components may be solved independently.
375: 
376: 
377: \subsection{Direct Methods}\label{sec:direct}
378: The standard direct method for solving symmetric linear systems
379:   is Cholesky factorization.
380: Those unfamiliar with Cholesky factorization should think of
381:   it as Gaussian elimination in which one 
382:   simultaneously eliminates on rows and columns so as to preserve
383:   symmetry.
384: Given a permutation matrix $P$,
385:   Cholesky factorization produces a lower-triangular matrix
386:   $L$ such that $L L^{T} = P A P^{T}$.
387: Because one can use forward and back substitution to
388:   multiply vectors by $L^{-1}$ and $L^{-T}$
389:   in time proportional to the
390:   number of non-zero entries in $L$,
391:   one can use the Cholesky factorization of $A$
392:   to solve the system
393:   $A \xx = \bb $ in time $O (\sizeof{L})$.
394:  
395: Each pivot in the factorization comes from the diagonal
396:   of $A$, and one should understand the 
397:   permutation $P$ as providing
398:   the order in which these pivots are chosen.
399: Many heuristics exist for producing permutations $P$ 
400:   for which the number of non-zeros in $L$ is small.
401: If the graph of $A$ is a tree, then a permutation
402:   $P$ that orders the vertices of $A$ from the leaves up
403:   will result in an $L$ with at most $2n-1$ non-zero entries.
404: In this work, we will use results concerning matrices
405:   whose sparsity graphs resemble trees with a few additional
406:   edges and whose graphs have small separators, which
407:   we now review.
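
Before reviewing these results, we illustrate the role of the elimination
  order on a small example of our own choosing, a positive definite matrix
  whose graph is the path on vertices $1,2,3$:
\[
  A = \left[
  \begin{array}{rrr}
   2 & -1 &  0\\
  -1 &  2 & -1\\
   0 & -1 &  2
  \end{array}
  \right].
\]
Eliminating the vertices in the order $1,2,3$, which starts from a leaf,
  gives $A = L L^{T}$ with
\[
  L = \left[
  \begin{array}{ccc}
  \sqrt{2}    & 0            & 0\\
  -1/\sqrt{2} & \sqrt{3/2}   & 0\\
  0           & -\sqrt{2/3}  & 2/\sqrt{3}
  \end{array}
  \right],
\]
which has $5 = 2n-1$ non-zero entries.
If instead the middle vertex $2$ is eliminated first, the matrix remaining
  on vertices $1$ and $3$ is
\[
  \left[
  \begin{array}{rr}
   3/2 & -1/2\\
  -1/2 &  3/2
  \end{array}
  \right],
\]
so a non-zero entry is created between two vertices that were not adjacent
  in the path, and the resulting factor has $6$ non-zero entries.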
408: 
409: If $B$ is the Laplacian matrix of a weighted graph
410:   $(V,E,w)$, and one eliminates a vertex $a$
411:   of degree $1$, then the remaining matrix
412:   has the form
\[
  \left[
  \begin{array}{ll}
  1 & 0\\
  0 & A_{1}
\end{array}
 \right],
\]
421: where $A_{1}$ is the Laplacian of the graph in which    
422:   $a$ and its attached edge have been removed.
423: Similarly, if a vertex $a$ of degree $2$ is eliminated,
424:   then the remaining matrix is the Laplacian of the   
425:   graph in which the vertex $a$ and its adjacent edges
426:   have been removed, and an edge
427:   with weight $1/ (1/w_{1} + 1/w_{2})$
428:   is added between the
429:   two neighbors of $a$,
430:   where $w_{1}$ and $w_{2}$ are the weights of the edges
431:   connecting $a$ to its neighbors.
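
The weight of the new edge can be checked by a direct computation.
As a sketch, suppose (for illustration only) that the graph consists of $a$
  and its two neighbors, and order the vertices so that $a$ comes first.
Then
\[
  B = \left[
  \begin{array}{ccc}
  w_{1} + w_{2} & -w_{1} & -w_{2}\\
  -w_{1}        &  w_{1} & 0\\
  -w_{2}        &  0     & w_{2}
  \end{array}
  \right],
\]
and eliminating $a$ leaves the Schur complement
\[
  \left[
  \begin{array}{cc}
  w_{1} & 0\\
  0     & w_{2}
  \end{array}
  \right]
  -
  \frac{1}{w_{1} + w_{2}}
  \left[
  \begin{array}{cc}
  w_{1}^{2}   & w_{1} w_{2}\\
  w_{1} w_{2} & w_{2}^{2}
  \end{array}
  \right]
  =
  \frac{w_{1} w_{2}}{w_{1} + w_{2}}
  \left[
  \begin{array}{rr}
   1 & -1\\
  -1 &  1
  \end{array}
  \right],
\]
which is the Laplacian of a single edge of weight
  $w_{1} w_{2} / (w_{1} + w_{2}) = 1/ (1/w_{1} + 1/w_{2})$.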
432: 
433: Given a graph $G$ 
434:   with edge set $E = R \cup S$, where the edges
435:   in $R$ form a tree,
436:   we will perform a partial Cholesky factorization
437:   of $G$ in which we successively eliminate all the degree 1 and 2
  vertices that are not endpoints of edges in $S$.
439: We introduce the algorithm \texttt{trim} to define the order
440:   in which the vertices should be eliminated, and we call the
441:   \emph{trim order} the order in which \texttt{trim} deletes vertices.
442: \begin{trivlist}
443: \item []
444: \noindent {\bf Algorithm:} \texttt{trim$(V,R,S)$}
445: \begin{enumerate}
446: \item  While $G$ contains a vertex of degree one
447:   that is not an endpoint of an edge in $S$,
448:   remove that vertex and its adjacent edge.
449: \item  While $G$ contains a vertex of degree two
450:   that is not an endpoint of an edge in $S$,
451:   remove that vertex and its adjacent edges, and add
452:   an edge between its two neighbors.
453: \end{enumerate}
454: \end{trivlist}
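
To illustrate, consider the following small instance (our own example).
Let $V = \{1, \ldots , 8\}$, let $R$ consist of the path edges $(i,i+1)$
  for $1 \leq i \leq 6$ together with the edge $(4,8)$,
  and let $S = \{(1,5), (3,7)\}$.
The endpoints of edges in $S$ are $1$, $3$, $5$ and $7$.
In the first step, vertex $8$ has degree one and is removed along with
  the edge $(4,8)$.
In the second step, vertices $2$, $4$ and $6$ each have degree two;
  removing them adds the edges $(1,3)$, $(3,5)$ and $(5,7)$.
The output graph has vertex set $\{1,3,5,7\}$ and edges
\[
  (1,3), \quad (3,5), \quad (5,7), \quad (1,5), \quad (3,7),
\]
that is, $4$ vertices and $5$ edges, and one valid trim order
  is $8, 2, 4, 6$.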
455: 
456: \begin{proposition}\label{pro:trim}
457: The output of \texttt{trim} is a graph
458:   with at most $4 \sizeof{S}$ vertices
459:   and $5 \sizeof{S}$ edges.
460: \end{proposition}
461:  
462: \begin{remark}
463: If $(V,R)$ and $(V,S)$ are Gremban covers,
464:   then we can implement \texttt{trim} so
465:   that the output graph is also a Gremban cover.
466: Moreover, the genus and maximum size clique minor
467:   of the output graph do not increase.
468: \end{remark}
469: 
470: After performing partial Cholesky factorization
471:   of the vertices in the trim order, one obtains
472:   a factorization of the form
473: \[
474: B = L C
475:     L^{T},
476: \mbox{where $C = $ }
477: \left[
478:     \begin{array}{ll}
479:      I & 0\\
480:      0 & A_{1}
481:     \end{array}
482:  \right],
483: \]
$L$ is lower triangular, 
  and the left and right columns in the above
  representation correspond to the eliminated 
  and remaining vertices,
  respectively.
489: Moreover, $\sizeof{L} \leq 2n-1$, and
490:   this Cholesky factorization may be performed in
491:   time $O (n + \sizeof{S})$.
492: 
493: %After performing partial Cholesky factorization
494: %  of the vertices in the trim order, one obtains
495: %  a factorization of the form
496: %\[
497: %A = L \left[
498: %    \begin{array}{ll}
499: %     I & 0\\
500: %     0 & B
501: %    \end{array}
502: % \right]
503: %    L^{T},
504: %\mbox{ where $L$ has the form}
505: %\left[\begin{array}{ll}
506: %  L_{1,1} & 0\\
507: %  L_{2,1} & I
508: %\end{array}
509: % \right], \]
510: %$L_{1,1}$ is lower triangular, 
511: %  and the left column and right columns in the above
512: %  representations correspond to the eliminated 
513: %  and remaining vertices
514: %  respectively.
515: %Moreover, $\sizeof{L} \leq 2n-1$.
516: 
The following lemma may be proved by induction.
518: \begin{lemma}\label{lem:partialCholesky}
519: Let $B$ be a Laplacian matrix and let
520:   $L$ and $A_{1}$ be the matrices arising from
521:   the partial Cholesky factorization of $B$
522:   according to the trim order.
523: Let $U$ be the set of eliminated vertices, 
524:   and let $W$ be the set of remaining vertices.
525: For each pair of vertices $(a,b)$ in $W$ joined by a simple
526:   path containing only vertices of $U$, let
527:   $B_{(a,b)}$ be the Laplacian of the graph containing
528:   just one edge between $a$ and $b$ of weight
529:   $1/(\sum_{i} 1/w_{i})$, where
530:   the $w_{i}$ are the weights on the path between
531:   $a$ and $b$.
532: Then, 
533: 
534: \begin{itemize}
535: \item [$(a)$]
536:   the matrix $A_{1}$ is the sum of the Laplacian of the 
537:   induced graph
538:   on $W$ and
  the sum of all the Laplacians $B_{(a,b)}$, 
540: \item [$(b)$]
541:   $\norm{A_{1}} \leq \norm{B}$,
542:   $\lambda_{2} (A_{1}) \geq \lambda_{2} (B)$, and so
543:   $\kappa _{f} (A_{1}) \leq \kappa _{f} (B)$.
544: \end{itemize}
545: \end{lemma}
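
As a small check of the lemma (an example of our own), let $W = \{a,b\}$ and
  $U = \{u_{1}, u_{2}\}$, and let the graph of $B$ consist of the edge $(a,b)$
  of weight $2$ together with the path $a, u_{1}, u_{2}, b$, whose three edges
  each have weight $3$.
The path contributes an edge of weight $1/ (1/3 + 1/3 + 1/3) = 1$,
  so part $(a)$ gives
\[
  A_{1}
  =
  \left[
  \begin{array}{rr}
   2 & -2\\
  -2 &  2
  \end{array}
  \right]
  +
  \left[
  \begin{array}{rr}
   1 & -1\\
  -1 &  1
  \end{array}
  \right]
  =
  \left[
  \begin{array}{rr}
   3 & -3\\
  -3 &  3
  \end{array}
  \right].
\]
For part $(b)$, the only non-zero eigenvalue of $A_{1}$ is $6$, so
  $\norm{A_{1}} = \lambda_{2} (A_{1}) = 6$.
On the other hand, $\norm{B} \geq 6$ because $B$ has a diagonal entry
  equal to $6$, and $\lambda_{2} (B) \leq 5$, as is seen by evaluating
  the Rayleigh quotient $x^{T} B x / x^{T} x = 20/4$ at the vector $x$ that
  is $1$ on $a$ and $u_{1}$ and $-1$ on $u_{2}$ and $b$, which is
  orthogonal to $\bvec{1}$.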
546: 
547: 
548: 
549: Other topological structures may be exploited
550:   to produce elimination orderings
551:   that result in sparse $L$.
552: In particular,  Lipton, Rose and Tarjan~\cite{LiptonRoseTarjan} 
553:   prove that if the sparsity graph is 
554:   planar, then one can find such an $L$ with at most
555:   $O (n \log n)$ non-zero entries in time $O (n^{3/2})$.
556: In general, Lipton, Rose and Tarjan prove that if
557:   a graph can be dissected by a family of small separators,  
558:   then $L$ can be made sparse.
559: The precise definition and theorem follow.
560:   
561: \begin{definition}\label{def:separator}
562: A subset of vertices $C$ of a graph $G= (V,E)$ with $n$ vertices is an 
563:   $f (n)$-separator if 
564:   $\sizeof{C}\leq f (n)$, and the vertices of $V - C$
565:   can be partitioned into two sets  $U$ and $W$ such that there are
566:   no edges from $U$ to $W$, and $\sizeof{U},\sizeof{W}\leq 2n/3 $.
567: \end{definition}
568: 
569: \begin{definition}\label{def:familyseparator}
570: Let $f ()$ be a positive function.
571: A graph $G = (V,E)$ with $n$ vertices has a family of $f ()$-separators
572:   if for every $s \leq  n$, every subgraph $G' \subseteq G$ with $s$ vertices
  has an $f (s)$-separator.
574: \end{definition}
575: 
576: 
577: \begin{theorem}[Nested Dissection: Lipton-Rose-Tarjan]
578:   \label{thm:nestesdissection}
579: Let $A$ be an $n$ by $n$ symmetric PSD matrix, $\alpha > 0$ be a
580:   constant, and $h (n)$ be a positive function of $n$.
581: Let $f (x) = h (n) x^{\alpha }$.
If $G (A)$ has a family of $f ()$-separators, then 
583:   the Nested Dissection Algorithm of Lipton, Rose and Tarjan can,
584:  in $\bigO{n + (h (n) n^{\alpha })^{3}}$ time,
585:  factor $A$ into $A = LL^{T}$ 
586:  so that $L$ has at most $\bigO{(h (n)n^{\alpha })^{2}\log n}$
587:   non-zeros.
588: \end{theorem}
589: 
590: To apply this theorem, we note that many families of graphs
591:   are known to have families of small separators.
592: Gilbert, Hutchinson, and Tarjan
593:   \cite{GilbertHutchinsonTarjan} show
594:   that all graphs of $n$ vertices
595:   with genus bounded by $g$ have a family of $O (\sqrt{gn})$-separators,
596:   and Plotkin, Rao and Smith \cite{PlotkinRaoSmith} 
597:   show that any graph that excludes
  $K_{s}$ as a minor has a family of $O (s\sqrt{n\log n})$-separators.
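
For instance, substituting these separator bounds into
  Theorem~\ref{thm:nestesdissection} gives the following estimates.
This is a sketch obtained by direct substitution, in which $c$ denotes an
  unspecified constant and $g$ and $s$ are treated as parameters.
For graphs of genus at most $g$, one may take $h (n) = c \sqrt{g}$ and
  $\alpha = 1/2$, which yields
\[
  \mbox{time }
  O \left(n + (g n)^{3/2} \right)
  \quad \mbox{and} \quad
  O (g n \log n) \mbox{ non-zeros in $L$.}
\]
For graphs that exclude $K_{s}$ as a minor, a subgraph with $x \leq n$
  vertices has a separator of size at most
  $c s \sqrt{x \log x} \leq c s \sqrt{\log n} \, x^{1/2}$,
  so one may take $h (n) = c s \sqrt{\log n}$ and $\alpha = 1/2$, which yields
\[
  \mbox{time }
  O \left(n + s^{3} (n \log n)^{3/2} \right)
  \quad \mbox{and} \quad
  O (s^{2} n \log^{2} n) \mbox{ non-zeros in $L$.}
\]
For planar graphs, which have a family of $O (\sqrt{n})$-separators,
  taking $h (n) = c$ and $\alpha = 1/2$ recovers the bounds of
  $O (n^{3/2})$ time and $O (n \log n)$ non-zeros quoted above.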
599: 
600: 
601: \subsection{Iterative Methods}\label{sec:iterative}
602: Iterative methods such as
603:   Chebyshev iteration and Conjugate Gradient
604:   solve systems such as
605:   $A \xx = \bb $
606:   by successively multiplying vectors
607:   by the matrix $A$, and
608:   then taking linear combinations of
609:   vectors that have been produced so far.
610: The preconditioned versions of these
611:   iterative methods take as input
612:   another matrix $B$, called the
613:   \textit{preconditioner},
614:   and also perform
615:   the operation of solving linear systems
616:   in $B$.
617: In this paper, we will restrict our attention to 
618:   the preconditioned Chebyshev method as it
619:   is easier to understand the effect of
620:   imprecision in the solution of the systems in 
621:   $B$ on the method's output.
622: In the non-recursive version of our algorithms,
623:   we will exploit the standard analysis
624:   of Chebyshev iteration (see~\cite{Bruaset}), adapted
625:   to our situation:
626: 
627: \begin{theorem}[Preconditioned Chebyshev]\label{thm:cheby}
628: Let $A$ and $B$ be Laplacian matrices, let $\bb $
629:   be a vector, and let $\xx$ satisfy $A \xx  = \bb$.
630: At each iteration, the preconditioned Chebyshev method
631:   multiplies one vector by $A$, solves one linear system
632:   in $B$, and performs a constant number of vector additions.
633: At the $k$th iteration, the algorithm maintains a solution
634:   $\xxt$ satisfying
635: \[
636:   \norm{(\xxt - \xx)}
637:  \leq 
638:     e^{-k / \sqrt{\kappa_{f} (A,B)}} 
639:    \kappa_{f} (A) \sqrt{\kappa_{f} (B)} \norm{\xx}.
640: \]
641: \end{theorem}
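
To see how many iterations this bound requires, one can solve for $k$;
  this is a direct rearrangement of the inequality above, using
  natural logarithms.
The right-hand side is at most $\epsilon \norm{\xx}$ as soon as
\[
  k \geq
  \sqrt{\kappa_{f} (A,B)} \,
  \ln \left(
    \frac{\kappa_{f} (A) \sqrt{\kappa_{f} (B)}}{\epsilon}
  \right),
\]
so the number of iterations needed for an $\epsilon$-approximate solution
  grows with the square root of $\kappa_{f} (A,B)$ and only logarithmically
  with $\kappa_{f} (A)$, $\kappa_{f} (B)$ and $1/\epsilon$.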
642: 
643: In the non-recursive versions of our algorithms,
644:   we will pre-compute the Cholesky factorization of 
645:   the preconditioners $B$,
646:   and use these to solve the linear systems encountered
  by the preconditioned Chebyshev method.
648: In the recursive versions, we will perform a
  partial Cholesky factorization of $B$
  into a matrix of the form $L [I, 0; 0, A_{1}] L^{T}$,
  construct a preconditioner for $A_{1}$, and again use
  the preconditioned Chebyshev method to solve
  the systems in $A_{1}$.
654:   
655: 
656: 
657: %\subsection{Our contributions}
658: 
659: 
660: 
661: 
662: % Local Variables: ***
663: % TeX-master:"post.tex" ***
664: % End: ***
665: 
666: 
667: