1: \section{Introduction}
2:
3: %%
4: %% changed D to S as appropriate
5: %%
6:
7: Sparse linear systems are ubiquitous in scientific computing
8: and optimization.
9: In this work, we develop fast algorithms for solving
10: some of the best-behaved linear systems: those specified
11: by symmetric, diagonally dominant matrices
12: with positive diagonals.
13: We call such matrices PSDDD as they are positive semi-definite
14: and diagonally dominant.
15: Such systems arise in the solution
16: of certain elliptic differential equations via
17: the finite element method,
18: the modeling of resistive networks,
19: and in the
20: solution of certain network optimization
21: problems~\cite{StrangFix,Multigrid,Iterative,Iterative2,Iterative3}.
22:
23: While one is often taught to solve a linear system $A \xx = \bb $
24: by computing $A^{-1}$ and then multiplying $A^{-1}$ by $\bb$,
25: this approach is quite inefficient for
26: sparse linear systems---the best known bound on
27: the time required to compute $A^{-1}$
28: is $O (n^{2.376})$~\cite{CoppersmithWinograd} and
29: the representation of $A^{-1}$ typically requires
30: $\Omega (n^{2})$ space.
31: In contrast, if $A$ is symmetric and has $m$ non-zero entries, then
32: one can use the Conjugate Gradient method, as a direct method,
33: to solve for
34: $A^{-1} \bb$ in $O (nm)$ time and $O (n)$ space!
35: Until Vaidya's revolutionary introduction of
36: combinatorial preconditioners~\cite{Vaidya},
37: this was the best complexity bound for the solution
38: of general PSDDD systems.
39:
40: The two most popular families of methods for solving
41: linear systems are the direct methods and the
42: iterative methods.
43: Direct methods, such as Gaussian elimination,
44: perform arithmetic operations that produce $\xx$
45: treating the entries of $A$ and $\bb$ symbolically.
46: As discussed in Section~\ref{sec:direct}, direct methods
47: can be used to quickly compute $\xx$ if the matrix
48: $A$ has special topological structure.
49:
50: Iterative methods, which are discussed
51: in Section~\ref{sec:iterative},
52: compute successively better approximations
53: to $\xx$.
54: The Chebyshev and Conjugate Gradient methods take
55: time proportional to $m \sqrt{\kappa_{f} (A)} \log (\kappa_{f} (A) / \epsilon)$
56: to produce approximations to $\xx$ with relative error $\epsilon$,
57: where $\kappa_{f} (A)$ is the ratio of the largest to the smallest
58: non-zero eigenvalue of $A$.
59: These algorithms are improved by preconditioning---essentially solving
60: $B^{-1} A \xx = B^{-1} \bb $
61: for a \textit{preconditioner} $B$ that is carefully chosen
62: so that $\kappa_{f} (A, B)$ is small and
63: so that it is easy to solve
64: linear systems in $B$.
65: These systems in $B$ may be solved using direct methods,
66: or by again applying iterative methods.
67:
68: Vaidya~\cite{Vaidya} discovered that for PSDDD matrices $A$
69: one could use combinatorial techniques to construct matrices
70: $B$ that provably satisfy both criteria.
71: In his seminal work, Vaidya shows that
72: when $B$ corresponds to a subgraph of the graph
73: of $A$, one can bound
74: $\kappa_{f} (A, B)$ by bounding the dilation and congestion
75: of the best embedding of the graph of $A$ into the
76: graph of $B$.
77: By using preconditioners derived by
78: adding a few edges to maximum spanning trees, Vaidya's algorithm
79: finds $\epsilon$-approximate solutions to
80: PSDDD linear systems of maximum valence $d$ in
81: time $O ((d n)^{1.75} \log (\kappa_{f} (A) / \epsilon ))$.
82: \footnote{For the reader unaccustomed to condition numbers,
we note that for a PSDDD matrix $A$ in which each entry is
84: specified using $b$ bits of precision,
85: $\log (\kappa_{f} (A)) = O (b \log n)$.}
86: When these systems have special structure, such as having a
87: sparsity graph of bounded genus or avoiding certain minors,
88: he obtains even faster algorithms.
89: For example, his algorithm solves planar linear systems
90: in time $O ((d n)^{1.2} \log (\kappa_{f} (A) / \epsilon ))$.
91: This paper follows the outline established by Vaidya:
92: our contributions are improvements in the techniques
93: for bounding $\kappa_{f} (A,B)$, a construction of better
94: preconditioners, a construction that depends upon average
95: degree rather than maximum degree, and an analysis
96: of the recursive application of our algorithm.
97:
98: As Vaidya's paper was never published%
99: \footnote{
100: Vaidya founded the company
101: Computational Applications and System Integration
102: (http://www.casicorp.com)
103: to market his linear system solvers.},
104: and his manuscript lacked many
105: proofs, the task of formally working out his results fell to others.
106: Much of its
107: content appears in the thesis of his student, Anil Joshi~\cite{Joshi}.
Gremban, Miller and Zagha~\cite{Gremban,GrembanMillerZagha}
explain parts of Vaidya's paper and
extend Vaidya's techniques.
111: Among other results, they found ways of constructing preconditioners by
112: \textit{adding} vertices to the graphs and using separator trees.
113:
114:
115: Much of the theory behind the application of Vaidya's techniques
116: to matrices with non-positive off-diagonals
is developed in~\cite{SupportGraph}.
118: The machinery needed to apply Vaidya's techniques directly
119: to matrices with positive off-diagonal elements is developed
120: in~\cite{MWB}.
121: The present work builds upon an algebraic extension of the
122: tools used to prove bounds on $\kappa_{f} (A, B)$
123: by Boman and Hendrickson~\cite{SupportTheory}.
124: Boman and Hendrickson~\cite{BomanHendricksonAKPW}
125: have pointed out that by applying one of their bounds on
126: support to
127: the tree constructed by Alon, Karp, Peleg, and West \cite{AKPW}
128: for the $k$-server problem, one obtains
129: a spanning tree preconditioner $B$ with
130: $\kappa_{f} (A, B) = m 2^{\bigO{\sqrt{\log n\log\log n}}}$.
131: They thereby obtain a solver for
132: PSDDD systems that produces $\epsilon $-approximate solutions in
133: time $m^{1.5 + o (1)} \log (\kappa_{f} (A) / \epsilon )$.
134: In their manuscript, they asked whether one could possibly augment
135: this tree to obtain a better preconditioner.
136: We answer this question in the affirmative.
137: An algorithm running in time $O (m n^{1/2} \log^{2} (n))$
138: has also recently been obtained by Maggs,
\textit{et al.}~\cite{MaggsEtAl}.
140:
141: The present paper is the first to push past the $O (n^{1.5})$ barrier.
142: It is interesting to observe that this is exactly the point
143: at which one obtains sub-cubic time algorithms for solving
144: dense PSDDD linear systems.
145:
146: Reif~\cite{Reif} proved that by applying Vaidya's techniques
147: recursively, one can solve bounded-degree planar
148: positive definite diagonally dominant linear systems
149: to relative accuracy $\epsilon$ in time
150: $O (m^{1 + o (1)} \log (\kappa (A) / \epsilon ))$.
151: We extend this result to general planar PSDDD linear systems.
152:
153: Due to space limitations in the FOCS proceedings, some proofs have
154: been omitted.
155: These are being gradually included in the on-line version of the paper.
156:
157: \subsection{Background and Notation}
A symmetric matrix $A$ is positive semi-definite
159: if $x^{T} A x \geq 0$ for all vectors $x$.
160: This is equivalent to having all eigenvalues of $A$
161: non-negative.
162:
163: %We recall that
164: % vector $\uu \in\Reals{n}$ is an {\em eigenvector} of $A$ with
165: % eigenvalue $\lambda$ if $A\uu = \lambda \uu $.
166: %When $A$ is symmetric, all of its eigenvalues are
167: % real and one can form an orthonormal basis from its eigenvectors.
168: %A symmetric matrix $A$ is semi-positive definite, written $A \succeq 0$,
169: % if $x^{T} A x \geq 0$ for all vectors $x$.
170: %This is equivalent to saying that all eigenvalues of $A$
171: % are non-negative.
172:
173: In most of the paper, we will focus on Laplacian matrices:
174: symmetric
175: matrices with non-negative diagonals and non-positive off-diagonals
176: such that for all $i$, $\sum_{j} A_{i,j} = 0$.
177: However, our results will apply to the more general family
178: of positive semidefinite, diagonally dominant (PSDDD) matrices,
179: where a matrix is diagonally dominant if
$\abs{A_{i,i}} \geq \sum_{j \neq i} \abs{A_{i,j}}$ for all $i$.
181: We remark that a symmetric matrix is PSDDD if and only if
182: it is diagonally dominant and all of its diagonals are
183: non-negative.
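
As an illustration of this remark (a standard computation added here
for the reader's convenience, not one of the results of this paper),
write $x_{i}$ for the entries of a vector $\xx$ and let
$\mathrm{sgn} (t) \in \{-1, 0, 1\}$ denote the sign of a real number $t$.
For every symmetric matrix $A$,
\[
\xx^{T} A \xx
=
\sum_{i} \Bigl( A_{i,i} - \sum_{j \neq i} \abs{A_{i,j}} \Bigr) x_{i}^{2}
+
\sum_{i < j} \abs{A_{i,j}}
  \bigl( x_{i} + \mathrm{sgn} (A_{i,j}) x_{j} \bigr)^{2}.
\]
If $A$ is diagonally dominant with non-negative diagonals, then every
term on the right is non-negative, and so $A$ is positive semidefinite.
Conversely, taking $\xx$ to be the $i$th standard basis vector gives
$\xx^{T} A \xx = A_{i,i}$, so a positive semidefinite matrix
has non-negative diagonals.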
184:
185: In this paper, we will restrict our attention to the solution
186: of linear systems of the form $A \xx = \bb$
187: where $A$ is a PSDDD matrix.
188: When $A$ is non-singular, that is when $A^{-1}$ exists,
there exists a unique solution $\xx = A^{-1} \bb$ to the linear
190: system.
191: When $A$ is singular and symmetric,
192: for every $\bb \in \Span{A}$ there exists a unique
193: $\xx \in \Span{A}$ such that $A \xx = \bb$.
194: If $A$ is the Laplacian of a connected graph,
195: then the null space of $A$ is spanned by $\bvec{1}$.
196:
197: There are two natural ways to formulate the problem of finding
198: an approximate solution to a system $A \xx = \bb$.
199: A vector $\xxt$ has \textit{relative residual error} $\epsilon$
200: if $\norm{A \xxt - \bb} \leq \epsilon \norm{\bb }$.
201: We say that a solution $\xxt$ is an $\epsilon$-approximate
202: solution if it is at relative
203: distance at most $\epsilon$ from the actual
204: solution---that is, if
205: $\norm{\xx - \xxt } \leq \epsilon \norm{\xx }$.
One can relate these two notions of approximation by observing that
the relative distance of $\xxt$ to the solution and
the relative residual error differ by a multiplicative
factor of at most $\kappa_{f} (A)$.
210: We will focus our attention on the problem
211: of finding $\epsilon$-approximate solutions.
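
The relation between the two notions can be made explicit as follows;
this short derivation is a sketch added for illustration.
Assume that $A$ is symmetric and positive semi-definite,
that $\bb \neq \zzero$, and that $\xx$ and $\xxt$ lie in $\Span{A}$.
Write $\lambda_{max}$ and $\lambda_{min}$ (notation used only in this
paragraph) for the largest and smallest non-zero eigenvalues of $A$,
so that $\kappa_{f} (A) = \lambda_{max} / \lambda_{min}$.
Since $\xxt - \xx \in \Span{A}$ and $A \xx = \bb$,
\[
\lambda_{min} \norm{\xxt - \xx}
\leq \norm{A \xxt - \bb}
\leq \lambda_{max} \norm{\xxt - \xx}
\quad \mbox{and} \quad
\lambda_{min} \norm{\xx}
\leq \norm{\bb}
\leq \lambda_{max} \norm{\xx}.
\]
Dividing the first chain of inequalities by the second gives
\[
\frac{1}{\kappa_{f} (A)} \frac{\norm{A \xxt - \bb}}{\norm{\bb}}
\leq
\frac{\norm{\xxt - \xx}}{\norm{\xx}}
\leq
\kappa_{f} (A) \frac{\norm{A \xxt - \bb}}{\norm{\bb}}.
\]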
212:
213: The ratio $\kappa_{f} (A)$ is the finite condition number of $A$.
The $l_{2}$ norm of a matrix, $\norm{A}$, is the maximum of
$\norm{ A x} / \norm{x}$ over non-zero vectors $x$, and equals the largest eigenvalue
of $A$ in absolute value if $A$ is symmetric.
217: For non-symmetric matrices,
218: $\lambda_{max} (A)$ and $\norm{A}$ are typically different.
219: We let $|A|$ denote the number of non-zero entries in $A$, and
220: $\min (A)$ and $\max (A)$ denote the smallest and largest
221: non-zero elements of $A$ in absolute value, respectively.
222:
223: The condition number plays a prominent role in the analysis
224: of iterative linear system solvers.
225: When $A$ is PSD, it is known that, after
226: $\sqrt{\kappa_{f} (A)} \log (1/\epsilon )$ iterations,
227: the Chebyshev iterative method and the Conjugate Gradient method
228: produce solutions with relative residual error at most $\epsilon$.
229: To obtain an $\epsilon$-approximate solution, one need merely
230: run $\log (\kappa_{f} (A))$ times as many iterations.
231: If $A$ has $m$ non-zero entries, each of these iterations takes
232: time $O (m)$.
233: When applying the preconditioned versions of these algorithms
234: to solve systems of the form $B^{-1} A \xx = B^{-1} \bb $,
235: the number of iterations required by these algorithms
to produce an $\epsilon$-approximate solution is bounded
237: by
238: $\sqrt{\kappa_{f} (A, B)} \log (\kappa_{f} (A) /\epsilon ) $
239: where
240: \[
241: \kappa_{f} (A, B)
242: =
243: \left(\max_{\xx : A\xx \neq \zzero} \frac{ \xx^{T} A \xx}{\xx^T B \xx}
244: \right)
245: \left(\max_{\xx : A\xx \neq \zzero} \frac{ \xx^{T} B \xx}{\xx^T A \xx}
246: \right),
247: \]
248: for symmetric $A$ and $B$ with $\Span{A} = \Span{B}$.
249: However, each iteration of these methods takes time
250: $O (m)$ plus the time required to solve linear
251: systems in $B$.
252: In our initial algorithm, we will use direct methods to
253: solve these systems, and so will not have to worry about
254: approximate solutions.
255: For the recursive application of our algorithms, we will
256: use our algorithm again to solve these systems, and so will
257: have to determine how well we need to approximate the solution.
258: For this reason, we will analyze the Chebyshev iteration instead
259: of the Conjugate Gradient, as it is easier to analyze the impact
260: of approximation in the Chebyshev iterations.
261: However, we expect that similar results could be obtained for
262: the preconditioned Conjugate Gradient.
263: For more information on these methods, we refer the reader
264: to \cite{GolubVanLoanBook} or \cite{Bruaset}.
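
To make the definition of $\kappa_{f} (A, B)$ concrete, consider the
following small example, which is an illustration chosen here rather
than one taken from the analysis in later sections.
Let $A$ be the Laplacian of the triangle on three vertices with unit
weights, and let $B$ be the Laplacian of the spanning path obtained by
deleting the edge between the first and third vertices:
\[
A =
\left[
\begin{array}{rrr}
2 & -1 & -1\\
-1 & 2 & -1\\
-1 & -1 & 2
\end{array}
\right],
\qquad
B =
\left[
\begin{array}{rrr}
1 & -1 & 0\\
-1 & 2 & -1\\
0 & -1 & 1
\end{array}
\right].
\]
Both quadratic forms are unchanged when a multiple of $\bvec{1}$ is
added to $\xx$, so it suffices to consider $\xx$ orthogonal to $\bvec{1}$.
For such $\xx$ we have $A \xx = 3 \xx$, while the non-zero eigenvalues
of $B$ are $1$ and $3$.
Hence the first maximum in the definition equals $3/1 = 3$,
the second equals $3/3 = 1$, and $\kappa_{f} (A, B) = 3$.
As $\kappa_{f} (A) = 1$ in this example, the iteration bound above
evaluates to $\sqrt{3} \log (1 / \epsilon)$.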
265:
266: \subsection{Laplacians and Weighted Graphs}
267: All weighted graphs in this paper have
268: positive weights.
269: There is a natural isomorphism between weighted
270: graphs and Laplacian matrices:
271: given a weighted graph $G = (V, E, w)$, we can
272: form the Laplacian matrix in which
273: $A_{i,j} = -w (i,j)$ for $(i,j) \in E$,
274: and with diagonals determined by the condition
275: $A \bvec{1} = \bvec{0}$.
276: Conversely, a weighted graph is naturally associated
277: to each Laplacian matrix.
278: Each vertex of the graph corresponds to both a row and
279: column of the matrix, and we will often
280: abuse notation by identifying this row/column pair
281: with the associated vertex.
282:
283: We note that if $G_{1}$ and $G_{2}$ are weighted
284: graphs on the same vertex set with disjoint sets
285: of edges, then the Laplacian of the union of
286: $G_{1}$ and $G_{2}$ is the sum of their
287: Laplacians.
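
For example (with weights chosen here purely for illustration),
let $G_{1}$ and $G_{2}$ be graphs on the vertex set $\{1,2,3\}$,
where $G_{1}$ has the single edge $(1,2)$ of weight $2$ and
$G_{2}$ has the single edge $(2,3)$ of weight $3$.
Their Laplacians add up to the Laplacian of the weighted path
on three vertices:
\[
\left[
\begin{array}{rrr}
2 & -2 & 0\\
-2 & 2 & 0\\
0 & 0 & 0
\end{array}
\right]
+
\left[
\begin{array}{rrr}
0 & 0 & 0\\
0 & 3 & -3\\
0 & -3 & 3
\end{array}
\right]
=
\left[
\begin{array}{rrr}
2 & -2 & 0\\
-2 & 5 & -3\\
0 & -3 & 3
\end{array}
\right].
\]
Each off-diagonal entry is the negative of an edge weight, and each
row sums to zero, as required by $A \bvec{1} = \bvec{0}$.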
288:
289: \subsection{Reductions}\label{sec:reductions}
290:
291: In most of this paper we just consider
292: Laplacian matrices of connected graphs.
293: This simplification is enabled by two reductions.
294:
295: First, we note that it suffices to construct preconditioners
296: for matrices satisfying
$A_{i,i} = \sum_{j \neq i}\abs{A_{i,j}}$, for all $i$.
This follows from the observation in~\cite{SupportGraph}
that if $\tilde{A} = A + D$, where
$A$ satisfies the above condition and $D$ is a non-negative diagonal matrix, then
$\kappa _{f} (\tilde{A}, B + D) \leq \kappa _{f} (A,B)$.
302: So, it suffices to find a preconditioner after
303: subtracting off the maximal diagonal matrix that maintains
304: positive diagonal dominance.
305:
306: We then use an idea of Gremban~\cite{Gremban} for handling
307: positive off-diagonal entries.
308: If $A$ is a symmetric matrix such that for all $i$,
$A_{i,i} \geq \sum _{j \neq i} \abs{A_{i,j}}$,
310: then Gremban decomposes
311: $A$ into $D + A_{n} + A_{p}$, where
312: $D$ is the diagonal of $A$,
313: $A_{n}$ is the matrix containing all
negative off-diagonal entries of $A$,
315: and $A_{p}$ contains all the positive off-diagonals.
316: Gremban then considers the linear system
317: \[
318: \left[
319: \begin{array}{ll}
320: D + A_{n} & -A_{p}\\
321: -A_{p} & D + A_{n}
322: \end{array}
323: \right]
324: \left[
325: \begin{array}{l}
326: \xx\\
327: \xx'
328: \end{array}
329: \right]
330: =
331: \left[
332: \begin{array}{l}
333: \bb\\
334: -\bb
335: \end{array}
336: \right],
337: \]
338: and observes that its solution will have
339: $\xx' = -\xx$ and that
340: $\xx$ will be the solution to
341: $A \xx = \bb $.
Thus, by making this transformation,
we can convert any PSDDD linear
system into one with
non-positive off-diagonals.
346: One can understand this transformation as
347: making two copies of every vertex in the graph,
348: and two copies of every edge.
349: The edges corresponding to negative off-diagonals
350: connect nodes in the same copy of the graph,
351: while the others cross copies.
352: To capture the resulting family of graphs, we
353: define a weighted graph $G$ to be a
354: \textit{Gremban cover}
355: if it has $2n$ vertices and
356: \begin{itemize}
357: \item for $i,j \leq n$,
358: $(i,j) \in E$ if and only if
359: $(i+n, j+n) \in E$, and
360: $w (i,j) = w (i+n, j+n)$,
361: \item for $i,j \leq n$,
362: $(i,j+n) \in E$ if and only if
363: $(i+n, j) \in E$, and
364: $w (i,j+n) = w (i+n, j)$, and
365: \item the graph contains no edge
366: of the form $(i, i+n)$.
367: \end{itemize}
368: When necessary,
369: we will explain how to modify our arguments
370: to handle Laplacians that are Gremban covers.
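
For a concrete instance of Gremban's transformation (an example
constructed here for illustration), take
\[
A =
\left[
\begin{array}{rr}
2 & 1\\
1 & 2
\end{array}
\right]
\quad \mbox{and} \quad
\bb =
\left[
\begin{array}{r}
3\\
3
\end{array}
\right],
\]
so that $D$ is twice the identity, $A_{n}$ is zero,
$A_{p}$ has ones off the diagonal and zeros on it,
and the solution of $A \xx = \bb$ is $\xx = (1, 1)^{T}$.
The enlarged system reads
\[
\left[
\begin{array}{rrrr}
2 & 0 & 0 & -1\\
0 & 2 & -1 & 0\\
0 & -1 & 2 & 0\\
-1 & 0 & 0 & 2
\end{array}
\right]
\left[
\begin{array}{r}
1\\
1\\
-1\\
-1
\end{array}
\right]
=
\left[
\begin{array}{r}
3\\
3\\
-3\\
-3
\end{array}
\right],
\]
so its solution indeed has the form $(\xx , -\xx )$.
Every off-diagonal entry of the enlarged matrix is $0$ or $-1$,
and the enlarged matrix is again diagonally dominant.
The two non-zero off-diagonal pairs connect a vertex in one copy to a
vertex in the other copy, matching the description of the edges that cross copies.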
371:
Finally, if $A$ is the Laplacian of a disconnected
373: graph, then the blocks corresponding to the connected
374: components may be solved independently.
375:
376:
377: \subsection{Direct Methods}\label{sec:direct}
378: The standard direct method for solving symmetric linear systems
379: is Cholesky factorization.
380: Those unfamiliar with Cholesky factorization should think of
381: it as Gaussian elimination in which one
382: simultaneously eliminates on rows and columns so as to preserve
383: symmetry.
384: Given a permutation matrix $P$,
385: Cholesky factorization produces a lower-triangular matrix
386: $L$ such that $L L^{T} = P A P^{T}$.
387: Because one can use forward and back substitution to
388: multiply vectors by $L^{-1}$ and $L^{-T}$
389: in time proportional to the
390: number of non-zero entries in $L$,
391: one can use the Cholesky factorization of $A$
392: to solve the system
393: $A \xx = \bb $ in time $O (\sizeof{L})$.
394:
395: Each pivot in the factorization comes from the diagonal
396: of $A$, and one should understand the
397: permutation $P$ as providing
398: the order in which these pivots are chosen.
399: Many heuristics exist for producing permutations $P$
400: for which the number of non-zeros in $L$ is small.
401: If the graph of $A$ is a tree, then a permutation
402: $P$ that orders the vertices of $A$ from the leaves up
403: will result in an $L$ with at most $2n-1$ non-zero entries.
404: In this work, we will use results concerning matrices
405: whose sparsity graphs resemble trees with a few additional
406: edges and whose graphs have small separators, which
407: we now review.
408:
409: If $B$ is the Laplacian matrix of a weighted graph
410: $(V,E,w)$, and one eliminates a vertex $a$
411: of degree $1$, then the remaining matrix
412: has the form
413: \[
\left[
\begin{array}{ll}
1 & 0\\
0 & A_{1}
\end{array}
\right],
420: \]
421: where $A_{1}$ is the Laplacian of the graph in which
422: $a$ and its attached edge have been removed.
423: Similarly, if a vertex $a$ of degree $2$ is eliminated,
424: then the remaining matrix is the Laplacian of the
425: graph in which the vertex $a$ and its adjacent edges
426: have been removed, and an edge
427: with weight $1/ (1/w_{1} + 1/w_{2})$
428: is added between the
429: two neighbors of $a$,
430: where $w_{1}$ and $w_{2}$ are the weights of the edges
431: connecting $a$ to its neighbors.
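
The weight of the new edge can be verified by a direct computation,
sketched here for illustration.
Let $u$ and $v$ (names introduced only for this paragraph) be the two
neighbors of $a$, and order the vertices as $a, u, v$.
The edges incident to $a$ contribute
\[
\left[
\begin{array}{ccc}
w_{1} + w_{2} & -w_{1} & -w_{2}\\
-w_{1} & w_{1} & 0\\
-w_{2} & 0 & w_{2}
\end{array}
\right]
\]
to the Laplacian.
Eliminating $a$ with the pivot $w_{1} + w_{2}$ replaces the
lower-right block by
\[
\left[
\begin{array}{cc}
w_{1} & 0\\
0 & w_{2}
\end{array}
\right]
-
\frac{1}{w_{1} + w_{2}}
\left[
\begin{array}{cc}
w_{1}^{2} & w_{1} w_{2}\\
w_{1} w_{2} & w_{2}^{2}
\end{array}
\right]
=
\frac{w_{1} w_{2}}{w_{1} + w_{2}}
\left[
\begin{array}{rr}
1 & -1\\
-1 & 1
\end{array}
\right],
\]
which is the Laplacian of a single edge of weight
$w_{1} w_{2} / (w_{1} + w_{2}) = 1/ (1/w_{1} + 1/w_{2})$.
The entries corresponding to other edges at $u$ and $v$ are unchanged,
because $a$ is adjacent to no other vertex.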
432:
433: Given a graph $G$
434: with edge set $E = R \cup S$, where the edges
435: in $R$ form a tree,
436: we will perform a partial Cholesky factorization
437: of $G$ in which we successively eliminate all the degree 1 and 2
vertices that are not endpoints of edges in $S$.
We introduce the algorithm \texttt{trim} to define the order
in which the vertices should be eliminated, and we call the order
in which \texttt{trim} deletes vertices the \emph{trim order}.
442: \begin{trivlist}
443: \item []
444: \noindent {\bf Algorithm:} \texttt{trim$(V,R,S)$}
445: \begin{enumerate}
446: \item While $G$ contains a vertex of degree one
447: that is not an endpoint of an edge in $S$,
448: remove that vertex and its adjacent edge.
449: \item While $G$ contains a vertex of degree two
450: that is not an endpoint of an edge in $S$,
451: remove that vertex and its adjacent edges, and add
452: an edge between its two neighbors.
453: \end{enumerate}
454: \end{trivlist}
455:
456: \begin{proposition}\label{pro:trim}
457: The output of \texttt{trim} is a graph
458: with at most $4 \sizeof{S}$ vertices
459: and $5 \sizeof{S}$ edges.
460: \end{proposition}
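
As a small illustration of \texttt{trim} (the graph below is a
hypothetical example chosen here), let $V = \{1, \ldots , 8\}$,
let $R$ consist of the path edges $(i, i+1)$ for $1 \leq i \leq 6$
together with the edge $(4,8)$, and let $S = \{(1,5), (3,7)\}$,
all with unit weights.
The only vertex of degree one is $8$, which is not an endpoint of an
edge in $S$, so the first step removes it.
The vertices of degree two that are not endpoints of edges in $S$
are then $2$, $4$ and $6$; the second step removes them and adds the
edges $(1,3)$, $(3,5)$ and $(5,7)$, each of weight
$1/ (1/1 + 1/1) = 1/2$ by the rule for eliminating a vertex of degree two.
One valid trim order is therefore $8, 2, 4, 6$, and the output graph
has vertex set $\{1, 3, 5, 7\}$ and $5$ edges, within the bounds
$4 \sizeof{S} = 8$ and $5 \sizeof{S} = 10$ of
Proposition~\ref{pro:trim}.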
461:
462: \begin{remark}
463: If $(V,R)$ and $(V,S)$ are Gremban covers,
464: then we can implement \texttt{trim} so
465: that the output graph is also a Gremban cover.
466: Moreover, the genus and maximum size clique minor
467: of the output graph do not increase.
468: \end{remark}
469:
470: After performing partial Cholesky factorization
471: of the vertices in the trim order, one obtains
472: a factorization of the form
473: \[
474: B = L C
475: L^{T},
476: \mbox{where $C = $ }
477: \left[
478: \begin{array}{ll}
479: I & 0\\
480: 0 & A_{1}
481: \end{array}
482: \right],
483: \]
484: $L$ is lower triangular,
and the left and right columns in the above
representation correspond to the eliminated
and remaining vertices,
respectively.
489: Moreover, $\sizeof{L} \leq 2n-1$, and
490: this Cholesky factorization may be performed in
491: time $O (n + \sizeof{S})$.
492:
493: %After performing partial Cholesky factorization
494: % of the vertices in the trim order, one obtains
495: % a factorization of the form
496: %\[
497: %A = L \left[
498: % \begin{array}{ll}
499: % I & 0\\
500: % 0 & B
501: % \end{array}
502: % \right]
503: % L^{T},
504: %\mbox{ where $L$ has the form}
505: %\left[\begin{array}{ll}
506: % L_{1,1} & 0\\
507: % L_{2,1} & I
508: %\end{array}
509: % \right], \]
510: %$L_{1,1}$ is lower triangular,
511: % and the left column and right columns in the above
512: % representations correspond to the eliminated
513: % and remaining vertices
514: % respectively.
515: %Moreover, $\sizeof{L} \leq 2n-1$.
516:
The following lemma may be proved by induction.
518: \begin{lemma}\label{lem:partialCholesky}
519: Let $B$ be a Laplacian matrix and let
520: $L$ and $A_{1}$ be the matrices arising from
521: the partial Cholesky factorization of $B$
522: according to the trim order.
523: Let $U$ be the set of eliminated vertices,
524: and let $W$ be the set of remaining vertices.
525: For each pair of vertices $(a,b)$ in $W$ joined by a simple
526: path containing only vertices of $U$, let
527: $B_{(a,b)}$ be the Laplacian of the graph containing
528: just one edge between $a$ and $b$ of weight
529: $1/(\sum_{i} 1/w_{i})$, where
530: the $w_{i}$ are the weights on the path between
531: $a$ and $b$.
532: Then,
533:
534: \begin{itemize}
535: \item [$(a)$]
536: the matrix $A_{1}$ is the sum of the Laplacian of the
537: induced graph
538: on $W$ and
the sum of all the Laplacians $B_{(a,b)}$,
540: \item [$(b)$]
541: $\norm{A_{1}} \leq \norm{B}$,
542: $\lambda_{2} (A_{1}) \geq \lambda_{2} (B)$, and so
543: $\kappa _{f} (A_{1}) \leq \kappa _{f} (B)$.
544: \end{itemize}
545: \end{lemma}
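
The lemma can be checked by hand on a small example, which is given
here only as an illustration.
Let $R$ consist of the edges $(a,u)$ and $(u,b)$, each of weight $1$,
and let $S$ consist of the edge $(a,b)$ of weight $2$.
Then \texttt{trim} eliminates only $u$, so $U = \{u \}$ and
$W = \{a, b \}$.
Ordering the vertices as $a, b, u$ and eliminating $u$ gives
\[
B =
\left[
\begin{array}{rrr}
3 & -2 & -1\\
-2 & 3 & -1\\
-1 & -1 & 2
\end{array}
\right]
\quad \mbox{and} \quad
A_{1} =
\left[
\begin{array}{rr}
3 & -2\\
-2 & 3
\end{array}
\right]
-
\frac{1}{2}
\left[
\begin{array}{rr}
1 & 1\\
1 & 1
\end{array}
\right]
=
\frac{5}{2}
\left[
\begin{array}{rr}
1 & -1\\
-1 & 1
\end{array}
\right].
\]
In agreement with part $(a)$, the weight $5/2$ is the sum of the
weight $2$ of the edge induced on $W$ and the weight
$1/ (1/1 + 1/1) = 1/2$ of $B_{(a,b)}$.
For part $(b)$, the eigenvalues of $B$ are $0$, $3$ and $5$, and those
of $A_{1}$ are $0$ and $5$, so
$\norm{A_{1}} = 5 \leq 5 = \norm{B}$,
$\lambda_{2} (A_{1}) = 5 \geq 3 = \lambda_{2} (B)$, and
$\kappa_{f} (A_{1}) = 1 \leq 5/3 = \kappa_{f} (B)$.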
546:
547:
548:
549: Other topological structures may be exploited
550: to produce elimination orderings
551: that result in sparse $L$.
552: In particular, Lipton, Rose and Tarjan~\cite{LiptonRoseTarjan}
553: prove that if the sparsity graph is
554: planar, then one can find such an $L$ with at most
555: $O (n \log n)$ non-zero entries in time $O (n^{3/2})$.
556: In general, Lipton, Rose and Tarjan prove that if
557: a graph can be dissected by a family of small separators,
558: then $L$ can be made sparse.
559: The precise definition and theorem follow.
560:
561: \begin{definition}\label{def:separator}
562: A subset of vertices $C$ of a graph $G= (V,E)$ with $n$ vertices is an
563: $f (n)$-separator if
564: $\sizeof{C}\leq f (n)$, and the vertices of $V - C$
565: can be partitioned into two sets $U$ and $W$ such that there are
566: no edges from $U$ to $W$, and $\sizeof{U},\sizeof{W}\leq 2n/3 $.
567: \end{definition}
568:
569: \begin{definition}\label{def:familyseparator}
570: Let $f ()$ be a positive function.
571: A graph $G = (V,E)$ with $n$ vertices has a family of $f ()$-separators
572: if for every $s \leq n$, every subgraph $G' \subseteq G$ with $s$ vertices
has an $f (s)$-separator.
574: \end{definition}
575:
576:
577: \begin{theorem}[Nested Dissection: Lipton-Rose-Tarjan]
578: \label{thm:nestesdissection}
579: Let $A$ be an $n$ by $n$ symmetric PSD matrix, $\alpha > 0$ be a
580: constant, and $h (n)$ be a positive function of $n$.
581: Let $f (x) = h (n) x^{\alpha }$.
If $G (A)$ has a family of $f ()$-separators, then
583: the Nested Dissection Algorithm of Lipton, Rose and Tarjan can,
584: in $\bigO{n + (h (n) n^{\alpha })^{3}}$ time,
585: factor $A$ into $A = LL^{T}$
586: so that $L$ has at most $\bigO{(h (n)n^{\alpha })^{2}\log n}$
587: non-zeros.
588: \end{theorem}
589:
590: To apply this theorem, we note that many families of graphs
591: are known to have families of small separators.
592: Gilbert, Hutchinson, and Tarjan
593: \cite{GilbertHutchinsonTarjan} show
594: that all graphs of $n$ vertices
595: with genus bounded by $g$ have a family of $O (\sqrt{gn})$-separators,
596: and Plotkin, Rao and Smith \cite{PlotkinRaoSmith}
show that any graph that excludes
$K_{s}$ as a minor has a family of $O (s\sqrt{n\log n})$-separators.
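
As a sketch of how Theorem~\ref{thm:nestesdissection} is instantiated
(the constants hidden in the separator bounds are treated here as part
of $h (n)$, which is an assumption of this illustration), consider three cases.
For planar graphs, the planar separator theorem of Lipton and Tarjan
provides a family of $O (\sqrt{n})$-separators, so one may take
$h (n) = O (1)$ and $\alpha = 1/2$; the theorem then gives a
factorization in time $\bigO{n + n^{3/2}} = \bigO{n^{3/2}}$ with
$\bigO{n \log n}$ non-zeros, which recovers the planar bound quoted
before the theorem.
For graphs of genus at most $g$, taking $h (n) = O (\sqrt{g})$ and
$\alpha = 1/2$ gives
\[
\mbox{time }
\bigO{n + (g n)^{3/2}}
\quad \mbox{and} \quad
\bigO{g n \log n}
\mbox{ non-zeros.}
\]
For graphs that exclude $K_{s}$ as a minor, taking
$h (n) = O (s \sqrt{\log n})$ and $\alpha = 1/2$ gives
\[
\mbox{time }
\bigO{n + s^{3} (n \log n)^{3/2}}
\quad \mbox{and} \quad
\bigO{s^{2} n \log^{2} n}
\mbox{ non-zeros.}
\]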
599:
600:
601: \subsection{Iterative Methods}\label{sec:iterative}
602: Iterative methods such as
603: Chebyshev iteration and Conjugate Gradient
604: solve systems such as
605: $A \xx = \bb $
606: by successively multiplying vectors
607: by the matrix $A$, and
608: then taking linear combinations of
609: vectors that have been produced so far.
610: The preconditioned versions of these
611: iterative methods take as input
612: another matrix $B$, called the
613: \textit{preconditioner},
614: and also perform
615: the operation of solving linear systems
616: in $B$.
617: In this paper, we will restrict our attention to
618: the preconditioned Chebyshev method as it
619: is easier to understand the effect of
620: imprecision in the solution of the systems in
621: $B$ on the method's output.
622: In the non-recursive version of our algorithms,
623: we will exploit the standard analysis
624: of Chebyshev iteration (see~\cite{Bruaset}), adapted
625: to our situation:
626:
627: \begin{theorem}[Preconditioned Chebyshev]\label{thm:cheby}
628: Let $A$ and $B$ be Laplacian matrices, let $\bb $
629: be a vector, and let $\xx$ satisfy $A \xx = \bb$.
630: At each iteration, the preconditioned Chebyshev method
631: multiplies one vector by $A$, solves one linear system
632: in $B$, and performs a constant number of vector additions.
633: At the $k$th iteration, the algorithm maintains a solution
634: $\xxt$ satisfying
635: \[
636: \norm{(\xxt - \xx)}
637: \leq
638: e^{-k / \sqrt{\kappa_{f} (A,B)}}
639: \kappa_{f} (A) \sqrt{\kappa_{f} (B)} \norm{\xx}.
640: \]
641: \end{theorem}
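
The number of iterations needed for an $\epsilon$-approximate solution
can be read off from this bound; the following one-line calculation is
included as an illustration.
The right-hand side is at most $\epsilon \norm{\xx}$ exactly when
$e^{-k / \sqrt{\kappa_{f} (A,B)}} \leq
 \epsilon / (\kappa_{f} (A) \sqrt{\kappa_{f} (B)})$,
that is, when
\[
k
\geq
\sqrt{\kappa_{f} (A,B)}
\ln \left( \frac{\kappa_{f} (A) \sqrt{\kappa_{f} (B)}}{\epsilon} \right).
\]
Thus
$\bigO{\sqrt{\kappa_{f} (A,B)} \log (\kappa_{f} (A) \kappa_{f} (B) / \epsilon )}$
iterations suffice.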
642:
643: In the non-recursive versions of our algorithms,
644: we will pre-compute the Cholesky factorization of
645: the preconditioners $B$,
646: and use these to solve the linear systems encountered
by the preconditioned Chebyshev method.
In the recursive versions, we will perform a
partial Cholesky factorization of $B$,
into a matrix of the form $L [I, 0; 0, A_{1}] L^{T}$,
construct a preconditioner for $A_{1}$, and again use
the preconditioned Chebyshev method to solve
the systems in $A_{1}$.
654:
655:
656:
657: %\subsection{Our contributions}
658:
659:
660:
661:
662: % Local Variables: ***
663: % TeX-master:"post.tex" ***
664: % End: ***
665:
666:
667: