
1: \section{Introduction}

2:

3: %%

4: %% changed D to S as appropriate

5: %%

6:

7: Sparse linear systems are ubiquitous in scientific computing

8:   and optimization.

9: In this work, we develop fast algorithms for solving

10:   some of the best-behaved linear systems: those specified

11:   by symmetric, diagonally dominant matrices

12:   with positive diagonals.

13: We call such matrices PSDDD as they are positive semi-definite

14:   and diagonally dominant.

15: Such systems arise in the solution

16:   of certain elliptic differential equations via

17:   the finite element method,

18:   the modeling of resistive networks,

19:   and in the

20:   solution of certain network optimization

21:   problems~\cite{StrangFix,Multigrid,Iterative,Iterative2,Iterative3}.

22:

23: While one is often taught to solve a linear system $A \xx = \bb $

24:   by computing $A^{-1}$ and then multiplying $A^{-1}$ by $\bb$,

25:   this approach is quite inefficient for

26:   sparse linear systems---the best known bound on

27:   the time required to compute $A^{-1}$

28:   is $O (n^{2.376})$~\cite{CoppersmithWinograd} and

29:   the representation of $A^{-1}$ typically requires

30:   $\Omega (n^{2})$ space.

31: In contrast, if $A$ is symmetric and has $m$ non-zero entries, then

32:   one can use the Conjugate Gradient method, as a direct method,

33:    to solve for

34:   $A^{-1} \bb$ in $O (nm)$ time and $O (n)$ space!

35: Until Vaidya's revolutionary introduction of

36:   combinatorial preconditioners~\cite{Vaidya},

37:   this was the best complexity bound for the solution

38:   of general PSDDD systems.

39:

40: The two most popular families of methods for solving

41:   linear systems are the direct methods and the

42:   iterative methods.

43: Direct methods, such as Gaussian elimination,

44:   perform arithmetic operations that produce $\xx$

45:   treating the entries of $A$ and $\bb$ symbolically.

46: As discussed in Section~\ref{sec:direct}, direct methods

47:   can be used to quickly compute $\xx$ if the matrix

48:   $A$ has special topological structure.

49:

50: Iterative methods, which are discussed

51:   in Section~\ref{sec:iterative},

52:   compute successively better approximations

53:   to $\xx$.

54: The Chebyshev and Conjugate Gradient methods take

55:   time proportional to $m \sqrt{\kappa_{f} (A)} \log (\kappa_{f} (A) / \epsilon)$

56:   to produce approximations to $\xx$ with relative error $\epsilon$,

57:  where $\kappa_{f} (A)$ is the ratio of the largest to the smallest

58:   non-zero eigenvalue of $A$.

59: These algorithms are improved by preconditioning---essentially solving

60:   $B^{-1} A \xx = B^{-1} \bb $

61:   for a \textit{preconditioner} $B$ that is carefully chosen

62:   so that $\kappa_{f} (A, B)$ is small and

63:   so that it is easy to solve

64:   linear systems in $B$.

65: These systems in $B$ may be solved using direct methods,

66:   or by again applying iterative methods.

67:

68: Vaidya~\cite{Vaidya} discovered that for PSDDD matrices $A$

69:   one could use combinatorial techniques to construct matrices

70:   $B$ that provably satisfy both criteria.

71: In his seminal work, Vaidya shows that

72:   when $B$ corresponds to a subgraph of the graph

73:   of $A$,  one can bound

74:   $\kappa_{f} (A, B)$ by bounding the dilation and congestion

75:   of the best embedding of the graph of $A$ into the

76:   graph of $B$.

77: By using preconditioners derived by

78:   adding a few edges to maximum spanning trees, Vaidya's algorithm

79:   finds $\epsilon$-approximate solutions to

80:   PSDDD linear systems of maximum valence $d$ in

81:   time $O ((d n)^{1.75} \log (\kappa_{f} (A) / \epsilon ))$.

82: \footnote{For the reader unaccustomed to condition numbers,

  we note that for a PSDDD matrix $A$ in which each entry is

84:   specified using $b$ bits of precision,

85:   $\log (\kappa_{f} (A)) = O (b \log n)$.}

86: When these systems have special structure, such as having a

87:   sparsity graph of bounded genus or avoiding certain minors,

88:   he obtains even faster algorithms.

89: For example, his algorithm solves planar linear systems

90:   in time $O ((d n)^{1.2} \log (\kappa_{f} (A) / \epsilon ))$.

91: This paper follows the outline established by Vaidya:

92:   our contributions are improvements in the techniques

93:   for bounding $\kappa_{f} (A,B)$, a construction of better

94:   preconditioners, a construction that depends upon average

95:   degree rather than maximum degree, and an analysis

96:   of the recursive application of our algorithm.

97:

98: As Vaidya's paper was never published%

99: \footnote{

100: Vaidya founded the company

101:   Computational Applications and System Integration

102:   (http://www.casicorp.com)

103:  to market his linear system solvers.},

104: and his manuscript lacked many

105:   proofs, the task of formally working out his results fell to others.

106: Much of its

107:   content appears in the thesis of his student, Anil Joshi~\cite{Joshi}.

Gremban, Miller, and Zagha~\cite{Gremban,GrembanMillerZagha}
  explain parts of Vaidya's paper and
  extend Vaidya's techniques.

111: Among other results, they found ways of constructing preconditioners by

112:   \textit{adding} vertices to the graphs and using separator trees.

113:

114:

115: Much of the theory behind the application of Vaidya's techniques

116:   to matrices with non-positive off-diagonals

  is developed in~\cite{SupportGraph}.

118: The machinery needed to apply Vaidya's techniques directly

119:   to matrices with positive off-diagonal elements is developed

120:   in~\cite{MWB}.

121: The present work builds upon an algebraic extension of the

122:   tools used to prove bounds on $\kappa_{f} (A, B)$

123:   by Boman and Hendrickson~\cite{SupportTheory}.

124: Boman and Hendrickson~\cite{BomanHendricksonAKPW}

125:   have pointed out that by applying one of their bounds on

126:   support to

127:   the tree constructed by Alon, Karp, Peleg, and West \cite{AKPW}

128:   for the $k$-server problem, one obtains

129:   a spanning tree preconditioner $B$ with

130:   $\kappa_{f} (A, B) = m 2^{\bigO{\sqrt{\log n\log\log n}}}$.

131: They thereby obtain a solver for

132:   PSDDD systems that produces $\epsilon $-approximate solutions in

133:   time $m^{1.5 + o (1)} \log (\kappa_{f} (A) / \epsilon )$.

134: In their manuscript, they asked whether one could possibly augment

135:   this tree to obtain a better preconditioner.

136: We answer this question in the affirmative.

137: An algorithm running in time $O (m n^{1/2} \log^{2} (n))$

138:   has also recently been obtained by Maggs,

  \textit{et al.}~\cite{MaggsEtAl}.

140:

141: The present paper is the first to push past the $O (n^{1.5})$ barrier.

142: It is interesting to observe that this is exactly the point

143:   at which one obtains sub-cubic time algorithms for solving

144:   dense PSDDD linear systems.

145:

146: Reif~\cite{Reif} proved that by applying Vaidya's techniques

147:   recursively, one can solve bounded-degree planar

148:   positive definite diagonally dominant linear systems

149:   to relative accuracy $\epsilon$ in time

150:   $O (m^{1 + o (1)} \log (\kappa (A) / \epsilon ))$.

151: We extend this result to general planar PSDDD linear systems.

152:

153: Due to space limitations in the FOCS proceedings, some proofs have

154:   been omitted.

155: These are being gradually included in the on-line version of the paper.

156:

157: \subsection{Background and Notation}

A symmetric matrix $A$ is positive semi-definite

159:   if $x^{T} A x \geq 0$ for all vectors $x$.

160: This is equivalent to having all eigenvalues of $A$

161:   non-negative.

162:

163: %We recall that

164: %  vector $\uu \in\Reals{n}$ is an {\em eigenvector} of $A$ with

165: %  eigenvalue $\lambda$ if $A\uu = \lambda \uu $.

166: %When $A$ is symmetric, all of its eigenvalues are

167: %  real and one can form an orthonormal basis from its eigenvectors.

168: %A symmetric matrix $A$ is semi-positive definite, written $A \succeq 0$,

169: %  if $x^{T} A x \geq 0$ for all vectors $x$.

170: %This is equivalent to saying that all eigenvalues of $A$

171: %  are non-negative.

172:

173: In most of the paper, we will focus on Laplacian matrices:

174:   symmetric

175:   matrices with non-negative diagonals and non-positive off-diagonals

176:   such that for all $i$,  $\sum_{j} A_{i,j} = 0$.

177: However, our results will apply to the more general family

178:   of positive semidefinite, diagonally dominant (PSDDD) matrices,

179:   where a matrix is diagonally dominant if

  $\abs{A_{i,i}} \geq \sum_{j \neq i} \abs{A_{i,j}}$ for all $i$.

181: We remark that a symmetric matrix is PSDDD if and only if

182:   it is diagonally dominant and all of its diagonals are

183:   non-negative.

184:

185: In this paper, we will restrict our attention to the solution

186:   of linear systems of the form $A \xx = \bb$

187:   where $A$ is a PSDDD matrix.

188: When $A$ is non-singular, that is when $A^{-1}$ exists,

  there exists a unique solution $\xx = A^{-1}\bb $ to the linear

190:   system.

191: When $A$ is singular and symmetric,

192:   for every $\bb  \in \Span{A}$ there exists a unique

193:   $\xx \in \Span{A}$ such that $A \xx = \bb$.

194: If $A$ is the Laplacian of a connected graph,

195:   then the null space of $A$ is spanned by $\bvec{1}$.

196:

197: There are two natural ways to formulate the problem of finding

198:   an approximate solution to a system $A \xx = \bb$.

199: A vector $\xxt$ has \textit{relative residual error} $\epsilon$

200:   if $\norm{A \xxt - \bb} \leq \epsilon \norm{\bb }$.

201: We say that a solution $\xxt$ is an $\epsilon$-approximate

202:   solution if it is at relative

203:   distance at most $\epsilon$ from the actual

204:   solution---that is, if

205:   $\norm{\xx - \xxt  } \leq \epsilon \norm{\xx }$.

206: One can relate these two notions of approximation by observing that

  the relative distance of $\xxt$ to the solution and

208:   the relative residual error differ by a multiplicative

209:   factor of at most $\kappa_{f} (A)$.

210: We will focus our attention on the problem

211:   of finding $\epsilon$-approximate solutions.
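
To make the relation between the two notions concrete, we sketch the
  calculation under the assumption (made here only for illustration)
  that $\xx - \xxt \in \Span{A}$.
Let $\lambda_{\max}$ and $\lambda_{\min}$ denote the largest and smallest
  non-zero eigenvalues of the PSD matrix $A$, so that
  $\kappa_{f} (A) = \lambda_{\max} / \lambda_{\min}$.
Since $A (\xx - \xxt) = \bb - A \xxt$ and $\bb = A \xx$, we have
\[
  \norm{\xx - \xxt} \leq \frac{\norm{A \xxt - \bb}}{\lambda_{\min}}
  \quad \mbox{and} \quad
  \norm{\bb} \leq \lambda_{\max} \norm{\xx}.
\]
Hence, if $\xxt$ has relative residual error $\epsilon$, then
\[
  \norm{\xx - \xxt}
  \leq \frac{\epsilon \norm{\bb}}{\lambda_{\min}}
  \leq \epsilon \, \kappa_{f} (A) \norm{\xx}.
\]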

212:

213: The ratio $\kappa_{f} (A)$ is the finite condition number of $A$.

The $l_{2}$ norm of a matrix, $\norm{A}$, is the maximum of
  $\norm{ A x} / \norm{x}$ over non-zero vectors $x$, and equals the largest
  absolute value of an eigenvalue of $A$ if $A$ is symmetric.

217: For non-symmetric matrices,

218:   $\lambda_{max} (A)$ and $\norm{A}$ are typically different.

219: We let $|A|$ denote the number of non-zero entries in $A$, and

220:   $\min (A)$ and $\max (A)$ denote the smallest and largest

221:   non-zero elements of $A$ in absolute value, respectively.
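
As a small illustration of this notation (the example is ours and is not
  used later), let $A$ be the Laplacian of the path on three vertices
  with unit weights:
\[
  A =
  \left[
  \begin{array}{rrr}
   1 & -1 & 0\\
  -1 & 2 & -1\\
   0 & -1 & 1
  \end{array}
  \right].
\]
The eigenvalues of $A$ are $0$, $1$ and $3$, so
  $\norm{A} = 3$ and $\kappa_{f} (A) = 3/1 = 3$.
Moreover, $|A| = 7$, $\min (A) = 1$ and $\max (A) = 2$.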

222:

223: The condition number plays a prominent role in the analysis

224:   of iterative linear system solvers.

225: When $A$ is PSD, it is known that, after

226:   $\sqrt{\kappa_{f} (A)} \log (1/\epsilon )$ iterations,

227:   the Chebyshev iterative method and the Conjugate Gradient method

228:   produce solutions with relative residual error at most $\epsilon$.

229: To obtain an $\epsilon$-approximate solution, one need merely

230:   run $\log (\kappa_{f} (A))$ times as many iterations.

231: If $A$ has $m$ non-zero entries, each of these iterations takes

232:   time $O (m)$.

233: When applying the preconditioned versions of these algorithms

234:   to solve systems of the form $B^{-1} A \xx = B^{-1} \bb $,

235:   the number of iterations required by these algorithms

236:   to produce an $\epsilon$-accurate solution is bounded

237:   by

238:   $\sqrt{\kappa_{f} (A, B)} \log (\kappa_{f} (A) /\epsilon ) $

239:   where

240: \[

241:   \kappa_{f} (A, B)

242: =

243: \left(\max_{\xx : A\xx \neq \zzero} \frac{ \xx^{T} A \xx}{\xx^T B \xx}

244:  \right)

245: \left(\max_{\xx : A\xx \neq \zzero} \frac{ \xx^{T} B \xx}{\xx^T A \xx}

246:  \right),

247: \]

248: for symmetric $A$ and $B$ with $\Span{A} = \Span{B}$.

249: However, each iteration of these methods takes time

250:   $O (m)$ plus the time required to solve linear

251:   systems in $B$.

252: In our initial algorithm, we will use direct methods to

253:   solve these systems, and so will not have to worry about

254:   approximate solutions.

255: For the recursive application of our algorithms, we will

256:   use our algorithm again to solve these systems, and so will

257:   have to determine how well we need to approximate the solution.

258: For this reason, we will analyze the Chebyshev iteration instead

259:   of the Conjugate Gradient, as it is easier to analyze the impact

260:   of approximation in the Chebyshev iterations.

261: However, we expect that similar results could be obtained for

262:   the preconditioned Conjugate Gradient.

263: For more information on these methods, we refer the reader

264:   to \cite{GolubVanLoanBook} or \cite{Bruaset}.
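
For a concrete instance of $\kappa_{f} (A,B)$ (an illustrative example
  of ours), let $A$ be the Laplacian of the triangle on vertices
  $\{1,2,3\}$ with unit weights, and let $B$ be the Laplacian of the
  spanning path $1$--$2$--$3$.
For $\xx = (x_{1}, x_{2}, x_{3})^{T}$ we have
\[
  \xx^{T} B \xx = (x_{1} - x_{2})^{2} + (x_{2} - x_{3})^{2}
  \quad \mbox{and} \quad
  \xx^{T} A \xx = \xx^{T} B \xx + (x_{1} - x_{3})^{2}.
\]
Because $(x_{1} - x_{3})^{2} \leq 2 \, \xx^{T} B \xx$, with equality
  when $x_{1} - x_{2} = x_{2} - x_{3}$, we obtain
\[
  1 \leq \frac{\xx^{T} A \xx}{\xx^{T} B \xx} \leq 3,
\]
where the lower bound is attained at $\xx = (0,1,0)^{T}$ and the upper
  bound at $\xx = (1,0,-1)^{T}$.
Thus $\kappa_{f} (A,B) = 3 \cdot 1 = 3$.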

265:

266: \subsection{Laplacians and Weighted Graphs}

267: All weighted graphs in this paper have

268:   positive weights.

269: There is a natural isomorphism between weighted

270:   graphs and Laplacian matrices:

271:   given a weighted graph $G = (V, E, w)$, we can

272:   form the Laplacian matrix in which

273:   $A_{i,j} = -w (i,j)$ for $(i,j) \in E$,

274:   and with diagonals determined by the condition

275:   $A \bvec{1} = \bvec{0}$.

276: Conversely, a weighted graph is naturally associated

277:   to each Laplacian matrix.

278: Each vertex of the graph corresponds to both a row and

279:   column of the matrix, and we will often

280:   abuse notation by identifying this row/column pair

281:   with the associated vertex.

282:

283: We note that if $G_{1}$ and $G_{2}$ are weighted

284:   graphs on the same vertex set with disjoint sets

285:   of edges, then the Laplacian of the union of

286:   $G_{1}$ and $G_{2}$ is the sum of their

287:   Laplacians.
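
As an example of this correspondence (the weights are chosen by us for
  illustration), let $G_{1}$ be the path on vertices $\{1,2,3\}$ with
  $w (1,2) = 2$ and $w (2,3) = 3$, and let $G_{2}$ consist of the single
  edge $(1,3)$ of weight $1$.
Their Laplacians are
\[
  \left[
  \begin{array}{rrr}
   2 & -2 & 0\\
  -2 & 5 & -3\\
   0 & -3 & 3
  \end{array}
  \right]
  \quad \mbox{and} \quad
  \left[
  \begin{array}{rrr}
   1 & 0 & -1\\
   0 & 0 & 0\\
  -1 & 0 & 1
  \end{array}
  \right],
\]
and their sum,
\[
  \left[
  \begin{array}{rrr}
   3 & -2 & -1\\
  -2 & 5 & -3\\
  -1 & -3 & 4
  \end{array}
  \right],
\]
is the Laplacian of the weighted triangle that is the union of $G_{1}$
  and $G_{2}$.
Each row sums to zero, as required by $A \bvec{1} = \bvec{0}$.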

288:

289: \subsection{Reductions}\label{sec:reductions}

290:

291: In most of this paper we just consider

292:   Laplacian matrices of connected graphs.

293: This simplification is enabled by two reductions.

294:

295: First, we note that it suffices to construct preconditioners

296:   for matrices satisfying

  $A_{i,i} = \sum_{j \neq i}\abs{A_{i,j}}$, for all $i$.

298: This follows from the observation in~\cite{SupportGraph}

299:   that if $\tilde{A} = A + D$, where

300:   $A$ satisfies the above condition, then

301:   $\kappa _{f} (\tilde{A}, B + D) \leq \kappa _{f} (A,B)$.

302: So, it suffices to find a preconditioner after

303:   subtracting off the maximal diagonal matrix that maintains

304:   positive diagonal dominance.

305:

306: We then use an idea of Gremban~\cite{Gremban} for handling

307:   positive off-diagonal entries.

308: If $A$ is a symmetric matrix such that for all $i$,

  $A_{i,i} \geq  \sum _{j \neq i} \abs{A_{i,j}}$,

310:   then Gremban decomposes

311:   $A$ into $D + A_{n} + A_{p}$, where

312:   $D$ is the diagonal of $A$,

313:   $A_{n}$ is the matrix containing all

  negative off-diagonal entries of $A$,

315:   and $A_{p}$ contains all the positive off-diagonals.

316: Gremban then considers the linear system

317: \[

318:   \left[

319: \begin{array}{ll}

320:   D + A_{n} & -A_{p}\\

321:   -A_{p} & D + A_{n}

322: \end{array}

323:  \right]

324: \left[

325: \begin{array}{l}

326: \xx\\

327: \xx'

328: \end{array}

329:  \right]

330: =

331: \left[

332: \begin{array}{l}

333: \bb\\

334: -\bb

335: \end{array}

336:  \right],

337: \]

338: and observes that its solution will have

339:   $\xx' = -\xx$ and that

340:   $\xx$ will be the solution to

341:   $A \xx = \bb $.

Thus, by making this transformation,
  we can convert any PSDDD linear
  system into one with
  non-positive off-diagonals.

346: One can understand this transformation as

347:   making two copies of every vertex in the graph,

348:   and two copies of every edge.

349: The edges corresponding to negative off-diagonals

350:   connect nodes in the same copy of the graph,

351:   while the others cross copies.

352: To capture the resulting family of graphs, we

353:   define a weighted graph $G$ to be a

354:   \textit{Gremban cover}

355:   if it has $2n$ vertices and

356: \begin{itemize}

357: \item for $i,j \leq n$,

358:   $(i,j) \in  E$ if and only if

359:   $(i+n, j+n) \in E$, and

360:   $w (i,j) = w (i+n, j+n)$,

361: \item for $i,j \leq n$,

362:   $(i,j+n) \in  E$ if and only if

363:   $(i+n, j) \in E$, and

364:   $w (i,j+n) = w (i+n, j)$, and

365: \item the graph contains no edge

366:   of the form $(i, i+n)$.

367: \end{itemize}

368: When necessary,

369:   we will explain how to modify our arguments

370:   to handle Laplacians that are Gremban covers.
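
To illustrate Gremban's transformation on the smallest possible case
  (the matrix and right-hand side below are our own choice), take
\[
  A =
  \left[
  \begin{array}{rr}
  2 & 1\\
  1 & 2
  \end{array}
  \right],
  \quad
  \bb =
  \left[
  \begin{array}{r}
  3\\
  3
  \end{array}
  \right],
\]
so that $D = 2 I$, $A_{n} = 0$, $A_{p}$ has ones off the diagonal,
  and $A \xx = \bb$ has solution $\xx = (1,1)^{T}$.
The transformed system is
\[
  \left[
  \begin{array}{rrrr}
   2 & 0 & 0 & -1\\
   0 & 2 & -1 & 0\\
   0 & -1 & 2 & 0\\
  -1 & 0 & 0 & 2
  \end{array}
  \right]
  \left[
  \begin{array}{l}
  \xx\\
  \xx'
  \end{array}
  \right]
  =
  \left[
  \begin{array}{r}
  3\\
  3\\
  -3\\
  -3
  \end{array}
  \right],
\]
whose solution is $(1,1,-1,-1)^{T}$, so indeed $\xx' = -\xx$.
All off-diagonal entries are now non-positive, and the corresponding
  graph on $2n = 4$ vertices has exactly the edges $(1,4)$ and $(2,3)$,
  each of weight $1$.
It satisfies the three conditions above with $n = 2$: the edge
  $(1, 2+n)$ is paired with $(1+n, 2)$, and there is no edge of the
  form $(i, i+n)$.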

371:

Finally, if $A$ is the Laplacian of a disconnected

373:   graph, then the blocks corresponding to the connected

374:   components may be solved independently.

375:

376:

377: \subsection{Direct Methods}\label{sec:direct}

378: The standard direct method for solving symmetric linear systems

379:   is Cholesky factorization.

380: Those unfamiliar with Cholesky factorization should think of

381:   it as Gaussian elimination in which one

382:   simultaneously eliminates on rows and columns so as to preserve

383:   symmetry.

384: Given a permutation matrix $P$,

385:   Cholesky factorization produces a lower-triangular matrix

386:   $L$ such that $L L^{T} = P A P^{T}$.

387: Because one can use forward and back substitution to

388:   multiply vectors by $L^{-1}$ and $L^{-T}$

389:   in time proportional to the

390:   number of non-zero entries in $L$,

391:   one can use the Cholesky factorization of $A$

392:   to solve the system

393:   $A \xx = \bb $ in time $O (\sizeof{L})$.
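
Spelled out, the solve proceeds as follows (a sketch in our notation;
  the intermediate vectors $\bvec{y}$ and $\bvec{z}$ are introduced here
  only for illustration).
Since $P^{T} P = I$, the system $A \xx = \bb$ is equivalent to
  $L L^{T} (P \xx) = P \bb$, so one may
\[
  \mbox{solve } L \bvec{y} = P \bb,
  \quad
  \mbox{solve } L^{T} \bvec{z} = \bvec{y},
  \quad
  \mbox{and set } \xx = P^{T} \bvec{z}.
\]
The first solve uses forward substitution and the second uses back
  substitution; each touches every non-zero entry of $L$ once.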

394:

395: Each pivot in the factorization comes from the diagonal

396:   of $A$, and one should understand the

397:   permutation $P$ as providing

398:   the order in which these pivots are chosen.

399: Many heuristics exist for producing permutations $P$

400:   for which the number of non-zeros in $L$ is small.

401: If the graph of $A$ is a tree, then a permutation

402:   $P$ that orders the vertices of $A$ from the leaves up

403:   will result in an $L$ with at most $2n-1$ non-zero entries.

404: In this work, we will use results concerning matrices

405:   whose sparsity graphs resemble trees with a few additional

406:   edges and whose graphs have small separators, which

407:   we now review.
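
To see where the bound $2n-1$ for trees comes from (a brief justification
  that we add for the reader's convenience), note that when a leaf is
  eliminated, its only off-diagonal entry lies in the row and column of its
  unique neighbor.
The elimination therefore changes only the diagonal entry of that neighbor
  and creates no new non-zero entries, and the remaining graph is again
  a tree.
Each of the first $n-1$ vertices in the order thus contributes at most one
  diagonal and one off-diagonal entry to $L$, and the last vertex contributes
  at most one diagonal entry, for a total of at most
\[
  2 (n-1) + 1 = 2n - 1 .
\]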

408:

409: If $B$ is the Laplacian matrix of a weighted graph

410:   $(V,E,w)$, and one eliminates a vertex $a$

411:   of degree $1$, then the remaining matrix

412:   has the form

413: \[

414:   \left[

415:   \begin{array}{ll}

416:   1 & 0\\

  0 & A_{1}
\end{array}
 \right],

420: \]

421: where $A_{1}$ is the Laplacian of the graph in which

422:   $a$ and its attached edge have been removed.

423: Similarly, if a vertex $a$ of degree $2$ is eliminated,

424:   then the remaining matrix is the Laplacian of the

425:   graph in which the vertex $a$ and its adjacent edges

426:   have been removed, and an edge

427:   with weight $1/ (1/w_{1} + 1/w_{2})$

428:   is added between the

429:   two neighbors of $a$,

430:   where $w_{1}$ and $w_{2}$ are the weights of the edges

431:   connecting $a$ to its neighbors.
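
As a check of the degree-$2$ rule on a small case (the weights are ours,
  and we assume here that eliminating a vertex means replacing the
  remaining block by its Schur complement), let $a$ have neighbors
  $u$ and $v$ with $w_{1} = 2$ and $w_{2} = 3$.
Ordering the vertices as $(a,u,v)$, the Laplacian is
\[
  B =
  \left[
  \begin{array}{rrr}
   5 & -2 & -3\\
  -2 & 2 & 0\\
  -3 & 0 & 3
  \end{array}
  \right],
\]
and eliminating $a$ leaves
\[
  \left[
  \begin{array}{rr}
  2 & 0\\
  0 & 3
  \end{array}
  \right]
  -
  \frac{1}{5}
  \left[
  \begin{array}{rr}
  4 & 6\\
  6 & 9
  \end{array}
  \right]
  =
  \left[
  \begin{array}{rr}
   6/5 & -6/5\\
  -6/5 & 6/5
  \end{array}
  \right],
\]
which is the Laplacian of a single edge between $u$ and $v$ of weight
  $6/5 = 1/ (1/2 + 1/3)$.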

432:

433: Given a graph $G$

434:   with edge set $E = R \cup S$, where the edges

435:   in $R$ form a tree,

436:   we will perform a partial Cholesky factorization

437:   of $G$ in which we successively eliminate all the degree 1 and 2

  vertices that are not endpoints of edges in $S$.

439: We introduce the algorithm \texttt{trim} to define the order

440:   in which the vertices should be eliminated, and we call the

441:   \emph{trim order} the order in which \texttt{trim} deletes vertices.

442: \begin{trivlist}

443: \item []

444: \noindent {\bf Algorithm:} \texttt{trim$(V,R,S)$}

445: \begin{enumerate}

446: \item  While $G$ contains a vertex of degree one

447:   that is not an endpoint of an edge in $S$,

448:   remove that vertex and its adjacent edge.

449: \item  While $G$ contains a vertex of degree two

450:   that is not an endpoint of an edge in $S$,

451:   remove that vertex and its adjacent edges, and add

452:   an edge between its two neighbors.

453: \end{enumerate}

454: \end{trivlist}

455:

456: \begin{proposition}\label{pro:trim}

457: The output of \texttt{trim} is a graph

458:   with at most $4 \sizeof{S}$ vertices

459:   and $5 \sizeof{S}$ edges.

460: \end{proposition}
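
For a small illustration of \texttt{trim} (the instance is ours),
  let $V = \{1,\ldots ,6\}$, let $R$ consist of the path
  $1$--$2$--$3$--$4$--$5$ together with the edge $(3,6)$, and let
  $S = \{(1,5)\}$, with all weights equal to $1$.
In the first step, vertex $6$ has degree one and is not an endpoint
  of the edge in $S$, so it is removed; vertices $1$ and $5$ are endpoints
  of the edge in $S$ and are never removed.
In the second step, vertices $2$, $3$ and $4$ each have degree two and
  may be removed in that order, creating in turn the edges
  $(1,3)$, $(1,4)$ and $(1,5)$ with weights
\[
  \frac{1}{1 + 1} = \frac{1}{2},
  \quad
  \frac{1}{2 + 1} = \frac{1}{3},
  \quad
  \frac{1}{3 + 1} = \frac{1}{4},
\]
according to the rule for eliminating degree-$2$ vertices.
One valid trim order is therefore $6, 2, 3, 4$.
The output has $2$ vertices and $2$ edges between them (the edge of $S$
  and the new edge of weight $1/4$), within the bounds
  $4 \sizeof{S} = 4$ and $5 \sizeof{S} = 5$ of
  Proposition~\ref{pro:trim}.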

461:

462: \begin{remark}

463: If $(V,R)$ and $(V,S)$ are Gremban covers,

464:   then we can implement \texttt{trim} so

465:   that the output graph is also a Gremban cover.

466: Moreover, the genus and maximum size clique minor

467:   of the output graph do not increase.

468: \end{remark}

469:

470: After performing partial Cholesky factorization

471:   of the vertices in the trim order, one obtains

472:   a factorization of the form

473: \[

474: B = L C

475:     L^{T},

476: \mbox{where $C = $ }

477: \left[

478:     \begin{array}{ll}

479:      I & 0\\

480:      0 & A_{1}

481:     \end{array}

482:  \right],

483: \]

484: $L$ is lower triangular,

  and the left and right columns in the above

486:   representations correspond to the eliminated

487:   and remaining vertices

488:   respectively.

489: Moreover, $\sizeof{L} \leq 2n-1$, and

490:   this Cholesky factorization may be performed in

491:   time $O (n + \sizeof{S})$.

492:

493: %After performing partial Cholesky factorization

494: %  of the vertices in the trim order, one obtains

495: %  a factorization of the form

496: %\[

497: %A = L \left[

498: %    \begin{array}{ll}

499: %     I & 0\\

500: %     0 & B

501: %    \end{array}

502: % \right]

503: %    L^{T},

504: %\mbox{ where $L$ has the form}

505: %\left[\begin{array}{ll}

506: %  L_{1,1} & 0\\

507: %  L_{2,1} & I

508: %\end{array}

509: % \right], \]

510: %$L_{1,1}$ is lower triangular,

511: %  and the left column and right columns in the above

512: %  representations correspond to the eliminated

513: %  and remaining vertices

514: %  respectively.

515: %Moreover, $\sizeof{L} \leq 2n-1$.

516:

The following lemma may be proved by induction.

518: \begin{lemma}\label{lem:partialCholesky}

519: Let $B$ be a Laplacian matrix and let

520:   $L$ and $A_{1}$ be the matrices arising from

521:   the partial Cholesky factorization of $B$

522:   according to the trim order.

523: Let $U$ be the set of eliminated vertices,

524:   and let $W$ be the set of remaining vertices.

525: For each pair of vertices $(a,b)$ in $W$ joined by a simple

526:   path containing only vertices of $U$, let

527:   $B_{(a,b)}$ be the Laplacian of the graph containing

528:   just one edge between $a$ and $b$ of weight

529:   $1/(\sum_{i} 1/w_{i})$, where

530:   the $w_{i}$ are the weights on the path between

531:   $a$ and $b$.

532: Then,

533:

534: \begin{itemize}

535: \item [$(a)$]

536:   the matrix $A_{1}$ is the sum of the Laplacian of the

537:   induced graph

538:   on $W$ and

  the sum of all the Laplacians $B_{(a,b)}$,

540: \item [$(b)$]

541:   $\norm{A_{1}} \leq \norm{B}$,

542:   $\lambda_{2} (A_{1}) \geq \lambda_{2} (B)$, and so

543:   $\kappa _{f} (A_{1}) \leq \kappa _{f} (B)$.

544: \end{itemize}

545: \end{lemma}

546:

547:

548:

549: Other topological structures may be exploited

550:   to produce elimination orderings

551:   that result in sparse $L$.

552: In particular,  Lipton, Rose and Tarjan~\cite{LiptonRoseTarjan}

553:   prove that if the sparsity graph is

554:   planar, then one can find such an $L$ with at most

555:   $O (n \log n)$ non-zero entries in time $O (n^{3/2})$.

556: In general, Lipton, Rose and Tarjan prove that if

557:   a graph can be dissected by a family of small separators,

558:   then $L$ can be made sparse.

559: The precise definition and theorem follow.

560:

561: \begin{definition}\label{def:separator}

562: A subset of vertices $C$ of a graph $G= (V,E)$ with $n$ vertices is an

563:   $f (n)$-separator if

564:   $\sizeof{C}\leq f (n)$, and the vertices of $V - C$

565:   can be partitioned into two sets  $U$ and $W$ such that there are

566:   no edges from $U$ to $W$, and $\sizeof{U},\sizeof{W}\leq 2n/3 $.

567: \end{definition}

568:

569: \begin{definition}\label{def:familyseparator}

570: Let $f ()$ be a positive function.

571: A graph $G = (V,E)$ with $n$ vertices has a family of $f ()$-separators

572:   if for every $s \leq  n$, every subgraph $G' \subseteq G$ with $s$ vertices

  has an $f (s)$-separator.

574: \end{definition}

575:

576:

577: \begin{theorem}[Nested Dissection: Lipton-Rose-Tarjan]

578:   \label{thm:nestesdissection}

579: Let $A$ be an $n$ by $n$ symmetric PSD matrix, $\alpha > 0$ be a

580:   constant, and $h (n)$ be a positive function of $n$.

581: Let $f (x) = h (n) x^{\alpha }$.

If $G (A)$ has a family of $f ()$-separators, then

583:   the Nested Dissection Algorithm of Lipton, Rose and Tarjan can,

584:  in $\bigO{n + (h (n) n^{\alpha })^{3}}$ time,

585:  factor $A$ into $A = LL^{T}$

586:  so that $L$ has at most $\bigO{(h (n)n^{\alpha })^{2}\log n}$

587:   non-zeros.

588: \end{theorem}

589:

590: To apply this theorem, we note that many families of graphs

591:   are known to have families of small separators.

592: Gilbert, Hutchinson, and Tarjan

593:   \cite{GilbertHutchinsonTarjan} show

594:   that all graphs of $n$ vertices

595:   with genus bounded by $g$ have a family of $O (\sqrt{gn})$-separators,

596:   and Plotkin, Rao and Smith \cite{PlotkinRaoSmith}

597:   show that any graph that excludes

  $K_{s}$ as a minor has a family of $O (s\sqrt{n\log n})$-separators.
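
To illustrate how Theorem~\ref{thm:nestesdissection} combines with these
  separator bounds, we work out two instantiations
  (with constant factors suppressed; the calculation is ours).
For graphs of genus at most $g$, take $\alpha = 1/2$ and
  $h (n) = \bigO{\sqrt{g}}$, which gives
\[
  \mbox{time } \bigO{n + (g n)^{3/2}}
  \quad \mbox{and} \quad
  \bigO{g n \log n} \mbox{ non-zeros in } L.
\]
For graphs that exclude $K_{s}$ as a minor, take $\alpha = 1/2$ and
  $h (n) = \bigO{s \sqrt{\log n}}$, which gives
\[
  \mbox{time } \bigO{n + s^{3} n^{3/2} \log^{3/2} n}
  \quad \mbox{and} \quad
  \bigO{s^{2} n \log^{2} n} \mbox{ non-zeros in } L.
\]
For graphs with $O (\sqrt{n})$-separators, such as planar graphs,
  the same calculation recovers the $O (n^{3/2})$ time and
  $O (n \log n)$ non-zeros quoted above.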

599:

600:

601: \subsection{Iterative Methods}\label{sec:iterative}

602: Iterative methods such as

603:   Chebyshev iteration and Conjugate Gradient

604:   solve systems such as

605:   $A \xx = \bb $

606:   by successively multiplying vectors

607:   by the matrix $A$, and

608:   then taking linear combinations of

609:   vectors that have been produced so far.

610: The preconditioned versions of these

611:   iterative methods take as input

612:   another matrix $B$, called the

613:   \textit{preconditioner},

614:   and also perform

615:   the operation of solving linear systems

616:   in $B$.

617: In this paper, we will restrict our attention to

618:   the preconditioned Chebyshev method as it

619:   is easier to understand the effect of

620:   imprecision in the solution of the systems in

621:   $B$ on the method's output.

622: In the non-recursive version of our algorithms,

623:   we will exploit the standard analysis

624:   of Chebyshev iteration (see~\cite{Bruaset}), adapted

625:   to our situation:

626:

627: \begin{theorem}[Preconditioned Chebyshev]\label{thm:cheby}

628: Let $A$ and $B$ be Laplacian matrices, let $\bb $

629:   be a vector, and let $\xx$ satisfy $A \xx  = \bb$.

630: At each iteration, the preconditioned Chebyshev method

631:   multiplies one vector by $A$, solves one linear system

632:   in $B$, and performs a constant number of vector additions.

633: At the $k$th iteration, the algorithm maintains a solution

634:   $\xxt$ satisfying

635: \[

636:   \norm{(\xxt - \xx)}

637:  \leq

638:     e^{-k / \sqrt{\kappa_{f} (A,B)}}

639:    \kappa_{f} (A) \sqrt{\kappa_{f} (B)} \norm{\xx}.

640: \]

641: \end{theorem}
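
To read off an iteration count from Theorem~\ref{thm:cheby}
  (a short calculation that we add for convenience), note that the
  right-hand side is at most $\epsilon \norm{\xx}$ as soon as
\[
  e^{-k / \sqrt{\kappa_{f} (A,B)}}
  \kappa_{f} (A) \sqrt{\kappa_{f} (B)} \leq \epsilon ,
  \quad \mbox{that is,} \quad
  k \geq \sqrt{\kappa_{f} (A,B)}
    \ln \left(\frac{\kappa_{f} (A) \sqrt{\kappa_{f} (B)}}{\epsilon} \right).
\]
In particular,
  $\bigO{\sqrt{\kappa_{f} (A,B)} \log (\kappa_{f} (A) \kappa_{f} (B) / \epsilon )}$
  iterations suffice to obtain an $\epsilon$-approximate solution.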

642:

643: In the non-recursive versions of our algorithms,

644:   we will pre-compute the Cholesky factorization of

645:   the preconditioners $B$,

646:   and use these to solve the linear systems encountered

  by the preconditioned Chebyshev method.

648: In the recursive versions, we will perform a

649:   partial Cholesky factorization of $B$,

  into a matrix of the form $L [I, 0; 0, A_{1}] L^{T}$,
  construct a preconditioner for $A_{1}$, and again use
  the preconditioned Chebyshev method to solve
  the systems in $A_{1}$.

654:

655:

656:

657: %\subsection{Our contributions}

658:

659:

660:

661:

662: % Local Variables: ***

663: % TeX-master:"post.tex" ***

664: % End: ***

665:

666:

667: