0712:0712.2887/PJ.tex

1: %

2: \documentclass[11pt,letterpaper]{article}

3: %

4: %

5:                                                           %

6:                                                           %

7: \usepackage{graphicx}

8: \usepackage{amsthm}

9: \usepackage{amssymb}

10:

11: \usepackage[margin=1in]{geometry}

12:

13: %

14: %

15: \usepackage{mathtools}

16:

17:

18: %

19:

20: \newtheorem{theorem}{Theorem}[section]

21: \newtheorem{lemma}[theorem]{Lemma}

22: \newtheorem{corollary}[theorem]{Corollary}

23: \newtheorem{proposition}[theorem]{Proposition}

24: \newtheorem{conjecture}[theorem]{Conjecture}

25: \newtheorem{algorithm}[theorem]{Algorithm}

26: \newtheorem{definition}[theorem]{Definition}

27: \newtheorem{remark}[theorem]{Remark}

28: \newtheorem{example}[theorem]{Example}

29: \newtheorem{question}{Question}

30: \newtheorem{note}[theorem]{Note}

31:

32: \graphicspath{{figures/}}

33:

34:

35: \newcommand{\R}{{\mathbb R}}

36: \newcommand{\C}{{\mathbb C}}

37: \newcommand{\Z}{{\mathbb Z}}

38: \newcommand{\Q}{{\mathbb Q}}

39: \newcommand{\N}{{\mathbb N}}

40: %

41: %

42: %

43: %

44: %

45: %

46: %

47: %

48: \newcommand{\cf}{{\it cf.}}

49: \newcommand{\eg}{{\it e.g.}}

50: \newcommand{\ie}{{\it i.e.}}

51: \newcommand{\etc}{{\it etc.}}

52: \newcommand{\ones}{\mathbf 1}

53: \newcommand{\reals}{{\mbox{\bf R}}}

54: \newcommand{\diag}{\mathop{\bf diag}}

55: \newcommand{\argmin}{\mathop{\mathrm{argmin}}}

56: %

57: \newcommand{\todo}[1]{\vspace{5 mm}\par \noindent \marginpar{\textsc{ToDo}}

58: \framebox{\begin{minipage}[c]{0.95 \columnwidth}

59: \tt #1 \end{minipage}}\vspace{5 mm}\par}

60: \newcommand{\half}{{\textstyle\frac{1}{2} }}

61: \newcommand{\sqtwo}{{\textstyle\frac{1}{\sqrt 2 } }}

62: \newcommand{\third}{{\textstyle\frac{1}{3} }}

63: \title{Approximation of the joint spectral radius \\ using sum of squares}

64: \author{Pablo A. Parrilo\thanks{Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, \texttt{parrilo@mit.edu}}

65: \and

66: Ali Jadbabaie\thanks{GRASP Laboratory, University of Pennsylvania, \texttt{jadbabai@seas.upenn.edu}}}

67:

68: \date{\today}

69: %

70:

71: %

72: %

73: %

74: %

75: %

76: %

77: %

78:

79: %

80: %

81: \date{}

82: %

83: %

84:

85: \begin{document}

86: \maketitle

87: %

88: %

89:

90: \begin{abstract}

91: We provide an asymptotically tight, computationally efficient

92: approximation of the joint spectral radius of a set of matrices using

93: sum of squares (SOS) programming. The approach is based on a search

94: for an SOS polynomial that proves simultaneous contractibility of a

95: finite set of matrices. We provide a bound on the quality of the

96: approximation that unifies several earlier results and is independent

97: of the number of matrices. Additionally, we present a comparison

98: between our approximation scheme and earlier techniques, including the

99: use of common quadratic Lyapunov functions and a method based on

100: matrix liftings. Theoretical results and numerical investigations show

101: that our approach yields tighter approximations.

102: \end{abstract}

103:

104:

105: \section{Introduction}

106:

107: Stability of discrete linear inclusions has been a topic of major

108: research over the past two decades. Such systems can be represented as

109: a switched linear system of the form $x(k+1) = A_{\sigma(k)} x(k)$,

110: where $\sigma$ is a mapping from the integers to a given set of

111: indices. The above model, and its many variations, has been studied

112: extensively across multiple disciplines including control theory,

113: theory of non-negative matrices and Markov chains, subdivision schemes

114: and wavelet theory, dynamical systems, etc. The fundamental question

115: of interest is to determine whether $x(k)$ converges to a limit, or

116: equivalently, whether the infinite matrix products chosen from the set

117: of matrices converge~\cite{BeWa92,DaLa92,DaLa01}.  The research on

118: convergence of infinite products of matrices spans four

119: decades. The majority of results in this area have been obtained in the

120: special case of non-negative and/or stochastic matrices. A

121: non-exhaustive list of related research providing several necessary

122: and sufficient conditions for convergence of infinite products and

123: their applications

124: includes~\cite{CH94,DaLa01,Leiz92,ShuWuPa97}. Despite the wealth of

125: research in this area, finding algorithms that can unambiguously

126: decide convergence remains elusive.  Much of the difficulty of this

127: problem stems from the hardness in computation or efficient

128: approximation of the joint spectral radius of a finite set of

129: matrices.  This notion was introduced by Rota and Strang \cite{RoSt60}

130: via the definition

131: \begin{equation}

132: \rho(A_1,\ldots,A_m) := \lim_{k \rightarrow \infty}

133: \max_{\sigma \in \{1,\ldots,m\}^k} || A_{\sigma_k} \cdots

134: A_{\sigma_2} A_{\sigma_1} ||^{1/k},

135: \label{eq:defjsr}

136: \end{equation}

137: and represents the maximum growth rate that can be achieved by taking

138: arbitrary products of the matrices $A_i$. As in the case of the

139: classical spectral radius, the value of this expression is independent

140: of the choice of norm in~(\ref{eq:defjsr}). Daubechies and

141: Lagarias~\cite{DaLa92} conjectured that the joint spectral radius is

142: equal to a related quantity, the {\it generalized spectral radius},

143: which is defined in a similar way except for the fact that the norm of

144: the product is replaced by the spectral radius. Berger and

145: Wang~\cite{BeWa92} proved this conjecture to be true for finite sets

146: of matrices.  Blondel and Tsitsiklis have shown that computing $\rho$

147: is hard from a computational complexity viewpoint, and even

148: approximating it is difficult~\cite{BlTi2,BlTi3}. In particular, it

149: follows from their results that the problem ``Is $\rho \leq 1$?'' is

150: undecidable. For rational matrices, the joint spectral radius is not a

151: semialgebraic function of the data, thus ruling out a very large class

152: of methods for its exact computation. We refer the reader to the

153: survey \cite[\S3.5]{BlTi1} for further results and references on the

154: computational complexity of the joint spectral radius.
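
As a sanity check of definition~(\ref{eq:defjsr}), consider the
following small illustration (the matrices below are chosen here purely
as an example and are not used elsewhere). For $m=1$ the definition
reduces to Gelfand's formula $\rho(A) = \lim_{k \rightarrow \infty}
||A^k||^{1/k}$. For $m \geq 2$, the joint spectral radius can be
strictly larger than every individual spectral radius. Take, for
instance,
\[
A_1 =
\left[\begin{array}{cc}
0 & 1 \\ 0 & 0
\end{array}\right], \qquad
A_2 =
\left[\begin{array}{cc}
0 & 0 \\ 1 & 0
\end{array}\right], \qquad
A_1 A_2 =
\left[\begin{array}{cc}
1 & 0 \\ 0 & 0
\end{array}\right].
\]
Both matrices are nilpotent, so $\rho(A_1) = \rho(A_2) = 0$. However,
$(A_1 A_2)^j = A_1 A_2$ has spectral norm one for every $j \geq 1$, so
the maximum in~(\ref{eq:defjsr}) is at least one for every even $k$,
and hence $\rho(A_1,A_2) \geq 1$. Conversely, since $||A_1||_2 =
||A_2||_2 = 1$, submultiplicativity of the spectral norm gives
$||A_{\sigma_k} \cdots A_{\sigma_1}||_2 \leq 1$ for every product, and
therefore $\rho(A_1,A_2) = 1$.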

155:

156: %

157: %

158: %

159:

160: It turns out that a necessary and sufficient condition for the

161: stability of a linear difference inclusion is for the corresponding

162: matrices to have a subunit joint spectral radius, i.e.,

163: $\rho(A_1,\ldots,A_m) < 1$; see e.g. \cite[Thm.~1]{ShuWuPa97} and

164: \cite{BraytonTong2}. A subunit joint spectral radius is equivalent to

165: the existence of a common norm with respect to which all matrices in

166: the set are contractive~\cite{Bar88,Koz90,wirth}; unfortunately, this

167: common norm is in general not finitely constructible. In fact a

168: similar result, due to Dayawansa and Martin~\cite{DayaMar}, holds for

169: nonlinear systems that undergo switching. A popular approach towards

170: approximating the joint spectral radius or showing that it is indeed

171: subunit has been to try to prove simultaneous contractibility (i.e.,

172: existence of a common norm with respect to which matrices are

173: contractive), by searching for a common ellipsoidal norm, or

174: equivalently, a common quadratic Lyapunov function. The benefit of

175: this approach is due to the fact that the search for a common

176: ellipsoidal norm can be posed as a semidefinite program and solved

177: efficiently using interior point techniques. However, it is not too

178: difficult to generate examples where the discrete inclusion is {\it

179: absolutely asymptotically stable}, i.e., asymptotically stable for all

180: switching sequences, but a common quadratic Lyapunov function (or

181: equivalently, a common ellipsoidal norm) does not exist.

182:

183: %

184:

185: Ando and Shih describe in~\cite{Ando98} a constructive procedure for

186: generating a set of $m$ matrices whose joint spectral radius is equal

187: to $\frac{1}{\sqrt{m}}$, but for which no quadratic Lyapunov function

188: exists.  They prove that the interval $[0,\, \frac{1}{\sqrt{m}})$ is

189: effectively the ``optimal'' range for the joint spectral radius

190: necessary to guarantee simultaneous contractibility under an

191: ellipsoidal norm for a finite collection of $m$ matrices. The range is

192: called optimal because it is the largest subset of $[0,1)$ such that

193: every collection of $m$ matrices whose joint spectral radius lies in it

194: is simultaneously contractible under an ellipsoidal

195: norm. Furthermore, they show that the optimal joint spectral radius

196: range for a {\it bounded} set of $n \times n$ matrices is the interval

197: $[0,\, \frac{1}{\sqrt{n}})$. The proof of this fact is based on John's

198: ellipsoid theorem \cite{JohnEllipsoid}.  Roughly speaking, John's

199: ellipsoid theorem implies that every convex body in $n$-dimensional

200: Euclidean space that is symmetric with respect to the origin can be

201: approximated by inner and outer ellipsoids, up to a factor of

202: $\frac{1}{\sqrt{n}}$. Independently, Blondel, Nesterov and Theys

203: \cite{BlNT04} showed a similar result (also based on John's ellipsoid

204: theorem), that the best ellipsoidal norm approximation of the joint

205: spectral radius provides a lower bound and an upper bound on the

206: actual value. Given a set ${\mathcal M}$ of $n \times n$ matrices with

207: joint spectral radius $\rho$, and best ellipsoidal norm approximation

208: $\hat \rho$, it is shown there that

209: \begin{equation}

210: \frac{1}{\sqrt{n}} \, \hat \rho({\mathcal M}) \le \rho({\mathcal M})

211: \le \hat \rho({\mathcal M}).

212: \label{eq:sqrtn}

213: \end{equation}

214: A major consequence of these results is that finding a common Lyapunov

215: function becomes increasingly hard as the dimension goes up.

216:

217: There have been a number of earlier works proposing different

218: numerical techniques for the effective computation of bounds on the

219: joint spectral radius.  A natural class of lower bounds is obtained by

220: considering periodic switching sequences, in which case only a finite

221: number of matrix norms need to be computed.  Using a naive approach,

222: the required computational effort grows exponentially as $m^k$, where

223: $k$ is the period of the sequence.  Due to the cyclic property of the

224: spectral radius, some terms are redundant, and Maesumi \cite{Maesumi}

225: has shown using combinatorial techniques that the number of required

226: products can be reduced to $m^k/k$. Another approach is the work of

227: Gripenberg \cite{Gripenberg}, who has introduced a branch-and-bound

228: algorithm to produce upper and lower bounds on the joint spectral

229: radius.  Protasov \cite{Protasov1,Protasov2} has developed a geometric

230: method to approximate this quantity, based on a polytopic

231: approximation of a convex set that is invariant under the action of

232: the linear operators $A_i$.  This method has also been extended to the

233: computation of the so-called $p$-radius \cite{Protasov1}. More

234: recently, Blondel and Nesterov \cite{BlNes05} have proposed an

235: alternative scheme for the computation of the joint spectral radius, by

236: ``lifting'' the matrices using Kronecker products to provide better

237: approximations.  A common feature in many of these approaches is the

238: presence of convexity-based methods to provide certificates of the

239: desired system properties.

240:

241: In this paper, we develop a sum of squares (SOS) based scheme for the

242: approximation of the joint spectral radius. The method computes, using

243: the techniques of semidefinite programming, a homogeneous polynomial

244: that serves as a Lyapunov-like function for the corresponding switched

245: linear system. We prove several results on the quality of

246: approximation of the proposed scheme. In particular, it will follow

247: from Theorems~\ref{thm:sos2dbound} and~\ref{thm:msos2dbound} that our

248: SOS-based approximation $\rho_{SOS,2d}$ satisfies

249: \[

250: \eta^{-\frac{1}{2d}} \, \cdot \,

251: \rho_{SOS,2d}({\mathcal M}) \le \rho({\mathcal M})

252: \le \rho_{SOS,2d}({\mathcal M}),

253: \]

254: where $\eta := \min \{ m , {\textstyle\binom{n+d-1}{d}} \}$.  To prove

255: this, we use two different techniques, one inspired by recent results

256: of Barvinok~\cite{Barvinok} on approximation of norms by polynomials,

257: and the other one based on a convergent iteration similar to that used

258: for Lyapunov inequalities. Our results provide a simple and unified

259: derivation of most of the available bounds, including some new

260: ones. We prove that the SOS-based approximation is always tighter than

261: that obtained by the use of common quadratic Lyapunov functions, and

262: than the one provided by Blondel and Nesterov in

263: \cite{BlNes05}. Furthermore, we show how to compute the bound in

264: \cite{BlNes05} using matrices that are exponentially smaller than

265: those proposed there; this result also follows from the earlier work

266: of Protasov \cite{Protasov1}. A preliminary version of some of our

267: results has been presented in \cite{PabloAliHSCC}.

268:

269: A description of the paper follows. In Section~\ref{sec:sosnorms} we

270: present a class of bounds on the joint spectral radius based on

271: simultaneous contractivity with respect to a norm, followed by a sum

272: of squares-based relaxation, and the corresponding suboptimality

273: properties. In Section~\ref{sec:symmalgebra} we present some

274: background material in multilinear algebra, necessary for our

275: developments, and a derivation of a bound on the quality of the SOS

276: relaxation. An alternative development is presented in

277: Section~\ref{sec:soslyap}, where a different bound on the performance

278: of the SOS relaxation is given in terms of a very natural Lyapunov

279: iteration, similar to the classical case. In

280: Section~\ref{sec:comparison} we make a comparison with earlier

281: techniques and analyze a numerical example. Finally, in

282: Section~\ref{sec:conclusions} we present our conclusions.

283:

284: %

285: %

286: %

287:

288:

289:

290: \section{Bounds via polynomials and sums of squares}

291: \label{sec:sosnorms}

292:

293: A natural way of bounding the joint spectral radius is to find a

294: common norm that guarantees certain contractiveness properties for all

295: the matrices. In this section, we first revisit this characterization,

296: and introduce our method of using SOS relaxations to approximate this

297: common norm.

298:

299: \paragraph{Norms and the joint spectral radius.}

300: As we mentioned, there exists an intimate relationship between the

301: joint spectral radius and the existence of a vector norm under which all the

302: matrices are simultaneously contractive. This is summarized in the

303: following theorem, a special case of Proposition 1 in \cite{RoSt60} by

304: Rota and Strang.

305:

306: \begin{theorem}[\cite{RoSt60}]

307: \label{thm:RotaStrang}

308: Consider a finite set of matrices $\mathcal{A} =

309: \{A_1,\ldots,A_m\}$. For any $\epsilon > 0$, there exists a norm

310:  $\|\cdot\|$ in $\R^n$ (denoted as JSR norm hereafter) such that

311: \[

312: ||A_i x|| \leq (\rho(\mathcal{A}) + \epsilon) \, ||x||, \qquad \forall x

313:   \in \R^n, \quad i = 1,\ldots,m.

314: \]

315: \end{theorem}

316:

317: The theorem appears in this form, for instance, in Proposition~4 of

318: \cite{BlNT04}.  The main idea in our approach is to replace the JSR

319: norm that approximates the joint spectral radius with a homogeneous

320: SOS polynomial $p(x)$ of degree $2d$. As we will see in the next

321: sections, we can produce arbitrarily tight SOS approximations, while

322: still being able to prove a bound on the resulting estimate.

323:

324: \paragraph{Joint spectral radius and polynomials.}

325: As the results presented above indicate, the joint spectral radius can

326: be characterized by finding a common norm under which all the maps are

327: simultaneously contractive.  As opposed to the unit ball of a norm,

328: the level sets of a homogeneous polynomial are not necessarily convex

329: (see for instance Figure~\ref{fig:jsr}). Nevertheless, as the

330: following theorem suggests, we can still obtain upper bounds on the

331: joint spectral radius by replacing norms with homogeneous polynomials.

332:

333: \begin{theorem}

334: \label{thm:psdbound}

335: Let $p(x)$ be a strictly positive homogeneous polynomial of degree

336: $2d$ that satisfies

337: \[

338: p(A_i x) \leq \gamma^{2d} \, p(x), \qquad \forall x \in \R^n \quad i = 1,\ldots,m.

339: \]

340: Then, $\rho(A_1,\ldots,A_m) \leq \gamma$.

341: \end{theorem}

342: \begin{proof}

343: If $p(x)$ is strictly positive, then by compactness of the unit ball

344: in $\R^n$ and continuity of $p(x)$, there exist constants $0 < \alpha

345: \leq \beta$, such that

346: \[

347: \alpha \, ||x||^{2d} \leq p(x) \leq \beta \, ||x||^{2d} \qquad \forall x \in \R^n.

348: \]

349: Then,

350: \begin{eqnarray*}

351: ||A_{\sigma_k} \ldots A_{\sigma_1}|| &\leq&

352: \max_x \frac{||A_{\sigma_k} \ldots A_{\sigma_1} x||}{||x||}  \\

353: &\leq& \left(\frac{\beta}{\alpha}\right)^\frac{1}{2d} \max_x \frac{p(A_{\sigma_k} \ldots A_{\sigma_1} x)^\frac{1}{2d}}{p(x)^\frac{1}{2d}} \\

354: & \leq & \left(\frac{\beta}{\alpha}\right)^\frac{1}{2d} \gamma^k.

355: \end{eqnarray*}

356: From the definition of the joint spectral radius in

357: equation~(\ref{eq:defjsr}), by taking $k$th roots and the limit $k

358: \rightarrow \infty$ we immediately have the upper bound

359: $\rho(A_1,\ldots,A_m) \leq \gamma$.

360: \end{proof}
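
In more detail, the chain of inequalities in the proof above can be
spelled out as follows (the shorthand $y := A_{\sigma_k} \ldots
A_{\sigma_1} x$ is introduced here only for this explanation). The
second step uses both sides of the bound $\alpha \, ||x||^{2d} \leq
p(x) \leq \beta \, ||x||^{2d}$: for $x \neq 0$,
\[
||y|| \leq \left(\frac{p(y)}{\alpha}\right)^\frac{1}{2d}, \qquad
||x|| \geq \left(\frac{p(x)}{\beta}\right)^\frac{1}{2d}
\qquad \Longrightarrow \qquad
\frac{||y||}{||x||} \leq
\left(\frac{\beta}{\alpha}\right)^\frac{1}{2d}
\frac{p(y)^\frac{1}{2d}}{p(x)^\frac{1}{2d}}.
\]
The third step follows by applying the hypothesis $p(A_i x) \leq
\gamma^{2d} \, p(x)$ a total of $k$ times:
\[
p(A_{\sigma_k} \ldots A_{\sigma_1} x) \leq
\gamma^{2d} \, p(A_{\sigma_{k-1}} \ldots A_{\sigma_1} x) \leq \cdots \leq
\gamma^{2dk} \, p(x).
\]
Finally, after taking $k$th roots the constant factor becomes
$(\beta/\alpha)^\frac{1}{2dk}$, which tends to one as $k \rightarrow
\infty$.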

361:

362: The condition in Theorem~\ref{thm:psdbound} involves positive

363: polynomials, which are computationally hard to characterize.  A useful

364: scheme, introduced in \cite{Phd:Parrilo,sdprelax} and relatively

365: well-known by now, relaxes the nonnegativity constraints to a much

366: more tractable \emph{sum of squares} (SOS) condition, where $p(x)$ is

367: required to have a decomposition as $p(x) = \sum_i p_i(x)^2$. The SOS

368: condition can be equivalently expressed in terms of a semidefinite

369: programming (SDP) constraint. In what follows, we briefly describe the

370: basic ideas behind SDP and sum of squares programming, and their

371: applications to our problem.

372:

373: \paragraph{Semidefinite programming.} SDP is a specific kind of convex

374: optimization problem with very appealing numerical properties. An SDP

375: problem corresponds to the optimization of a linear function over the

376: intersection of an affine subspace and the cone of positive

377: semidefinite matrices. For much more information about SDP and its

378: many applications, we refer the reader to the surveys

379: \cite{VaB:96,ToddSDP} and the comprehensive treatment in

380: \cite{HandSDP}.

381:

382: An SDP problem in standard primal form is usually written as:

383: \begin{align*}

384: \mathrm{minimize}   \quad      C \bullet &X   \quad   &

385: \mbox{subject to}   \quad      A_i \bullet X  &= b_i, \quad i = 1,\ldots,m \\

386:                 &           &          X &\succeq 0,

387: \end{align*}

388: where $C, A_i$ are symmetric $n \times n$ matrices, and $X \bullet Y

389: := \mathrm{trace}(X Y)$. The symmetric matrix $X$ is the optimization

390: variable over which the minimization is performed.  The inequality in

391: the second line means that the matrix $X$ must be positive

392: semidefinite, i.e., all its eigenvalues should be greater than or

393: equal to zero.  The set of feasible solutions, i.e., the set of

394: matrices $X$ that satisfy the constraints, is always a convex set. In

395: the particular case when $C=0$, the problem reduces to whether or not

396: the inequality can be satisfied for some matrix $X$. In this case, the

397: SDP is referred to as a \emph{feasibility problem}.
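
As a small illustration of the standard form (the data below are
chosen only for this example), take $n=2$, $m=1$, and
\[
C =
\left[\begin{array}{cc}
1 & 0 \\ 0 & 1
\end{array}\right], \qquad
A_1 =
\left[\begin{array}{cc}
0 & 1 \\ 1 & 0
\end{array}\right], \qquad
b_1 = 2, \qquad
X =
\left[\begin{array}{cc}
x_{11} & x_{12} \\ x_{12} & x_{22}
\end{array}\right].
\]
The equality constraint reads $A_1 \bullet X = 2 x_{12} = 2$, so
$x_{12} = 1$, and $X \succeq 0$ is then equivalent to $x_{11} \geq 0$,
$x_{22} \geq 0$, and $x_{11} x_{22} \geq 1$. By the arithmetic-geometric
mean inequality, $C \bullet X = x_{11} + x_{22} \geq 2 \sqrt{x_{11}
x_{22}} \geq 2$, with equality at $x_{11} = x_{22} = 1$. The optimal
value is therefore $2$, attained at the rank-one matrix with all
entries equal to one.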

398:

399: There are a number of sophisticated and reliable methods to

400: numerically solve semidefinite programming problems. One of the most

401: successful approaches is based on \emph{primal-dual interior point

402: methods}, that generalize many of the techniques used in linear

403: programming \cite{NN}. The interior-point approach to SDP typically

404: involves the iterative solution of a perturbed version of the KKT

405: optimality conditions. Each iteration requires the computation of the

406: corresponding Newton direction, and the solution of a system of linear

407: equations. A theoretical bound on the number of Newton iterations is

408: $O(\sqrt{n} \log \frac{1}{\epsilon})$ for an $\epsilon$-approximate

409: solution. This estimate is significantly more conservative than what is

410: usually experienced in practice, where the dependence on $n$ is very

411: mild (typically, 10--40 Newton iterations are enough for most

412: problems).  The cost of each iteration heavily depends on the

413: structure and sparsity of the matrices $A_i$, and is dominated by the

414: computation of the Hessian and the solution of the corresponding

415: linear system. In the fully dense case, this cost is of the order of

416: $\max\{mn^3,m^2n^2,m^3\}$, where the first two terms correspond to the

417: construction of the Hessian, and the last one to the solution of the

418: Newton system.

419:

420:

421: \paragraph{Sums of squares programming.}

422: Consider a given multivariate polynomial for which we want to decide

423: whether a sum of squares decomposition exists. This question is

424: equivalent to a semidefinite programming (SDP) problem, because of the

425: following result, that has appeared in different forms in the work of

426: Shor \cite{Shor}, Choi-Lam-Reznick \cite{ChoiLamReznick}, Nesterov

427: \cite{NesterovSquared}, and Parrilo \cite{Phd:Parrilo,sdprelax}.

428: \begin{theorem}

429: A homogeneous multivariate polynomial $p(x)$ of degree $2d$ is a sum

430: of squares if and only if

431: \begin{equation}

432: p(x) = (x^{[d]})^T Q x^{[d]},

433: \label{Par:sosrep}

434: \end{equation}

435: where $x^{[d]}$ is a vector whose entries are (possibly scaled)

436: monomials of degree $d$ in the variables $x_i$, and $Q$ is a symmetric

437: positive semidefinite matrix.

438: \end{theorem}

439: Since in general the entries of $x^{[d]}$ are not algebraically

440: independent, the matrix $Q$ in the representation (\ref{Par:sosrep})

441: \emph{is not unique}. In fact, there is an affine subspace of matrices

442: $Q$ that satisfy the equality, as can be easily seen by expanding the

443: right-hand side and equating term by term. To obtain an SOS

444: representation, we need to find a positive semidefinite matrix in this

445: affine subspace. Therefore, the problem of checking if a polynomial

446: can be decomposed as a sum of squares is \emph{equivalent} to

447: verifying whether a certain affine matrix subspace intersects the cone

448: of positive semidefinite matrices, and is hence an SDP feasibility problem.

449:

450: \begin{example}

451:   Consider the quartic homogeneous polynomial in two variables

452:   described below, and define the vector of monomials as $[ x^2, y^2,

453:   x y]^T$.

454: \begin{eqnarray*}

455: p(x,y) &=& 2 x^4 + 2 x^3 y  - x^2 y^2 + 5 y^4 \\

456: &=&

457: \left[\begin{array}{c}

458: x^2 \\  y^2 \\ x y

459: \end{array}\right]^T

460: \left[\begin{array}{ccc}

461: q_{11} & q_{12} & q_{13} \\

462: q_{12} & q_{22} & q_{23} \\

463: q_{13} & q_{23} & q_{33}

464: \end{array}\right]

465: \left[\begin{array}{c}

466: x^2 \\  y^2 \\ x y

467: \end{array}\right]\\

468: &=&

469: q_{11} x^4 + q_{22} y^4 + (q_{33} + 2 q_{12}) x^2 y^2 + 2 q_{13} x^3 y + 2 q_{23} x y^3

470: \end{eqnarray*}

471: For the left- and right-hand sides to be identical, the following

472: linear equations should hold:

473: \begin{equation}

474: q_{11} = 2, \quad

475: q_{22} = 5, \quad

476: q_{33} + 2 q_{12} = -1, \quad

477: 2 q_{13} = 2, \quad

478: 2 q_{23} = 0.

479: \end{equation}

480:

481: A positive semidefinite $Q$ that satisfies the linear equalities can

482: then be found using SDP. A particular solution is given by:

483: \[

484: Q =

485: \left[\begin{array}{rrr}

486: 2  & -3 & 1 \\ -3 & 5 & 0 \\ 1 & 0 & 5

487: \end{array}\right]

488: = L^T L, \qquad

489: L =

490: \frac{1}{\sqrt{2}}\left[\begin{array}{rrr}

491: 2 & -3 & 1 \\

492: 0 & 1 & 3

493: \end{array}\right],

494: \]

495: and therefore we have the sum of squares decomposition:

496: \[

497: p(x,y) = \frac{1}{2} (2 x^2 - 3 y^2 + x y)^2 +

498: \frac{1}{2}(y^2 + 3 x y)^2.

499: \]

500: \label{Par:ex:sosexample}

501: \hfill $\square$

502: \end{example}
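
To make the non-uniqueness of $Q$ concrete for the polynomial of the
example above, note that the five linear equations leave exactly one
degree of freedom among the six entries of $Q$. Introducing a
parameter $\lambda$ (a name used only in this illustration) and
setting $q_{12} = -\lambda$, the equations are satisfied by
\[
Q(\lambda) =
\left[\begin{array}{ccc}
2 & -\lambda & 1 \\
-\lambda & 5 & 0 \\
1 & 0 & 2 \lambda - 1
\end{array}\right], \qquad \lambda \in \R.
\]
The choice $\lambda = 3$ recovers the positive semidefinite matrix $Q$
given in the example. In contrast, $\lambda = 0$ yields $q_{33} = -1 <
0$, so $Q(0)$ is not positive semidefinite even though it represents
the same polynomial $p(x,y)$. The SDP feasibility problem amounts to
searching this one-parameter affine family for a positive semidefinite
member.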

503:

504:

505:

506:

507: \subsection{Norms and SOS polynomials}

508:

509: The procedure described above can be easily

510: adapted to the case where the polynomial $p(x)$ is not fixed, but

511: instead we search for an SOS polynomial in a given affine family (for

512: instance, all homogeneous polynomials of a given degree).

513:

514: This line of thought immediately suggests the following SOS relaxation

515: of the conditions in Theorem~\ref{thm:psdbound}:

516: \begin{equation}

517: \rho_{SOS,2d} :=

518: \inf_{p(x) \in \R_{2d}[x], \gamma} \gamma \qquad \mbox{s.t. }\left\{

519: \begin{array}{rl}

520: p(x) \, & \mbox{is SOS}\\

521: \gamma^{2d} p(x) - p(A_i x) \, & \mbox{is SOS}

522: \end{array}

523: \right.

524: \label{eq:SOSrelax}

525: \end{equation}

526: where $\R_{2d}[x]$ is the set of homogeneous polynomials of degree

527: $2d$.

528:

529: \begin{remark}

530: Theorem~\ref{thm:psdbound} requires a strictly positive polynomial

531: $p(x)$, so it would be natural to add some strict positivity condition

532: to the relaxation~(\ref{eq:SOSrelax}). For instance, one could require

533: for the polynomial $p(x)$ to belong to the relative interior of the

534: SOS cone.  However, since interior-point methods by construction

535: always produce solutions in the relative interior of the corresponding

536: convex set, this is automatically satisfied if the problem is

537: feasible. Alternatively, it is possible to give a formulation that

538: includes terms of the form $\epsilon ||x||^{2d}$, for small positive

539: $\epsilon$. These modifications are unnecessary in practice.

540: \end{remark}

541:

542: For any fixed degree $d$ and any given $\gamma$, the constraints in

543: this problem are all of SOS type, and thus equivalent to semidefinite

544: programming. Therefore, the computation of $\rho_{SOS,2d}$ is a

545: quasiconvex problem, and can be easily solved with a standard SDP

546: solver, and a simple bisection method for the scalar variable

547: $\gamma$. By Theorem~\ref{thm:psdbound}, the solution of this

548: relaxation yields an upper bound on the joint spectral radius

549: \begin{equation}

550: \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d},

551: \label{eq:trivbound}

552: \end{equation}

553: where $2d$ is the degree of the approximating polynomial.
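
The bisection mentioned above can be sketched as follows. This is
only an outline under stated assumptions: the initial bracket
$[\gamma_{\mathrm{lo}}, \gamma_{\mathrm{hi}}]$, the tolerance
$\delta > 0$, and the way the trivial solution $p \equiv 0$ is
excluded are choices made here for illustration.
\begin{enumerate}
\item Set $\gamma_{\mathrm{lo}} := \max_i \rho(A_i)$ and
$\gamma_{\mathrm{hi}} := \max_i ||A_i||_2$. The lower value is valid
because $\rho(A_i) \leq \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d}$. The
upper value is feasible with $p(x) = (x^T x)^d$: writing $a := \gamma^2
x^T x$ and $b := x^T A_i^T A_i x$, we have $a^d - b^d = (a-b)(a^{d-1} +
a^{d-2} b + \cdots + b^{d-1})$, and for $\gamma \geq ||A_i||_2$ the
forms $a$, $b$, and $a-b$ are positive semidefinite quadratic forms,
so the product is SOS.
\item While $\gamma_{\mathrm{hi}} - \gamma_{\mathrm{lo}} > \delta$,
set $\gamma := \frac{1}{2}(\gamma_{\mathrm{lo}} +
\gamma_{\mathrm{hi}})$ and solve the SDP feasibility problem of finding
a nonzero $p(x) \in \R_{2d}[x]$ such that $p(x)$ and $\gamma^{2d} p(x)
- p(A_i x)$, $i = 1,\ldots,m$, are SOS. If it is feasible, set
$\gamma_{\mathrm{hi}} := \gamma$; otherwise set $\gamma_{\mathrm{lo}}
:= \gamma$.
\item Return $\gamma_{\mathrm{hi}}$, which by
Theorem~\ref{thm:psdbound} is an upper bound on the joint spectral
radius whenever the computed $p(x)$ is strictly positive.
\end{enumerate}
Bisection is justified because feasibility is monotone in $\gamma$: if
$p(x)$ is feasible for $\gamma$ and $\gamma' \geq \gamma$, then
$\gamma'^{\,2d} p(x) - p(A_i x) = (\gamma'^{\,2d} - \gamma^{2d}) \, p(x)
+ (\gamma^{2d} p(x) - p(A_i x))$ is a sum of two SOS polynomials.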

554:

555:

556:

557:

558:

559: %

560: %

561: %

562:

563:

564: \subsection{Quality of approximation}

565:

566: What can be said about the quality of the bounds produced by the SOS

567: relaxation? We present next some results to answer this question; a

568: more complete characterization is developed in

569: Section~\ref{sec:goodbounds}. An inspiring result in this direction is

570: the following theorem of Barvinok, that quantifies how tightly SOS

571: polynomials can approximate norms:

572: \begin{theorem}[\cite{Barvinok}, p.~221]

573: \label{thm:Barvinok}

574: Let $||\cdot||$ be a norm in $\R^n$. For any integer $d \geq 1$ there

575: exists a homogeneous polynomial $p(x)$ in $n$ variables of degree $2d$

576: such that

577: \begin{enumerate}

578: \item The polynomial $p(x)$ is a sum of squares.

579: \item For all $x \in \R^n$,

580: \[

581: p(x)^\frac{1}{2d} \leq ||x|| \leq k(n,d) \,

582: p(x)^\frac{1}{2d},

583: \]

584: where $k(n,d) := \binom{n+d-1}{d}^{\frac{1}{2d}}$.

585: \end{enumerate}

586: \end{theorem}

587: For fixed state dimension $n$, by increasing the degree $d$ of the

588: approximating polynomials, the factor in the upper bound can be made

589: arbitrarily close to one. In fact, for large $d$, we have the

590: approximation

591: \[

592: k(n,d) \; \approx \; 1 + \frac{n-1}{2} \frac{\log d}{d}.

593: \]
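
A short derivation of this approximation, sketched here for fixed $n$
and $d \rightarrow \infty$ (with $\log$ denoting the natural
logarithm), is as follows. Since
\[
\binom{n+d-1}{d} = \binom{n+d-1}{n-1} =
\frac{(d+1)(d+2) \cdots (d+n-1)}{(n-1)!},
\]
which grows like $d^{n-1}/(n-1)!$, we obtain
\[
\log k(n,d) = \frac{1}{2d} \log \binom{n+d-1}{d} =
\frac{n-1}{2} \frac{\log d}{d} + O\!\left(\frac{1}{d}\right),
\]
and the stated formula follows from $e^t \approx 1 + t$ for small
$t$. As a numerical check, for $n=3$ and $d=10$ we have $k(3,10) =
66^{\frac{1}{20}} \approx 1.23$, while the approximation gives $1 +
\frac{\log 10}{10} \approx 1.23$.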

594:

595: %

596: %

597: %

598:

599: %

600: %

601: %

602: %

603: %

604: %

605: %

606: %

607: %

608: %

609: %

610: %

611: %

612: %

613: %

614: %

615: %

616: %

617: %

618: %

619: %

620: %

621: %

622: %

623: %

624: %

625: %

626:

627:

628: To apply these results to our problem, consider the following. If

629: $\rho(A_1,\ldots,A_m) < \gamma$, by Theorem~\ref{thm:RotaStrang} (and

630: sharper results in \cite{Bar88,Koz90,wirth}) there exists a norm

631: $\|\cdot\|$ such that

632: \[

633: ||A_i x|| \leq \gamma ||x||, \quad \forall x \in \R^n, i = 1,\ldots,m.

634: \]

635: By Theorem~\ref{thm:Barvinok}, we can therefore approximate this norm

636: with a homogeneous SOS polynomial $p(x)$ of degree $2d$ that will then

637: satisfy

638: \[

639: p(A_i x)^\frac{1}{2d}\leq ||A_i x|| \leq \gamma ||x||

640: \leq \gamma \, k(n,d) \, p(x)^\frac{1}{2d},

641: \]

642: and thus we know that there exists a feasible solution of

643: \[

644: \left\{

645: \begin{array}{rl}

646: p(x) \, & \mbox{is SOS}\\

647: \alpha^{2d} p(x) - p(A_i x) \, & \geq 0 \qquad i=1,\ldots,m,

648: \end{array}

649: \right.

650: \]

651: for $\alpha = k(n,d) \rho(A_1,\ldots,A_m)$.

652:

653: %

654: %

655: %

656:

657: Despite these appealing results, notice that in general we cannot yet

658: conclude from this that the proposed SOS relaxation will always obtain

659: a solution that is within a factor $k(n,d)^{-1}$ of the true joint spectral

660: radius. The reason is that even though we can prove the existence of a

661: $p(x)$ that is SOS and for which $\alpha^{2d} p(x) - p(A_i x)$ are

662: nonnegative for all $i$, it is unclear whether the last $m$

663: expressions are actually SOS. We will show later in the paper that

664: this is indeed the case. Before doing this, we concentrate first on

665: two important cases of interest, where the described approach

666: guarantees a good quality of approximation.

667:

668: \paragraph{Planar systems.}

669: The first case corresponds to two-dimensional (planar) systems, i.e.,

670: when $n=2$. In this case, it always holds that nonnegative homogeneous

671: bivariate polynomials are SOS (e.g., \cite{Reznick}). Thus, we have

672: the following result:

673: \begin{theorem}

674: Let $\{A_1,\ldots,A_m\} \subset \R^{2 \times 2}$. Then, the SOS

675: relaxation~(\ref{eq:SOSrelax}) always produces a solution satisfying:

676: \[

677: {\textstyle \frac{1}{2}} \rho_{SOS,2d} \leq

678: (d+1)^{-\frac{1}{2d}} \,

679: \rho_{SOS,2d} \leq \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d}.

680: \]

681: This result is \emph{independent} of the number $m$ of matrices.

682: \end{theorem}
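
The constants in this statement can be checked directly; the
following is a brief verification rather than a full proof. For $n=2$
we have
\[
k(2,d) = \binom{d+1}{d}^{\frac{1}{2d}} = (d+1)^{\frac{1}{2d}},
\]
and since nonnegative bivariate forms are SOS, the feasibility argument
preceding this theorem gives $\rho_{SOS,2d} \leq k(2,d) \,
\rho(A_1,\ldots,A_m)$, which is the middle inequality. The factor
$\frac{1}{2}$ follows from $d + 1 \leq 4^d$ for all $d \geq 1$, i.e.,
$(d+1)^{\frac{1}{2d}} \leq 2$. For example, $k(2,1) = \sqrt{2} \approx
1.41$, $k(2,2) = 3^{\frac{1}{4}} \approx 1.32$, and $k(2,3) =
4^{\frac{1}{6}} \approx 1.26$, so the guaranteed accuracy improves as
the degree increases.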

683:

684: \paragraph{Quadratic Lyapunov functions.}

685: In the quadratic case (i.e., $2d=2$), it is also true that nonnegative

686: quadratic forms are sums of squares. Since

687: \[

688: {\binom{n+d-1}{d}}^\frac{1}{2d} =

689: \binom{n}{1}^\frac{1}{2} = \sqrt{n},

690: \]

691: the inequality

692: \begin{equation}

693: \frac{1}{\sqrt{n}} \; \rho_{SOS,2} \leq \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2}

694: \label{eq:quadlyapbound}

695: \end{equation}

696: follows. This bound exactly coincides with the results of Ando and

697: Shih \cite{Ando98} or Blondel, Nesterov and Theys \cite{BlNT04}. This

698: is perhaps not surprising, since in this case both Ando and Shih's

699: proof \cite{Ando98} and Barvinok's theorem rely on the use of John's

700: ellipsoid to approximate the same underlying convex set.

701:

702:

703: \paragraph{Level sets and convexity.}

704: Unlike the norms that appear in Theorem~\ref{thm:RotaStrang}, an

705: appealing feature of the SOS-based method is that we are not

706: constrained to use polynomials with convex level sets. This enables in

707: some cases much better bounds than what is promised by the theorems

708: above, as illustrated in the following example.

709:

710: \begin{figure}[t]

711: \centering

712: \includegraphics[width=0.5\columnwidth]{jsrepsilon2}

713: \caption{Level sets of the quartic homogeneous polynomial

714: $V(x_1,x_2)$. These define a Lyapunov function, under which both $A_1$

715: and $A_2$ are $(1+\epsilon)$-contractive. The value of $\epsilon$ is

716: here equal to $0.01$.}

717: \label{fig:jsr}

718: \end{figure}

719:

720: \begin{example}

721: This is based on a construction by Ando and Shih

722: \cite{Ando98}. Consider the problem of proving a bound on the joint

723: spectral radius of the following matrices:

724: \[

725: A_1 =

726: \left[\begin{array}{cc}

727: 1 & 0 \\ 1 & 0

728: \end{array}\right], \qquad

729: A_2 =

730: \left[\begin{array}{rr}

731: 0 & 1 \\ 0 & -1

732: \end{array}\right].

733: \]

734: For these matrices, it can be easily shown that

735: $\rho(A_1,A_2)=1$. Using a common quadratic Lyapunov function (i.e.,

736: the case $2d=2$), the upper bound on the joint spectral radius is equal

737: to $\sqrt{2}$. However, a simple quartic SOS Lyapunov function is

738: enough to prove an upper bound of $1+\epsilon$ for every $\epsilon

739: >0$, since the SOS polynomial

740: \[

741: V(x) = (x_1^2-x_2^2)^2 + \epsilon (x_1^2+x_2^2)^2

742: \]

743: satisfies

744: \begin{eqnarray*}

745: (1+\epsilon) V(x) - V(A_1 x) &=& ( x_2^2-x_1^2+\epsilon (x_1^2+x_2^2) )^2 \\

746: (1+\epsilon) V(x) - V(A_2 x) &=& ( x_1^2-x_2^2+\epsilon (x_1^2+x_2^2) )^2.

747: \end{eqnarray*}

748: The corresponding level sets of $V(x)$ are plotted in

749: Figure~\ref{fig:jsr}, and are clearly non-convex.

750: \label{ex:ando}

751: \end{example}
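
The first identity in Example~\ref{ex:ando} can be checked by direct expansion. Since $A_1 x = [x_1, x_1]^T$, we have $V(A_1 x) = 4 \epsilon x_1^4$, and therefore
\begin{align*}
(1+\epsilon) V(x) - V(A_1 x)
&= (x_1^2-x_2^2)^2 + \epsilon \left[ (x_1^2-x_2^2)^2 + (x_1^2+x_2^2)^2 - 4 x_1^4 \right] + \epsilon^2 (x_1^2+x_2^2)^2 \\
&= (x_1^2-x_2^2)^2 - 2 \epsilon (x_1^2-x_2^2)(x_1^2+x_2^2) + \epsilon^2 (x_1^2+x_2^2)^2 \\
&= ( x_2^2-x_1^2+\epsilon (x_1^2+x_2^2) )^2,
\end{align*}
where the middle step uses $(x_1^2-x_2^2)^2 + (x_1^2+x_2^2)^2 - 4 x_1^4 = -2 (x_1^4 - x_2^4)$. The second identity follows in the same way from $A_2 x = [x_2, -x_2]^T$ and $V(A_2 x) = 4 \epsilon x_2^4$.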

752:

753:

754:

755: \section{Symmetric algebra and induced matrices}

756: \label{sec:symmalgebra}

757:

758: We present next some further bounds on the quality of the SOS

759: relaxation~(\ref{eq:SOSrelax}), either by a more refined analysis of

760: the SOS polynomials in Barvinok's theorem or by explicitly producing

761: an SOS Lyapunov function of guaranteed suboptimality properties. These

762: constructions are quite natural, and parallel some lifting ideas as

763: well as the classical iteration used in the solution of discrete-time

764: Lyapunov inequalities. Before proceeding further, we briefly revisit

765: some classical notions from multilinear algebra.

766:

767: \paragraph{Symmetric algebra of a vector space.}

768: Consider a vector $x \in \R^n$, and an integer $d \geq 1$. We define

769: its $d$-lift $x^{[d]}$ as a vector in $\R^N$, where $N :=

770: \binom{n+d-1}{d}$, with components $\{ \sqrt{\alpha !} \, x^\alpha

771: \}_\alpha$, where $\alpha = (\alpha_1,\ldots,\alpha_n)$, $|\alpha| :=

772: \sum_i \alpha_i = d$, and $\alpha !$ denotes the multinomial

773: coefficient $\alpha ! := \binom{d}{\alpha_1,\alpha_2,\ldots,\alpha_n}=

774: \frac{d!}{\alpha_1!  \alpha_2! \ldots \alpha_n!}$. That is, the

775: components of the lifted vector are the monomials of degree $d$,

776: scaled by the square root of the corresponding multinomial

777: coefficients.

778: \begin{example}

779: Let $n=2$, and $x = [u,v]^T$. Then, we have

780: \[

781: \left[\begin{array}{c} u \\ v \end{array}\right]^{[1]} =

782: \left[\begin{array}{c} u \\ v \end{array}\right], \qquad

783: \left[\begin{array}{c} u \\ v \end{array}\right]^{[2]} =

784: \left[\begin{array}{c} u^2 \\ \sqrt{2} u v \\ v^2 \end{array}\right], \qquad

785: \left[\begin{array}{c} u \\ v \end{array}\right]^{[3]} =

786: \left[\begin{array}{c} u^3 \\ \sqrt{3} u^2 v \\

787: \sqrt{3} u v^2 \\ v^3 \end{array}\right].

788: \]

789: \end{example}

790: The main motivation for this specific scaling of the components is to

791: ensure that the lifting preserves some of the properties of the

792: underlying normed space. In particular, if $||\cdot||$ denotes the

793: standard Euclidean norm, it can be easily verified that $||x^{[d]}|| =

794: ||x||^d$. Thus, the lifting operation provides a norm-preserving (up

795: to power) embedding of $\R^n$ into $\R^N$. When the original space is

796: projective, this is the so-called \emph{Veronese} embedding.
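
As a quick check of the norm identity, for $n=2$, $d=2$ and $x = [u,v]^T$ we have
\[
||x^{[2]}||^2 = u^4 + 2 u^2 v^2 + v^4 = (u^2+v^2)^2 = ||x||^4.
\]
In general the identity is a restatement of the multinomial theorem applied to the squared coordinates,
\[
||x^{[d]}||^2 = \sum_{|\alpha| = d} \alpha ! \, x_1^{2\alpha_1} \cdots x_n^{2\alpha_n} = (x_1^2 + \cdots + x_n^2)^d = ||x||^{2d},
\]
with $\alpha !$ the multinomial coefficient defined above.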

797:

798: This concept can be directly extended from vectors to linear

799: transformations. Consider a linear map in $\R^n$, and the associated

800: $n \times n$ matrix $A$. Then, the lifting described above naturally

801: induces an associated map in $\R^N$, that makes the corresponding

802: diagram commute.  The matrix representing this linear transformation

803: is the \emph{$d$-th induced matrix} of $A$, denoted by $A^{[d]}$,

804: which is the unique $N \times N$ matrix that satisfies

805: \[

806: A^{[d]} x^{[d]} = (A x)^{[d]}.

807: \]

808: In systems and control, these classical constructions of multilinear

809: algebra have been used under different names in several works, among

810: them \cite{BrockettLie,Zelen} and (implicitly) \cite{BlNes05}.

811: Although not mentioned in the control literature, there exists a

812: simple explicit formula for the entries of these induced matrices; see

813: \cite{MarcusMultilinear,MarcusMinc}.  The $d$-th induced matrix

814: $A^{[d]}$ has dimensions $N \times N$. Its entries are given by

815: \begin{equation}

816: (A^{[d]})_{\alpha \beta} = \frac{\mathrm{per}\,  A(\alpha,\beta)}{\sqrt{\mu(\alpha) \mu(\beta)}},

817: \label{eq:perm}

818: \end{equation}

819: where the indices $\alpha,\beta$ are all the $d$-element multisets of

820: $\{1,\ldots,n\}$, the notation $\mathrm{per}$ indicates the

821: \emph{permanent}\footnote{The permanent of a matrix $A \in \R^{n

822: \times n}$ is defined as $\textrm{per}(A):=\sum_{\sigma \in \Pi_n}

823: \prod_{i=1}^n a_{i,\sigma(i)}$, where $\Pi_n$ is the set of all

824: permutations in $n$ elements.} of a square matrix, and $\mu(S)$ is the

825: product of the factorials of the multiplicities of the elements of the

826: multiset $S$.

827: \begin{example}

828: Consider the case $n=2$, $d=3$. The corresponding 3-element multisets

829: are $\{1,1,1\}$, $\{1,1,2\}$, $\{1,2,2\}$ and $\{2,2,2\}$. The third

830: induced matrix is then

831: \begin{align*}

832: A^{[3]} &=

833: \begin{bmatrix}

834:                        a_{11}^3&          \sqrt{3} a_{11}^2 a_{12} &          \sqrt{3} a_{11} a_{12}^2 &                       a_{12}^3 \\

835:           \sqrt{3} a_{11}^2 a_{21}& a_{11} (a_{11} a_{22}+2 a_{21} a_{12})& a_{12} (2 a_{11} a_{22}+a_{21} a_{12}) &          \sqrt{3} a_{12}^2 a_{22} \\

836:           \sqrt{3} a_{11} a_{21}^2& a_{21} (2 a_{11} a_{22}+a_{21} a_{12})& a_{22} (a_{11} a_{22}+2 a_{21} a_{12}) &          \sqrt{3} a_{12} a_{22}^2 \\

837:                        a_{21}^3&          \sqrt{3} a_{21}^2 a_{22}  &          \sqrt{3} a_{21} a_{22}^2 &                       a_{22}^3 \\

838: \end{bmatrix}.

839: \end{align*}

840: %

841: %

842: %

843: %

844: %

845: %

846: %

847: %

848: %

849: %

850: %

851: %

852: %

853: \end{example}
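
As a check of formula~(\ref{eq:perm}), consider the entry of $A^{[3]}$ indexed by $\alpha = \beta = \{1,1,2\}$. We read $A(\alpha,\beta)$ as the $3 \times 3$ matrix whose rows and columns are those of $A$ indexed by $\alpha$ and $\beta$, repeated according to multiplicity; this reading is an assumption made here, but it is the one consistent with the matrix displayed in the example. Then
\[
A(\alpha,\beta) =
\begin{bmatrix}
a_{11} & a_{11} & a_{12} \\
a_{11} & a_{11} & a_{12} \\
a_{21} & a_{21} & a_{22}
\end{bmatrix},
\qquad
\mathrm{per}\, A(\alpha,\beta) = 2 a_{11}^2 a_{22} + 4 a_{11} a_{12} a_{21},
\]
and $\mu(\alpha) = \mu(\beta) = 2! \cdot 1! = 2$. Dividing the permanent by $\sqrt{\mu(\alpha)\mu(\beta)} = 2$ gives $a_{11}(a_{11} a_{22} + 2 a_{21} a_{12})$, which is the $(2,2)$ entry of the matrix above.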

854: It can be shown that these operations define an algebra homomorphism,

855: i.e., they respect the structure of matrix multiplication. In

856: particular, for any matrices $A,B$ of compatible dimensions, the

857: following identities hold:

858: \[

859: (A B)^{[d]} = A^{[d]} B^{[d]}, \qquad (A^{-1})^{[d]} = (A^{[d]})^{-1}.

860: \]

861: Furthermore, there is a simple and appealing relationship between the

862: eigenvalues of $A^{[d]}$ and those of $A$. Concretely, if

863: $\lambda_1,\ldots,\lambda_n$ are the eigenvalues of $A$, then the

864: eigenvalues of $A^{[d]}$ are given by $\prod_{j \in S} \lambda_j$,

865: where $S$ ranges over the $d$-element multisets of $\{1,\ldots,n\}$; there are exactly

866: $\binom{n+d-1}{d}$ such multisets. A similar relationship holds for

867: the corresponding eigenvectors. Essentially, as explained below in

868: more detail, the induced matrices are the symmetry-reduced

869: version of the $d$-fold Kronecker product.
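
For instance, for $n=2$ and $d=2$ the multisets are $\{1,1\}$, $\{1,2\}$ and $\{2,2\}$, so the eigenvalues of $A^{[2]}$ are $\lambda_1^2$, $\lambda_1 \lambda_2$ and $\lambda_2^2$. This can be cross-checked against the trace of the second induced matrix, obtained directly from the relation $A^{[2]} x^{[2]} = (A x)^{[2]}$:
\[
A^{[2]} =
\begin{bmatrix}
a_{11}^2 & \sqrt{2} a_{11} a_{12} & a_{12}^2 \\
\sqrt{2} a_{11} a_{21} & a_{11} a_{22} + a_{12} a_{21} & \sqrt{2} a_{12} a_{22} \\
a_{21}^2 & \sqrt{2} a_{21} a_{22} & a_{22}^2
\end{bmatrix},
\]
whose trace is $a_{11}^2 + a_{22}^2 + a_{11} a_{22} + a_{12} a_{21} = (\mathrm{tr}\, A)^2 - \det A = \lambda_1^2 + \lambda_1 \lambda_2 + \lambda_2^2$.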

870:

871: The symmetric algebra and associated induced matrices are classical

872: objects of multilinear algebra. Induced matrices, as defined above, as

873: well as the more usual \emph{compound matrices}, correspond to two

874: specific isotypic components of the decomposition of the $d$-fold

875: tensor product under the action of the symmetric group $S_d$ (i.e.,

876: the \emph{symmetric} and \emph{skew-symmetric} algebras).  Compound

877: matrices are associated with the alternating character (hence their

878: relationship with determinants), while induced matrices correspond

879: instead to the trivial character, thus the connection with

880: permanents. Similar constructions can be given for any other character

881: of the symmetric group, by replacing the permanent in (\ref{eq:perm})

882: with the suitable immanants; see \cite{MarcusMultilinear} for

883: additional details.

884:

885:

886: \subsection{Bounds on the quality of $\rho_{SOS,2d}$}

887: \label{sec:goodbounds}

888:

889: In this section we present a bound on the approximation properties of

890: the SOS approximation, based on the ideas introduced above. As we will

891: see, the techniques based on the lifting described will exactly yield

892: the factor $k(n,d)^{-1}$ suggested by Barvinok's theorem.

893:

894: We first prove a preliminary result on the behavior of the joint

895: spectral radius under $d$-lifting.  The scaling properties described

896: earlier can be applied to obtain the following:

897: \begin{lemma}

898: Given matrices $\{A_1,\ldots,A_m\} \subset \R^{n \times n}$ and an

899: integer $d \geq 1$, the following identity holds:

900: \[

901: \rho(A_1^{[d]},\ldots,A_m^{[d]}) = \rho(A_1,\ldots,A_m)^d.

902: \]

903: \label{lem:scalingjsr}

904: \end{lemma}

905: The proof follows directly from the definition~(\ref{eq:defjsr}) and

906: the two properties $(A B)^{[d]} = A^{[d]} B^{[d]}$, $||x^{[d]}|| =

907: ||x||^d$, and it is thus omitted.

908:

909: Combining all these inequalities, we obtain the main result of this paper:

910: \begin{theorem}

911: The SOS relaxation (\ref{eq:SOSrelax}) satisfies:

912: \begin{equation}

913: {\textstyle\binom{n+d-1}{d}}^{-\frac{1}{2d}} \; \rho_{SOS,2d} \leq \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d}.

914: \label{eq:sos2dbound}

915: \end{equation}

916: \label{thm:sos2dbound}

917: \end{theorem}

918: \begin{proof}

919: Since the dimension of $A_i^{[d]}$ is $\binom{n+d-1}{d}$, from

920: Lemma~\ref{lem:scalingjsr} and inequality (\ref{eq:quadlyapbound}) it

921: follows that:

922: \[

923: {\textstyle\binom{n+d-1}{d}}^{-\frac{1}{2}}

924: \; \rho_{SOS,2}(A_1^{[d]},\ldots,A_m^{[d]})

925: \leq

926: \rho(A_1^{[d]},\ldots,A_m^{[d]}) = \rho(A_1,\ldots,A_m)^d.

927: \]

928: Combining this with (\ref{eq:trivbound}) and the inequality (proven

929: later in Theorem~\ref{thm:3bounds}),

930: \[

931: \rho_{SOS,2d}(A_1,\ldots,A_m)^d \leq \rho_{SOS,2}(A_1^{[d]},\ldots,A_m^{[d]}),

932: \]

933: the result follows.

934: \end{proof}
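
Explicitly, writing $N = \binom{n+d-1}{d}$, the two displayed inequalities in the proof chain together as
\[
\rho_{SOS,2d}(A_1,\ldots,A_m)^d \leq \rho_{SOS,2}(A_1^{[d]},\ldots,A_m^{[d]}) \leq N^{\frac{1}{2}} \, \rho(A_1,\ldots,A_m)^d,
\]
and taking $d$-th roots yields the left-hand inequality in~(\ref{eq:sos2dbound}). As a numerical illustration, for $n=3$ and $d=2$ we have $N = \binom{4}{2} = 6$, so the guaranteed factor is $6^{-\frac{1}{4}} \approx 0.639$.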

935:

936:

937: \section{Sum of squares Lyapunov iteration}

938: \label{sec:soslyap}

939:

940: We describe next an alternative approach to obtain bounds on the

941: quality of the SOS approximation. As opposed to the results in the

942: previous section, the bounds now explicitly depend on the number of

943: matrices, but will usually be tighter in the case of small $m$.

944:

945: Consider the iteration defined by

946: \begin{equation}

947: V_0(x) = 0, \qquad V_{k+1}(x) = Q(x) + \frac{1}{\beta} \sum_{i=1}^m V_k(A_i x),

948: \label{eq:iteration}

949: \end{equation}

950: where $Q(x)$ is a fixed $n$-variate homogeneous polynomial of degree

951: $2d$ and $\beta > 0$.  The iteration defines an affine map in the

952: space of homogeneous polynomials of degree $2d$. As usual, the

953: iteration will converge under certain assumptions on the spectral

954: radius of the linear part of this map.

955: \begin{theorem}

956: The iteration defined in (\ref{eq:iteration}) converges for arbitrary

957: $Q(x)$ if $\rho(A_1^{[2d]} + \cdots + A_m^{[2d]}) < {\beta}$.

958: \label{thm:convergence}

959: \end{theorem}

960: \begin{proof}

961: The vector space of homogeneous polynomials $\R_{2d}[x_1,\ldots,x_n]$

962: is naturally isomorphic to the space of linear functionals on

963: $(\R^n)^{[2d]}$, via the identification $V_k(x) = \langle v_k ,

964: x^{[2d]} \rangle$, where $v_k \in \R^{\binom{n+2d-1}{2d}}$ is the

965: vector of (scaled) coefficients of $V_k(x)$. Then, since $V_k(A_i x) =

966: \langle v_k, (A_i x)^{[2d]} \rangle = \langle v_k, A_i^{[2d]} x^{[2d]}\rangle=

967: \langle (A_i^{[2d]})^T v_k, x^{[2d]}\rangle$, the iteration

968: (\ref{eq:iteration}) can be simply expressed as:

969: \[

970: v_{k+1} = q + \frac{1}{\beta} \left( \sum_{i=1}^m A_i^{[2d]} \right)^T v_{k},

971: \]

972: and it is well known that an affine iteration converges if the

973: spectral radius of the linear term is less than one.

974: \end{proof}
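
Unrolling the recursion makes the limit explicit. Writing $M := \sum_{i=1}^m A_i^{[2d]}$ (a shorthand used only in this remark), and starting from $v_0 = 0$, we obtain
\[
v_k = \sum_{j=0}^{k-1} \left( \tfrac{1}{\beta} M^T \right)^j q,
\]
which is a partial sum of a Neumann series. When $\rho(M) < \beta$, the series converges and its limit is $\left(I - \frac{1}{\beta} M^T\right)^{-1} q$.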

975:

976: For simplicity of notation, we define the following quantity,

977: corresponding to the spectral radius of the sum of the $2d$-lifted

978: matrices:

979: \begin{equation}

980: \rho_{SR,2d} := \rho(A_1^{[2d]} + \cdots + A_m^{[2d]})^\frac{1}{2d}.

981: \label{eq:rhold}

982: \end{equation}

983:

984: \begin{theorem}

985: \label{thm:sosvsnesterov}

986: The following inequality holds:

987: \[

988: \rho_{SOS,2d} \leq \rho_{SR,2d}

989: \]

990: \end{theorem}

991: \begin{proof}

992: Choose a $Q(x)$ that is in the interior of the SOS cone, e.g., $Q(x)

993: := (\sum_{i=1}^n x_i^2)^d$, and let $\beta = \rho(A_1^{[2d]} + \cdots

994: + A_m^{[2d]})+\epsilon$. The iteration~(\ref{eq:iteration}) guarantees

995: that $V_{k+1}$ is SOS if $V_{k}$ is. By induction, all the iterates

996: $V_k$ are SOS.  By the choice of $\beta$ and

997: Theorem~\ref{thm:convergence}, the $V_k$ converge to some homogeneous

998: polynomial $V_\infty(x)$. By the closedness of the cone of SOS

999: polynomials, the limit $V_\infty$ is also SOS.  Furthermore, we have

1000: \[

1001: \beta V_\infty(x) - V_\infty(A_i x) = \beta Q(x) + \sum_{j \not = i} V_\infty (A_j x)

1002: \]

1003: and therefore the expression on the left-hand side is SOS. This

1004: implies that $p(x):=V_\infty(x)$ is a feasible solution of the SOS

1005: relaxation (\ref{eq:SOSrelax}). Taking $\epsilon \rightarrow 0$, the

1006: result follows.

1007: \end{proof}

1008: Notice that if the spectral radius condition in

1009: Theorem~\ref{thm:convergence} is satisfied, then for any fixed $Q(x)$

1010: the corresponding limit $V_\infty(x) = \langle v_\infty,

1011: x^{[2d]}\rangle$ can be simply obtained by solving the nonsingular

1012: system of linear equations

1013: \[

1014: \left(I- \frac{1}{\beta}\sum_{i=1}^m A_i^{[2d]} \right)^T v_\infty = q,

1015: \]

1016: thus generalizing the standard Lyapunov equation. The iteration

1017: argument is only used to prove that the solution of this linear system

1018: yields a strictly positive SOS polynomial. A slightly different

1019: approach here is via the finite-dimensional version of the

1020: Krein-Rutman theorem (or generalized Perron-Frobenius); see for

1021: instance \cite{Protasov1} or \cite{ParriloKhatri}.

1022:

1023: \begin{theorem}

1024: The SOS relaxation (\ref{eq:SOSrelax}) satisfies:

1025: \[

1026: m^{-\frac{1}{2d}} \, \rho_{SOS,2d} \leq \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d}.

1027: \]

1028: \label{thm:msos2dbound}

1029: \end{theorem}

1030: \begin{proof}

1031: This follows directly from inequality~(\ref{eq:trivbound}), and the fact that

1032: \begin{align*}

1033: \rho_{SOS,2d} &\leq \rho\left(\sum_{i=1}^m A_i^{[2d]}\right)^\frac{1}{2d} \\

1034: &\leq  m^\frac{1}{2d} \cdot \rho (A_1^{[2d]}, \ldots,  A_m^{[2d]})^\frac{1}{2d} \\

1035: &= m^\frac{1}{2d} \cdot \rho \left(A_1, \ldots , A_m \right),

1036: \end{align*}

1037: where the first inequality is Theorem~\ref{thm:sosvsnesterov}, the

1038: second one follows from the general fact that $\rho(A_1+\cdots+A_m)

1039: \leq m \rho(A_1,\ldots,A_m)$ (see e.g., Corollary 1 in \cite{BlNes05}), and

1040: the final equality from Lemma~\ref{lem:scalingjsr}.

1041: \end{proof}
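
The scalar case $n=1$ gives a simple illustration of the quantities involved. If $A_i = a_i \in \R$, then $A_i^{[2d]} = a_i^{2d}$ and $\rho(A_1,\ldots,A_m) = \max_i |a_i|$, so that
\[
\rho_{SR,2d} = \Big( \sum_{i=1}^m a_i^{2d} \Big)^{\frac{1}{2d}} \leq m^{\frac{1}{2d}} \, \max_i |a_i|,
\]
with equality exactly when all the $|a_i|$ coincide. In particular, the factor $m^{\frac{1}{2d}}$ relating $\rho_{SR,2d}$ and the joint spectral radius cannot be improved in general.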

1042: The iteration~(\ref{eq:iteration}) is the natural generalization of

1043: the Lyapunov recursion for the single matrix case, and of the

1044: construction by Ando and Shih in \cite{Ando98} for the quadratic

1045: case. By the remarks in Section~\ref{sec:symmalgebra} above, and as

1046: described in more detail in the next section, it can be shown that the

1047: quantity $\rho_{SR,2d}$ is essentially equal to those defined by

1048: Protasov in \cite[\S 4]{Protasov1} and Blondel and Nesterov in

1049: \cite{BlNes05}. As a consequence of Theorem~\ref{thm:sosvsnesterov},

1050: the SOS-based approach will \emph{always} produce estimates at least

1051: as good as the ones given by these procedures.

1052:

1053: \section{Comparison with earlier techniques}

1054: \label{sec:comparison}

1055:

1056: In this section we compare the $\rho_{SOS,2d}$ approach with some

1057: earlier bounds from the literature. We show that our bound is never

1058: weaker than those obtained by any of the other procedures.

1059:

1060: \subsection{Methods of Protasov and Blondel-Nesterov}

1061:

1062: Protasov \cite{Protasov1} has shown that an upper bound on the

1063: ``standard'' joint spectral radius can be computed via the so-called

1064: joint $p$-radius, a generalization of the definition~(\ref{eq:defjsr})

1065: involving $p$-norms. Furthermore, he has shown that in the case of

1066: even integer $p$, the value of the $p$-radius of an irreducible finite

1067: set of matrices exactly corresponds to the spectral radius of a single

1068: operator, that can in principle be constructed based on the matrices

1069: $A_i$.

1070:

1071: Independently, Blondel and Nesterov \cite{BlNes05} developed a

1072: technique based on the calculation of the spectral radius of

1073: ``lifted'' matrices. In fact, they present two different lifting

1074: procedures (``Kronecker'' and ``semidefinite'' liftings), and in

1075: Section~5 of their paper, they describe a family of bounds obtained by

1076: arbitrary combinations of these two liftings.

1077:

1078: Both of these methods are in fact equivalent to our construction of

1079: $\rho_{SR,2d}$ in Section~\ref{sec:soslyap}, in the sense that they

1080: all yield exactly the same numerical value. By

1081: Theorem~\ref{thm:sosvsnesterov}, they are thus never tighter than the

1082: SOS-based construction.  The bound defined by $\rho_{SR,2d}$

1083: in~(\ref{eq:rhold}) relies on a single canonically defined lifting,

1084: and requires much less numerical effort than the Blondel-Nesterov

1085: construction. Furthermore, instead of the somewhat more complicated

1086: construction of Protasov, the entries of the lifted

1087: matrices are given by the simple formula~(\ref{eq:perm}), making a

1088: computer implementation straightforward, with no irreducibility

1089: assumptions being required.

1090:

1091: It can be shown that our construction (or Protasov's) exactly

1092: corresponds to a fully symmetry-reduced version of the

1093: Blondel-Nesterov procedure, thus yielding equivalent bounds, but at a

1094: much smaller computational cost since the corresponding matrices are

1095: exponentially smaller (for fixed $n$, the size grows as $O(d^{n-1})$

1096: as opposed to $O(n^{2d})$). Therefore, even if no SDPs are to be

1097: solved (as would be required by the tighter bound $\rho_{SOS,2d}$),

1098: the formulation in terms of the matrices $A_i^{[2d]}$ still has many

1099: advantages.

1100:

1101: \begin{table}[t]

1102: \begin{center}

1103: \begin{tabular}{|c|c || c|c || c|c || c|c|}

1104: \hline

1105: & & \multicolumn{2}{c||}{ \cite{BlNes05}, Kronecker } &

1106: \multicolumn{2}{c||}{ \cite{BlNes05}, semidefinite } &

1107: \multicolumn{2}{c|}{This paper} \\

1108: \cline{3-8}

1109: Steps / $2d$ &Accuracy&$n=2$& $n=10$ &$n=2$ & $n=10$ & $n=2$ & $n=10$ \\

1110: \hline\hline

1111: 1 / 2  & 0.707 & 4 & 100 & 3 & 55 & 3 & 55  \\

1112: 2 / 4  & 0.840 & 16 & 10000 & 6 & 1540 & 5 & 715 \\

1113: 3 / 8  & 0.917 & 256 & $10^8$ & 21 & 1186570 & 9 & 24310 \\

1114: 4 / 16 & 0.957 & 65536 & $10^{16}$ & 231 & $7.04 \times 10^{11}$ & 17 & 2042975 \\

1115: 5 / 32 & 0.978 & $4.29\times 10^9$ & $10^{32}$ &   26796 & $2.48 \times 10^{23}$& 33 & $3.5 \times 10^8$  \\

1116: \hline

1117: \end{tabular}

1118: \end{center}

1119: \caption{Comparison of matrix sizes for the different lifting

1120: procedures to compute $\rho_{SR,2d}$. The matrix size for the

1121: Kronecker lifting is $n^{2d}$, while that of the recursive semidefinite

1122: lifting is given by the $d$-step recursion $s_{2k} = \binom{s_k+1}{2}$

1123: with $s_1=n$, and the size for the symmetric algebra approach is

1124: $\binom{n+2d-1}{2d}$. The accuracy estimates correspond to the case of

1125: two matrices, i.e., $m=2$.}

1126: \label{tab:BNtwo}

1127: \end{table}

1128: As an illustrative comparison of the advantages of this reduced

1129: formulation, in Table~\ref{tab:BNtwo} we present the sizes of the

1130: matrices required by the method in~\cite{BlNes05} (using the

1131: ``Kronecker'' and ``recursive semidefinite'' liftings) and our

1132: approach to $\rho_{SR,2d}$ via the symmetric algebra. The data in

1133: Table~\ref{tab:BNtwo} corresponds to that in~\cite[p.~266]{BlNes05}

1134: (with a minor misprint corrected).
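
For example, the entries in the row $2d=4$ for $n=10$ can be reproduced from the formulas in the caption of Table~\ref{tab:BNtwo}:
\[
n^{2d} = 10^4 = 10000, \qquad
s_2 = \binom{11}{2} = 55, \;\; s_4 = \binom{56}{2} = 1540, \qquad
\binom{n+2d-1}{2d} = \binom{13}{4} = 715.
\]
The accuracy column appears to correspond to $m^{-\frac{1}{2d}}$ with $m=2$; for this row, $2^{-\frac{1}{4}} \approx 0.84$.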

1135:

1136:

1137: \subsection{Common quadratic Lyapunov functions}

1138:

1139: This method corresponds to finding a common quadratic Lyapunov

1140: function, either directly for the matrices $A_i$, or for the lifted

1141: matrices $A_i^{[d]}$. Specifically, let

1142: \[

1143: \rho_{CQ,2d} := \inf \, \left \{ \; \gamma \; \; | \; \; \gamma^{2d} P -

1144: (A_i^{[d]})^T P A_i^{[d]} \succeq 0, \quad P \succ 0 \right \}.

1145: \]

1146: This is essentially equivalent to what is discussed in Corollary 3 of

1147: \cite{BlNes05}, except that the matrices involved in our approach are

1148: exponentially smaller (of size $\binom{n+d-1}{d}$ rather than $n^d$),

1149: as all the symmetries have been taken out\footnote{There seems to be a

1150: typo in equation (7.4) of \cite{BlNes05}, as all the terms $A_i^k$

1151: should likely read $A_i^{\otimes k}$.}. Notice also that, as a

1152: consequence of their definitions, we have

1153: \[

1154: \rho_{CQ,2d}(A_1,\ldots,A_m)^d = \rho_{SOS,2}(A_1^{[d]},\ldots,A_m^{[d]}).

1155: \]
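To see this, recall that nonnegative quadratic forms are sums of squares, so $\rho_{SOS,2}$ of the lifted matrices is the infimum of the values $\gamma'$ for which ${\gamma'}^{2} P - (A_i^{[d]})^T P A_i^{[d]} \succeq 0$ holds for all $i$ and some $P \succ 0$. The substitution $\gamma' = \gamma^d$ then gives
\[
{\gamma'}^{2} P - (A_i^{[d]})^T P A_i^{[d]} \succeq 0
\quad \Longleftrightarrow \quad
\gamma^{2d} P - (A_i^{[d]})^T P A_i^{[d]} \succeq 0,
\]
which is exactly the condition appearing in the definition of $\rho_{CQ,2d}$.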

1156:

1157: We can then collect most of these results in a single theorem:

1158: \begin{theorem}

1159: The following inequalities between all the bounds hold:

1160: \begin{equation}

1161: \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d} \leq \rho_{CQ,2d} \leq

1162: \rho_{SR,2d}.

1163: \label{eq:ineqs}

1164: \end{equation}

1165: \label{thm:3bounds}

1166: \end{theorem}

1167: \begin{proof}

1168: The left-most inequality is~(\ref{eq:trivbound}). The right-most

1169: inequality follows from a similar (but stronger) argument to the one

1170: given in Theorem~\ref{thm:sosvsnesterov} above, since the spectral

1171: radius condition $\rho(A_1^{[2d]}+\cdots + A_m^{[2d]})< \beta$

1172: actually implies the convergence of the matrix iteration in

1173: $\mathcal{S}^{N}$ given by

1174: \[

1175: P_{k+1} = Q + \frac{1}{\beta} \sum_{i=1}^m (A_i^{[d]})^T P_k A_i^{[d]}, \qquad P_0 = I.

1176: \]

1177:

1178: For the middle inequality, let $p(x):= (x^{[d]})^T P

1179: x^{[d]}$. Since $P \succ 0$, it follows that $p(x)$ is SOS. From

1180: $\gamma^{2d} P - (A_i^{[d]})^T P A_i^{[d]} \succeq 0$, left- and

1181: right-multiplying by $x^{[d]}$, we have that $\gamma^{2d} p(x) - p(A_i

1182: x)$ is also SOS, and thus $p(x)$ is a feasible solution

1183: of~(\ref{eq:SOSrelax}), from where the result directly follows.

1184: \end{proof}
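
The two SOS claims used for the middle inequality can be made explicit. Factor $P = L^T L$ (for instance, via a Cholesky factorization) and $\gamma^{2d} P - (A_i^{[d]})^T P A_i^{[d]} = R_i^T R_i$; the names $L$ and $R_i$ are introduced only for this remark. Using $(A_i x)^{[d]} = A_i^{[d]} x^{[d]}$, we obtain
\[
p(x) = || L \, x^{[d]} ||^2, \qquad
\gamma^{2d} p(x) - p(A_i x) = (x^{[d]})^T \left( \gamma^{2d} P - (A_i^{[d]})^T P A_i^{[d]} \right) x^{[d]} = || R_i \, x^{[d]} ||^2,
\]
and each right-hand side is a sum of squares of forms of degree $d$ in $x$.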

1185:

1186: \begin{remark}

1187: We always have $\rho_{SOS,2} = \rho_{CQ,2}$, since both correspond

1188: to the case of a common quadratic Lyapunov function for the matrices $A_i$.

1189: \end{remark}

1190:

1191: \subsection{Computational cost}

1192: In this section we quantify the computational cost of the bound

1193: $\rho_{SOS,2d}$. In the following calculations we keep $d$ fixed, and

1194: study the scaling behavior as a function of the dimension $n$.

1195:

1196: As mentioned in Section~\ref{sec:sosnorms}, solving a semidefinite

1197: programming problem typically requires several Newton iterations, with

1198: the cost of each iteration being dominated by the construction of the

1199: Hessian and solution of the corresponding linear system. For the SOS

1200: bound $\rho_{SOS,2d}$, the underlying SDP problem has $m+1$ matrix

1201: inequalities corresponding to the SOS constraints

1202: in~(\ref{eq:SOSrelax}), each of dimension $\binom{n+d-1}{d} \approx

1203: \frac{1}{d!} \cdot n^d$, which is $O(n^d)$ for fixed $d$. The number of

1204: decision variables is approximately $m \cdot \binom{n+2d-1}{2d}

1205: \approx m \cdot n^{2d}$. Thus, using a simple bisection method for

1206: $\gamma$, exploiting the block-diagonal structure, and the fact that

1207: the number of Newton iterations is essentially constant, we obtain

1208: that the approximate cost of obtaining an $\epsilon$-approximate

1209: solution of $\rho_{SOS,2d}$ is $O(m \cdot n^{6d} \cdot \log

1210: \frac{1}{\epsilon})$, where $d$ is chosen such that $\epsilon \approx

1211: \frac{n}{2} \frac{\log d}{d}$ when using the bound that is independent of the

1212: number of matrices (Theorem~\ref{thm:sos2dbound}), or $\epsilon \approx

1213: 1 - m^{-\frac{1}{2d}} \approx \frac{\log m}{2d}$ when using the bound that

1214: depends on it (Theorem~\ref{thm:msos2dbound}).
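
To give a sense of scale, consider $n=4$. The dimension of each SOS constraint and the number of coefficients of a form of degree $2d$ are then
\[
\binom{n+d-1}{d} = 10, \quad \binom{n+2d-1}{2d} = 35 \quad (d=2), \qquad
\binom{n+d-1}{d} = 20, \quad \binom{n+2d-1}{2d} = 84 \quad (d=3),
\]
so these problems remain small for moderate $n$ and $d$, even though both quantities grow polynomially in $n$ with an exponent proportional to $d$.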

1215:

1216: We remark that these quantities are a relatively coarse estimate of

1217: the best possible algorithmic complexity, since very little structure

1218: of the corresponding SDP problem is being exploited. It is known that

1219: for structured problems such as the ones appearing here much more

1220: efficient SDP-based algorithms can be developed. In particular, in the

1221: context of sum of squares problems several techniques are known to

1222: exploit some of the available structure for more efficient

1223: computation; see \cite{GHNV,LofbergParrilo,RohVandenberghe}.

1224:

1225: \subsection{Examples}

1226: We present next two numerical examples that compare the described

1227: techniques. In particular, we show that the bounds in

1228: Theorem~\ref{thm:3bounds} can all be strict.

1229:

1230: \begin{example}

1231: Here we revisit the construction presented earlier in

1232: Example~\ref{ex:ando}. For the matrices given there we have:

1233: \begin{align*}

1234: \rho_{SOS,2} &= \sqrt{2}, &

1235: \rho_{CQ,2}&= \sqrt{2}, &

1236: \rho_{SR,2d} &= \sqrt[2d]{2},

1237: \\

1238: \rho_{SOS,4} &= 1, &

1239: \rho_{CQ,4}&= 1. &

1240: %

1241: \end{align*}

1242: \end{example}
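
The value of $\rho_{SR,2}$ in this example can be verified by hand. Computing the second induced matrices from the defining relation $A^{[2]} x^{[2]} = (A x)^{[2]}$ gives
\[
A_1^{[2]} + A_2^{[2]} =
\begin{bmatrix} 1 & 0 & 0 \\ \sqrt{2} & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}
+
\begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & -\sqrt{2} \\ 0 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 1 \\ \sqrt{2} & 0 & -\sqrt{2} \\ 1 & 0 & 1 \end{bmatrix},
\]
whose characteristic polynomial is $\lambda^2 (\lambda - 2)$. The spectral radius of the sum is therefore $2$, and $\rho_{SR,2} = 2^{\frac{1}{2}} = \sqrt{2}$, in agreement with the expression $\sqrt[2d]{2}$ for $d=1$.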

1243:

1244: \begin{example}

1245: \label{ex:threemats}

1246: Consider the three $4 \times 4 $ matrices (randomly generated) given by:

1247: \[

1248: A_1 =

1249: \left[

1250: \begin{array}{rrrr}

1251:      0  &   1  &    7  &    4 \\

1252:      1  &   6  &   -2  &   -3 \\

1253:     -1  &  -1  &   -2  &   -6 \\

1254:      3  &   0  &    9  &    1

1255: \end{array}

1256: \right],

1257: \quad

1258: A_2 =

1259: \left[

1260: \begin{array}{rrrr}

1261:     -3  &    3  &    0  &   -2\\

1262:     -2  &    1  &    4  &    9\\

1263:      4  &   -3  &    1  &    1\\

1264:      1  &   -5  &   -1  &   -2

1265: \end{array}

1266: \right],

1267: \quad

1268: A_3 =

1269: \left[

1270: \begin{array}{rrrr}

1271:      1   &   4  &    5  &   10 \\

1272:      0   &   5  &    1  &   -4 \\

1273:      0   &  -1  &    4  &    6 \\

1274:     -1   &   5  &    0  &    1

1275: \end{array}

1276: \right].

1277: \]

1278: The values of the different approximations are presented in

1279: Table~\ref{tab:comparison}. A lower bound is $\rho(A_1

1280: A_3)^\frac{1}{2} \approx 8.9149$, which is extremely close (and

1281: perhaps exactly equal) to the upper bound $\rho_{SOS,4}$. Notice from

1282: the $d=2$ entry of Table~\ref{tab:comparison} that all the

1283: inequalities~(\ref{eq:ineqs}) can be strict.

1284:

1285:

1286: \begin{table}[t]

1287: \begin{center}

1288: \begin{tabular}{|c|cc|ccc|}

1289: \hline $d$ & $\dim A_i^{[d]}$ & $\dim A_i^{[2d]}$ & $\rho_{SOS,2d}$ & $\rho_{CQ,2d}$ & $\rho_{SR,2d}$ \\ \hline

1290: 1 & 4 & 10 & 9.761 & 9.761 & 12.519 \\

1291: 2 & 10 & 35 & 8.92 & 9.01 & 9.887 \\

1292: 3 & 20 & 84 & 8.92 & 8.92 & 9.3133 \\ \hline

1293: \end{tabular}

1294: \end{center}

1295: \caption{Comparison of the different approximations for Example~\ref{ex:threemats}.}

1296: \label{tab:comparison}

1297: \end{table}

1298: \end{example}
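
It is instructive to compare these values with the a priori guarantees. For $n=4$, $m=3$ and $d=2$, Theorems~\ref{thm:sos2dbound} and~\ref{thm:msos2dbound} only ensure the factors
\[
{\textstyle\binom{5}{2}}^{-\frac{1}{4}} = 10^{-\frac{1}{4}} \approx 0.562
\qquad \mbox{and} \qquad
3^{-\frac{1}{4}} \approx 0.760,
\]
respectively, whereas the lower bound $8.9149$ and the computed value $\rho_{SOS,4} \approx 8.92$ give an observed ratio of about $8.9149 / 8.92 \approx 0.999$. In this example the relaxation is thus far more accurate than the worst-case analysis suggests.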

1299:

1300: \section{Conclusions}

1301: \label{sec:conclusions}

1302:

1303: We introduced a novel scheme for the approximation of the joint

1304: spectral radius of a set of matrices using sum of squares

1305: programming. The method is based on the use of a multivariate

1306: polynomial to provide a norm-like quantity under which all matrices

1307: are contractive. We provided an asymptotically tight estimate for the

1308: quality of the bound, which is independent of the number of

1309: matrices. We also proposed an alternative bound, that depends on the

1310: number $m$ of matrices, based on a generalization of a Lyapunov

1311: iteration.

1312:

1313: Our results can be alternatively interpreted in a simpler way as

1314: providing a trajectory-preserving lifting to a higher dimensional

1315: space, and proving contractiveness with respect to an ellipsoidal norm

1316: in that space. In this case, a weaker estimate can be obtained by

1317: computing the spectral radius of a fixed matrix.  These results

1318: generalize earlier work of Ando and Shih~\cite{Ando98}, Blondel,

1319: Nesterov and Theys~\cite{BlNT04}, and provide an improvement over the

1320: lifting procedure of Blondel and Nesterov~\cite{BlNes05}. The good

1321: performance of our procedure was also verified using numerical

1322: examples.

1323:

1324: \paragraph{Acknowledgement}

1325: We thank the referees for their careful reading of the manuscript, and

1326: their many useful suggestions.

1327:

1328: %

1329: %

1330:

1331: \bibliographystyle{alpha}

1332: \bibliography{jsr}

1333:

1334: %

1335: %

1336: \end{document}

1337: