1: %
2: \documentclass[11pt,letterpaper]{article}
3: %
4: %
5: %
6: %
7: \usepackage{graphicx}
8: \usepackage{amsthm}
9: \usepackage{amssymb}
10:
11: \usepackage[margin=1in]{geometry}
12:
13: %
14: %
15: \usepackage{mathtools}
16:
17:
18: %
19:
20: \newtheorem{theorem}{Theorem}[section]
21: \newtheorem{lemma}[theorem]{Lemma}
22: \newtheorem{corollary}[theorem]{Corollary}
\newtheorem{proposition}[theorem]{Proposition}
24: \newtheorem{conjecture}[theorem]{Conjecture}
25: \newtheorem{algorithm}[theorem]{Algorithm}
26: \newtheorem{definition}[theorem]{Definition}
27: \newtheorem{remark}[theorem]{Remark}
28: \newtheorem{example}[theorem]{Example}
29: \newtheorem{question}{Question}
30: \newtheorem{note}[theorem]{Note}
31:
32: \graphicspath{{figures/}}
33:
34:
35: \newcommand{\R}{{\mathbb R}}
36: \newcommand{\C}{{\mathbb C}}
37: \newcommand{\Z}{{\mathbb Z}}
38: \newcommand{\Q}{{\mathbb Q}}
39: \newcommand{\N}{{\mathbb N}}
40: %
41: %
42: %
43: %
44: %
45: %
46: %
47: %
48: \newcommand{\cf}{{\it cf.}}
49: \newcommand{\eg}{{\it e.g.}}
50: \newcommand{\ie}{{\it i.e.}}
51: \newcommand{\etc}{{\it etc.}}
52: \newcommand{\ones}{\mathbf 1}
53: \newcommand{\reals}{{\mbox{\bf R}}}
54: \newcommand{\diag}{\mathop{\bf diag}}
55: \newcommand{\argmin}{\mathop{\mathrm{argmin}}}
56: %
57: \newcommand{\todo}[1]{\vspace{5 mm}\par \noindent \marginpar{\textsc{ToDo}}
58: \framebox{\begin{minipage}[c]{0.95 \columnwidth}
59: \tt #1 \end{minipage}}\vspace{5 mm}\par}
60: \newcommand{\half}{{\textstyle\frac{1}{2} }}
61: \newcommand{\sqtwo}{{\textstyle\frac{1}{\sqrt 2 } }}
62: \newcommand{\third}{{\textstyle\frac{1}{3} }}
63: \title{Approximation of the joint spectral radius \\ using sum of squares}
64: \author{Pablo A. Parrilo\thanks{Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, \texttt{parrilo@mit.edu}}
65: \and
66: Ali Jadbabaie\thanks{GRASP Laboratory, University of Pennsylvania, \texttt{jadbabai@seas.upenn.edu}}}
67:
68: \date{\today}
69: %
70:
71: %
72: %
73: %
74: %
75: %
76: %
77: %
78:
79: %
80: %
81: \date{}
82: %
83: %
84:
85: \begin{document}
86: \maketitle
87: %
88: %
89:
90: \begin{abstract}
91: We provide an asymptotically tight, computationally efficient
92: approximation of the joint spectral radius of a set of matrices using
93: sum of squares (SOS) programming. The approach is based on a search
94: for an SOS polynomial that proves simultaneous contractibility of a
95: finite set of matrices. We provide a bound on the quality of the
96: approximation that unifies several earlier results and is independent
97: of the number of matrices. Additionally, we present a comparison
98: between our approximation scheme and earlier techniques, including the
99: use of common quadratic Lyapunov functions and a method based on
100: matrix liftings. Theoretical results and numerical investigations show
101: that our approach yields tighter approximations.
102: \end{abstract}
103:
104:
105: \section{Introduction}
106:
107: Stability of discrete linear inclusions has been a topic of major
108: research over the past two decades. Such systems can be represented as
109: a switched linear system of the form $x(k+1) = A_{\sigma(k)} x(k)$,
110: where $\sigma$ is a mapping from the integers to a given set of
111: indices. The above model, and its many variations, has been studied
112: extensively across multiple disciplines including control theory,
113: theory of non-negative matrices and Markov chains, subdivision schemes
114: and wavelet theory, dynamical systems, etc. The fundamental question
115: of interest is to determine whether $x(k)$ converges to a limit, or
116: equivalently, whether the infinite matrix products chosen from the set
117: of matrices converge~\cite{BeWa92,DaLa92,DaLa01}. The research on
convergence of infinite products of matrices spans four
decades. Most results in this area have been obtained for the
special case of non-negative and/or stochastic matrices. A
121: non-exhaustive list of related research providing several necessary
122: and sufficient conditions for convergence of infinite products and
123: their applications
124: includes~\cite{CH94,DaLa01,Leiz92,ShuWuPa97}. Despite the wealth of
125: research in this area, finding algorithms that can unambiguously
decide convergence remains elusive. Much of the difficulty of this
problem stems from the hardness of computing, or even efficiently
approximating, the joint spectral radius of a finite set of
matrices. This notion was introduced by Rota and Strang \cite{RoSt60}
130: via the definition
131: \begin{equation}
132: \rho(A_1,\ldots,A_m) := \lim_{k \rightarrow \infty}
133: \max_{\sigma \in \{1,\ldots,m\}^k} || A_{\sigma_k} \cdots
134: A_{\sigma_2} A_{\sigma_1} ||^{1/k},
135: \label{eq:defjsr}
136: \end{equation}
137: and represents the maximum growth rate that can be achieved by taking
138: arbitrary products of the matrices $A_i$. As in the case of the
139: classical spectral radius, the value of this expression is independent
140: of the choice of norm in~(\ref{eq:defjsr}). Daubechies and
141: Lagarias~\cite{DaLa92} conjectured that the joint spectral radius is
142: equal to a related quantity, the {\it generalized spectral radius},
143: which is defined in a similar way except for the fact that the norm of
144: the product is replaced by the spectral radius. Berger and
145: Wang~\cite{BeWa92} proved this conjecture to be true for finite sets
146: of matrices. Blondel and Tsitsiklis have shown that computing $\rho$
147: is hard from a computational complexity viewpoint, and even
148: approximating it is difficult~\cite{BlTi2,BlTi3}. In particular, it
149: follows from their results that the problem ``Is $\rho \leq 1$?'' is
150: undecidable. For rational matrices, the joint spectral radius is not a
151: semialgebraic function of the data, thus ruling out a very large class
152: of methods for its exact computation. We refer the reader to the
153: survey \cite[\S3.5]{BlTi1} for further results and references on the
154: computational complexity of the joint spectral radius.
155:
156: %
157: %
158: %
159:
160: It turns out that a necessary and sufficient condition for the
161: stability of a linear difference inclusion is for the corresponding
162: matrices to have a subunit joint spectral radius, i.e.,
163: $\rho(A_1,\ldots,A_m) < 1$; see e.g. \cite[Thm.~1]{ShuWuPa97} and
164: \cite{BraytonTong2}. A subunit joint spectral radius is equivalent to
165: the existence of a common norm with respect to which all matrices in
166: the set are contractive~\cite{Bar88,Koz90,wirth}; unfortunately, this
167: common norm is in general not finitely constructible. In fact a
168: similar result, due to Dayawansa and Martin~\cite{DayaMar}, holds for
169: nonlinear systems that undergo switching. A popular approach towards
170: approximating the joint spectral radius or showing that it is indeed
171: subunit has been to try to prove simultaneous contractibility (i.e.,
172: existence of a common norm with respect to which matrices are
173: contractive), by searching for a common ellipsoidal norm, or
174: equivalently, a common quadratic Lyapunov function. The benefit of
175: this approach is due to the fact that the search for a common
176: ellipsoidal norm can be posed as a semidefinite program and solved
177: efficiently using interior point techniques. However, it is not too
178: difficult to generate examples where the discrete inclusion is {\it
179: absolutely asymptotically stable}, i.e., asymptotically stable for all
switching sequences, but a common quadratic Lyapunov function (or
equivalently, a common ellipsoidal norm) does not exist.
182:
183: %
184:
185: Ando and Shih describe in~\cite{Ando98} a constructive procedure for
186: generating a set of $m$ matrices whose joint spectral radius is equal
187: to $\frac{1}{\sqrt{m}}$, but for which no quadratic Lyapunov function
188: exists. They prove that the interval $[0,\, \frac{1}{\sqrt{m}})$ is
effectively the ``optimal'' range for the joint spectral radius
190: necessary to guarantee simultaneous contractibility under an
191: ellipsoidal norm for a finite collection of $m$ matrices. The range is
192: denoted as optimal since it is the largest subset of $[0,1)$ for which
193: if the joint spectral radius is in this subset the collection of
194: matrices is simultaneously contractible under an ellipsoidal
195: norm. Furthermore, they show that the optimal joint spectral radius
196: range for a {\it bounded} set of $n \times n$ matrices is the interval
197: $[0,\, \frac{1}{\sqrt{n}})$. The proof of this fact is based on John's
198: ellipsoid theorem \cite{JohnEllipsoid}. Roughly speaking, John's
199: ellipsoid theorem implies that every convex body in $n$-dimensional
200: Euclidean space that is symmetric with respect to the origin can be
201: approximated by inner and outer ellipsoids, up to a factor of
202: $\frac{1}{\sqrt{n}}$. Independently, Blondel, Nesterov and Theys
203: \cite{BlNT04} showed a similar result (also based on John's ellipsoid
204: theorem), that the best ellipsoidal norm approximation of the joint
205: spectral radius provides a lower bound and an upper bound on the
206: actual value. Given a set ${\mathcal M}$ of $n \times n$ matrices with
207: joint spectral radius $\rho$, and best ellipsoidal norm approximation
208: $\hat \rho$, it is shown there that
209: \begin{equation}
210: \frac{1}{\sqrt{n}} \, \hat \rho({\mathcal M}) \le \rho({\mathcal M})
211: \le \hat \rho({\mathcal M}).
212: \label{eq:sqrtn}
213: \end{equation}
214: A major consequence of these results is that finding a common Lyapunov
215: function becomes increasingly hard as the dimension goes up.
216:
217: There have been a number of earlier works proposing different
218: numerical techniques for the effective computation of bounds on the
219: joint spectral radius. A natural class of lower bounds is obtained by
220: considering periodic switching sequences, in which case only a finite
221: number of matrix norms need to be computed. Using a naive approach,
222: the required computational efforts grow exponentially as $m^k$, where
223: $k$ is the period of the sequence. Due to the cyclic property of the
224: spectral radius, some terms are redundant, and Maesumi \cite{Maesumi}
225: has shown using combinatorial techniques that the number of required
226: products can be reduced to $m^k/k$. Another approach is the work of
227: Gripenberg \cite{Gripenberg}, who has introduced a branch-and-bound
228: algorithm to produce upper and lower bounds on the joint spectral
229: radius. Protasov \cite{Protasov1,Protasov2} has developed a geometric
230: method to approximate this quantity, based on a polytopic
231: approximation of a convex set that is invariant under the action of
232: the linear operators $A_i$. This method has also been extended to the
233: computation of the so-called $p$-radius \cite{Protasov1}. More
234: recently, Blondel and Nesterov \cite{BlNes05} have proposed an
alternative scheme for the computation of the joint spectral radius, by
236: ``lifting'' the matrices using Kronecker products to provide better
237: approximations. A common feature in many of these approaches is the
238: presence of convexity-based methods to provide certificates of the
239: desired system properties.
240:
241: In this paper, we develop a sum of squares (SOS) based scheme for the
242: approximation of the joint spectral radius. The method computes, using
243: the techniques of semidefinite programming, a homogeneous polynomial
244: that serves as a Lyapunov-like function for the corresponding switched
245: linear system. We prove several results on the quality of
246: approximation of the proposed scheme. In particular, it will follow
247: from Theorems~\ref{thm:sos2dbound} and~\ref{thm:msos2dbound} that our
248: SOS-based approximation $\rho_{SOS,2d}$ satisfies
249: \[
250: \eta^{-\frac{1}{2d}} \, \cdot \,
251: \rho_{SOS,2d}({\mathcal M}) \le \rho({\mathcal M})
252: \le \rho_{SOS,2d}({\mathcal M}),
253: \]
254: where $\eta := \min \{ m , {\textstyle\binom{n+d-1}{d}} \}$. To prove
255: this, we use two different techniques, one inspired by recent results
256: of Barvinok~\cite{Barvinok} on approximation of norms by polynomials,
257: and the other one based on a convergent iteration similar to that used
258: for Lyapunov inequalities. Our results provide a simple and unified
259: derivation of most of the available bounds, including some new
260: ones. We prove that the SOS-based approximation is always tighter than
261: that obtained by the use of common quadratic Lyapunov functions, and
262: than the one provided by Blondel and Nesterov in
263: \cite{BlNes05}. Furthermore, we show how to compute the bound in
264: \cite{BlNes05} using matrices that are exponentially smaller than
265: those proposed there; this result also follows from the earlier work
266: of Protasov \cite{Protasov1}. A preliminary version of some of our
267: results has been presented in \cite{PabloAliHSCC}.
268:
269: A description of the paper follows. In Section~\ref{sec:sosnorms} we
270: present a class of bounds on the joint spectral radius based on
271: simultaneous contractivity with respect to a norm, followed by a sum
272: of squares-based relaxation, and the corresponding suboptimality
273: properties. In Section~\ref{sec:symmalgebra} we present some
274: background material in multilinear algebra, necessary for our
developments, and a derivation of a bound on the quality of the SOS
276: relaxation. An alternative development is presented in
277: Section~\ref{sec:soslyap}, where a different bound on the performance
278: of the SOS relaxation is given in terms of a very natural Lyapunov
279: iteration, similar to the classical case. In
280: Section~\ref{sec:comparison} we make a comparison with earlier
281: techniques and analyze a numerical example. Finally, in
282: Section~\ref{sec:conclusions} we present our conclusions.
283:
284: %
285: %
286: %
287:
288:
289:
290: \section{Bounds via polynomials and sums of squares}
291: \label{sec:sosnorms}
292:
293: A natural way of bounding the joint spectral radius is to find a
294: common norm that guarantees certain contractiveness properties for all
295: the matrices. In this section, we first revisit this characterization,
296: and introduce our method of using SOS relaxations to approximate this
297: common norm.
298:
299: \paragraph{Norms and the joint spectral radius.}
As we mentioned, there exists an intimate relationship between the
joint spectral radius and the existence of a vector norm under which
all the matrices are simultaneously contractive. This is summarized in the
303: following theorem, a special case of Proposition 1 in \cite{RoSt60} by
304: Rota and Strang.
305:
306: \begin{theorem}[\cite{RoSt60}]
307: \label{thm:RotaStrang}
308: Consider a finite set of matrices $\mathcal{A} =
309: \{A_1,\ldots,A_m\}$. For any $\epsilon > 0$, there exists a norm
310: $\|\cdot\|$ in $\R^n$ (denoted as JSR norm hereafter) such that
311: \[
312: ||A_i x|| \leq (\rho(\mathcal{A}) + \epsilon) \, ||x||, \qquad \forall x
313: \in \R^n, \quad i = 1,\ldots,m.
314: \]
315: \end{theorem}
316:
317: The theorem appears in this form, for instance, in Proposition~4 of
318: \cite{BlNT04}. The main idea in our approach is to replace the JSR
319: norm that approximates the joint spectral radius with a homogeneous
320: SOS polynomial $p(x)$ of degree $2d$. As we will see in the next
321: sections, we can produce arbitrarily tight SOS approximations, while
322: still being able to prove a bound on the resulting estimate.
323:
324: \paragraph{Joint spectral radius and polynomials.}
325: As the results presented above indicate, the joint spectral radius can
326: be characterized by finding a common norm under which all the maps are
327: simultaneously contractive. As opposed to the unit ball of a norm,
328: the level sets of a homogeneous polynomial are not necessarily convex
329: (see for instance Figure~\ref{fig:jsr}). Nevertheless, as the
330: following theorem suggests, we can still obtain upper bounds on the
331: joint spectral radius by replacing norms with homogeneous polynomials.
332:
333: \begin{theorem}
334: \label{thm:psdbound}
335: Let $p(x)$ be a strictly positive homogeneous polynomial of degree
336: $2d$ that satisfies
337: \[
338: p(A_i x) \leq \gamma^{2d} \, p(x), \qquad \forall x \in \R^n \quad i = 1,\ldots,m.
339: \]
340: Then, $\rho(A_1,\ldots,A_m) \leq \gamma$.
341: \end{theorem}
342: \begin{proof}
If $p(x)$ is strictly positive, then by compactness of the unit sphere
344: in $\R^n$ and continuity of $p(x)$, there exist constants $0 < \alpha
345: \leq \beta$, such that
346: \[
347: \alpha \, ||x||^{2d} \leq p(x) \leq \beta \, ||x||^{2d} \qquad \forall x \in \R^n.
348: \]
349: Then,
350: \begin{eqnarray*}
351: ||A_{\sigma_k} \ldots A_{\sigma_1}|| &\leq&
352: \max_x \frac{||A_{\sigma_k} \ldots A_{\sigma_1} x||}{||x||} \\
353: &\leq& \left(\frac{\beta}{\alpha}\right)^\frac{1}{2d} \max_x \frac{p(A_{\sigma_k} \ldots A_{\sigma_1} x)^\frac{1}{2d}}{p(x)^\frac{1}{2d}} \\
354: & \leq & \left(\frac{\beta}{\alpha}\right)^\frac{1}{2d} \gamma^k.
355: \end{eqnarray*}
356: From the definition of the joint spectral radius in
357: equation~(\ref{eq:defjsr}), by taking $k$th roots and the limit $k
358: \rightarrow \infty$ we immediately have the upper bound
359: $\rho(A_1,\ldots,A_m) \leq \gamma$.
360: \end{proof}
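
To spell out the last inequality in the proof above (a routine step,
written with the same constants $\alpha$, $\beta$ and $\gamma$), one can
apply the hypothesis of Theorem~\ref{thm:psdbound} once for each factor
of the product:
\[
p(A_{\sigma_k} \cdots A_{\sigma_1} x) \leq
\gamma^{2d} \, p(A_{\sigma_{k-1}} \cdots A_{\sigma_1} x) \leq \cdots \leq
\gamma^{2dk} \, p(x).
\]
Combining this with $\alpha \, ||y||^{2d} \leq p(y)$ for
$y = A_{\sigma_k} \cdots A_{\sigma_1} x$ and with
$p(x) \leq \beta \, ||x||^{2d}$ gives
\[
||A_{\sigma_k} \cdots A_{\sigma_1} x|| \leq
\alpha^{-\frac{1}{2d}} \, p(A_{\sigma_k} \cdots A_{\sigma_1} x)^\frac{1}{2d} \leq
\alpha^{-\frac{1}{2d}} \, \gamma^k \, p(x)^\frac{1}{2d} \leq
\left(\frac{\beta}{\alpha}\right)^\frac{1}{2d} \gamma^k \, ||x||.
\]
The constant $(\beta/\alpha)^{\frac{1}{2d}}$ does not depend on $k$, so
its $k$th root tends to one as $k \rightarrow \infty$.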
361:
362: The condition in Theorem~\ref{thm:psdbound} involves positive
363: polynomials, which are computationally hard to characterize. A useful
364: scheme, introduced in \cite{Phd:Parrilo,sdprelax} and relatively
365: well-known by now, relaxes the nonnegativity constraints to a much
366: more tractable \emph{sum of squares} (SOS) condition, where $p(x)$ is
367: required to have a decomposition as $p(x) = \sum_i p_i(x)^2$. The SOS
368: condition can be equivalently expressed in terms of a semidefinite
369: programming (SDP) constraint. In what follows, we briefly describe the
370: basic ideas behind SDP and sum of squares programming, and their
371: applications to our problem.
372:
373: \paragraph{Semidefinite programming.} SDP is a specific kind of convex
374: optimization problem with very appealing numerical properties. An SDP
375: problem corresponds to the optimization of a linear function over the
376: intersection of an affine subspace and the cone of positive
377: semidefinite matrices. For much more information about SDP and its
378: many applications, we refer the reader to the surveys
379: \cite{VaB:96,ToddSDP} and the comprehensive treatment in
380: \cite{HandSDP}.
381:
382: An SDP problem in standard primal form is usually written as:
383: \begin{align*}
384: \mathrm{minimize} \quad C \bullet &X \quad &
385: \mbox{subject to} \quad A_i \bullet X &= b_i, \quad i = 1,\ldots,m \\
386: & & X &\succeq 0,
387: \end{align*}
388: where $C, A_i$ are symmetric $n \times n$ matrices, and $X \bullet Y
389: := \mathrm{trace}(X Y)$. The symmetric matrix $X$ is the optimization
variable over which the minimization is performed. The inequality in
391: the second line means that the matrix $X$ must be positive
392: semidefinite, i.e., all its eigenvalues should be greater than or
393: equal to zero. The set of feasible solutions, i.e., the set of
394: matrices $X$ that satisfy the constraints, is always a convex set. In
395: the particular case when $C=0$, the problem reduces to whether or not
396: the inequality can be satisfied for some matrix $X$. In this case, the
397: SDP is referred to as a \emph{feasibility problem}.
398:
399: There are a number of sophisticated and reliable methods to
400: numerically solve semidefinite programming problems. One of the most
401: successful approaches is based on \emph{primal-dual interior point
402: methods}, that generalize many of the techniques used in linear
403: programming \cite{NN}. The interior-point approach to SDP typically
404: involves the iterative solution of a perturbed version of the KKT
405: optimality conditions. Each iteration requires the computation of the
406: corresponding Newton direction, and the solution of a system of linear
407: equations. A theoretical bound on the number of Newton iterations is
408: $O(\sqrt{n} \log \frac{1}{\epsilon})$ for an $\epsilon$-approximate
solution. This estimate is significantly more conservative than what is
410: usually experienced in practice, where the dependence on $n$ is very
411: mild (typically, 10-40 Newton iterations are enough for most
412: problems). The cost of each iteration heavily depends on the
413: structure and sparsity of the matrices $A_i$, and is dominated by the
414: computation of the Hessian and the solution of the corresponding
415: linear system. In the fully dense case, this cost is of the order of
416: $\max\{mn^3,m^2n^2,m^3\}$, where the first two terms correspond to the
417: construction of the Hessian, and the last one to the solution of the
418: Newton system.
419:
420:
421: \paragraph{Sums of squares programming.}
422: Consider a given multivariate polynomial for which we want to decide
423: whether a sum of squares decomposition exists. This question is
424: equivalent to a semidefinite programming (SDP) problem, because of the
425: following result, that has appeared in different forms in the work of
426: Shor \cite{Shor}, Choi-Lam-Reznick \cite{ChoiLamReznick}, Nesterov
427: \cite{NesterovSquared}, and Parrilo \cite{Phd:Parrilo,sdprelax}.
428: \begin{theorem}
429: A homogeneous multivariate polynomial $p(x)$ of degree $2d$ is a sum
430: of squares if and only if
431: \begin{equation}
432: p(x) = (x^{[d]})^T Q x^{[d]},
433: \label{Par:sosrep}
434: \end{equation}
435: where $x^{[d]}$ is a vector whose entries are (possibly scaled)
436: monomials of degree $d$ in the variables $x_i$, and $Q$ is a symmetric
437: positive semidefinite matrix.
438: \end{theorem}
439: Since in general the entries of $x^{[d]}$ are not algebraically
440: independent, the matrix $Q$ in the representation (\ref{Par:sosrep})
441: \emph{is not unique}. In fact, there is an affine subspace of matrices
442: $Q$ that satisfy the equality, as can be easily seen by expanding the
443: right-hand side and equating term by term. To obtain an SOS
444: representation, we need to find a positive semidefinite matrix in this
445: affine subspace. Therefore, the problem of checking if a polynomial
446: can be decomposed as a sum of squares is \emph{equivalent} to
verifying whether a certain affine matrix subspace intersects the cone
of positive semidefinite matrices, and is hence an SDP feasibility problem.
449:
450: \begin{example}
451: Consider the quartic homogeneous polynomial in two variables
452: described below, and define the vector of monomials as $[ x^2, y^2,
453: x y]^T$.
454: \begin{eqnarray*}
455: p(x,y) &=& 2 x^4 + 2 x^3 y - x^2 y^2 + 5 y^4 \\
456: &=&
457: \left[\begin{array}{c}
458: x^2 \\ y^2 \\ x y
459: \end{array}\right]^T
460: \left[\begin{array}{ccc}
461: q_{11} & q_{12} & q_{13} \\
462: q_{12} & q_{22} & q_{23} \\
463: q_{13} & q_{23} & q_{33}
464: \end{array}\right]
465: \left[\begin{array}{c}
466: x^2 \\ y^2 \\ x y
467: \end{array}\right]\\
468: &=&
469: q_{11} x^4 + q_{22} y^4 + (q_{33} + 2 q_{12}) x^2 y^2 + 2 q_{13} x^3 y + 2 q_{23} x y^3
470: \end{eqnarray*}
471: For the left- and right-hand sides to be identical, the following
472: linear equations should hold:
473: \begin{equation}
474: q_{11} = 2, \quad
475: q_{22} = 5, \quad
476: q_{33} + 2 q_{12} = -1, \quad
477: 2 q_{13} = 2, \quad
478: 2 q_{23} = 0.
479: \end{equation}
480:
481: A positive semidefinite $Q$ that satisfies the linear equalities can
482: then be found using SDP. A particular solution is given by:
483: \[
484: Q =
485: \left[\begin{array}{rrr}
486: 2 & -3 & 1 \\ -3 & 5 & 0 \\ 1 & 0 & 5
487: \end{array}\right]
488: = L^T L, \qquad
489: L =
490: \frac{1}{\sqrt{2}}\left[\begin{array}{rrr}
491: 2 & -3 & 1 \\
492: 0 & 1 & 3
493: \end{array}\right],
494: \]
495: and therefore we have the sum of squares decomposition:
496: \[
497: p(x,y) = \frac{1}{2} (2 x^2 - 3 y^2 + x y)^2 +
498: \frac{1}{2}(y^2 + 3 x y)^2.
499: \]
500: \label{Par:ex:sosexample}
501: \hfill $\square$
502: \end{example}
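
As a quick check of the decomposition in the example above, the two
squares can be expanded directly. The computation below only restates
the data of the example and introduces nothing new:
\begin{eqnarray*}
(2 x^2 - 3 y^2 + x y)^2 &=& 4 x^4 + 4 x^3 y - 11 x^2 y^2 - 6 x y^3 + 9 y^4, \\
(y^2 + 3 x y)^2 &=& 9 x^2 y^2 + 6 x y^3 + y^4.
\end{eqnarray*}
Adding the two lines and dividing by two gives
$2 x^4 + 2 x^3 y - x^2 y^2 + 5 y^4$, which is $p(x,y)$. One can also
verify that $\det Q = 50 - 45 - 5 = 0$, so this particular $Q$ has rank
two, consistent with a decomposition into exactly two squares.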
503:
504:
505:
506:
507: \subsection{Norms and SOS polynomials}
508:
The procedure described in the previous paragraphs can be easily
510: adapted to the case where the polynomial $p(x)$ is not fixed, but
511: instead we search for an SOS polynomial in a given affine family (for
512: instance, all homogeneous polynomials of a given degree).
513:
514: This line of thought immediately suggests the following SOS relaxation
515: of the conditions in Theorem~\ref{thm:psdbound}:
516: \begin{equation}
517: \rho_{SOS,2d} :=
518: \inf_{p(x) \in \R_{2d}[x], \gamma} \gamma \qquad \mbox{s.t. }\left\{
519: \begin{array}{rl}
520: p(x) \, & \mbox{is SOS}\\
521: \gamma^{2d} p(x) - p(A_i x) \, & \mbox{is SOS}
522: \end{array}
523: \right.
524: \label{eq:SOSrelax}
525: \end{equation}
526: where $\R_{2d}[x]$ is the set of homogeneous polynomials of degree
527: $2d$.
528:
529: \begin{remark}
530: Theorem~\ref{thm:psdbound} requires a strictly positive polynomial
531: $p(x)$, so it would be natural to add some strict positivity condition
532: to the relaxation~(\ref{eq:SOSrelax}). For instance, one could require
533: for the polynomial $p(x)$ to belong to the relative interior of the
534: SOS cone. However, since interior-point methods by construction
535: always produce solutions in the relative interior of the corresponding
536: convex set, this is automatically satisfied if the problem is
537: feasible. Alternatively, it is possible to give a formulation that
538: includes terms of the form $\epsilon ||x||^{2d}$, for small positive
539: $\epsilon$. These modifications are unnecessary in practice.
540: \end{remark}
541:
542: For any fixed degree $d$ and any given $\gamma$, the constraints in
543: this problem are all of SOS type, and thus equivalent to semidefinite
544: programming. Therefore, the computation of $\rho_{SOS,2d}$ is a
545: quasiconvex problem, and can be easily solved with a standard SDP
546: solver, and a simple bisection method for the scalar variable
547: $\gamma$. By Theorem~\ref{thm:psdbound}, the solution of this
548: relaxation yields an upper bound on the joint spectral radius
549: \begin{equation}
550: \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d},
551: \label{eq:trivbound}
552: \end{equation}
553: where $2d$ is the degree of the approximating polynomial.
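
To make the structure of~(\ref{eq:SOSrelax}) concrete, consider the
quadratic case $2d=2$ as an illustration. The symmetric matrix $P$ and
the Gram matrices $Q_0, Q_1, \ldots, Q_m$ below are notation introduced
only for this sketch. Taking $p(x) = x^T P x$, and imposing strict
positivity as required in Theorem~\ref{thm:psdbound}, the constraints
for a fixed $\gamma$ reduce to the linear matrix inequalities
\[
P \succ 0, \qquad \gamma^2 P - A_i^T P A_i \succeq 0, \qquad i = 1,\ldots,m.
\]
For general $d$, each SOS constraint is handled through the
representation~(\ref{Par:sosrep}), with one Gram matrix per constraint:
\[
p(x) = (x^{[d]})^T Q_0 \, x^{[d]}, \qquad
\gamma^{2d} p(x) - p(A_i x) = (x^{[d]})^T Q_i \, x^{[d]}, \qquad
Q_0 \succeq 0, \quad Q_i \succeq 0.
\]
Feasibility is monotone in $\gamma$: if $p(x)$ is feasible for some
$\gamma$, then it is feasible for every $\gamma' \geq \gamma$, since
$(\gamma'^{2d} - \gamma^{2d}) \, p(x)$ is itself SOS. This monotonicity is
what makes bisection on $\gamma$ applicable.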
554:
555:
556:
557:
558:
559: %
560: %
561: %
562:
563:
564: \subsection{Quality of approximation}
565:
566: What can be said about the quality of the bounds produced by the SOS
567: relaxation? We present next some results to answer this question; a
568: more complete characterization is developed in
569: Section~\ref{sec:goodbounds}. An inspiring result in this direction is
570: the following theorem of Barvinok, that quantifies how tightly SOS
571: polynomials can approximate norms:
572: \begin{theorem}[\cite{Barvinok}, p.~221]
573: \label{thm:Barvinok}
574: Let $||\cdot||$ be a norm in $\R^n$. For any integer $d \geq 1$ there
575: exists a homogeneous polynomial $p(x)$ in $n$ variables of degree $2d$
576: such that
577: \begin{enumerate}
578: \item The polynomial $p(x)$ is a sum of squares.
579: \item For all $x \in \R^n$,
580: \[
581: p(x)^\frac{1}{2d} \leq ||x|| \leq k(n,d) \,
582: p(x)^\frac{1}{2d},
583: \]
584: where $k(n,d) := \binom{n+d-1}{d}^{\frac{1}{2d}}$.
585: \end{enumerate}
586: \end{theorem}
587: For fixed state dimension $n$, by increasing the degree $d$ of the
588: approximating polynomials, the factor in the upper bound can be made
589: arbitrarily close to one. In fact, for large $d$, we have the
590: approximation
591: \[
592: k(n,d) \; \approx \; 1 + \frac{n-1}{2} \frac{\log d}{d}.
593: \]
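
A short derivation of this approximation, sketched for fixed $n$ and
$d \rightarrow \infty$, goes as follows. Writing the binomial
coefficient as a product,
\[
\binom{n+d-1}{d} = \prod_{j=1}^{n-1} \frac{d+j}{j}
= \frac{d^{n-1}}{(n-1)!} \left( 1 + O(1/d) \right),
\]
so that
\[
\log k(n,d) = \frac{1}{2d} \log \binom{n+d-1}{d}
= \frac{n-1}{2} \, \frac{\log d}{d} + O(1/d),
\]
and the stated approximation follows from
$k(n,d) = \exp( \log k(n,d) ) \approx 1 + \log k(n,d)$. As a numerical
illustration for $n=2$, where $k(2,d) = (d+1)^{\frac{1}{2d}}$, the factor
is $\sqrt{2} \approx 1.414$ for $d=1$, $3^{1/4} \approx 1.316$ for
$d=2$, and $11^{1/20} \approx 1.127$ for $d=10$.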
594:
595: %
596: %
597: %
598:
599: %
600: %
601: %
602: %
603: %
604: %
605: %
606: %
607: %
608: %
609: %
610: %
611: %
612: %
613: %
614: %
615: %
616: %
617: %
618: %
619: %
620: %
621: %
622: %
623: %
624: %
625: %
626:
627:
628: To apply these results to our problem, consider the following. If
629: $\rho(A_1,\ldots,A_m) < \gamma$, by Theorem~\ref{thm:RotaStrang} (and
630: sharper results in \cite{Bar88,Koz90,wirth}) there exists a norm
631: $\|\cdot\|$ such that
632: \[
633: ||A_i x|| \leq \gamma ||x||, \quad \forall x \in \R^n, i = 1,\ldots,m.
634: \]
635: By Theorem~\ref{thm:Barvinok}, we can therefore approximate this norm
636: with a homogeneous SOS polynomial $p(x)$ of degree $2d$ that will then
637: satisfy
638: \[
639: p(A_i x)^\frac{1}{2d}\leq ||A_i x|| \leq \gamma ||x||
640: \leq \gamma \, k(n,d) \, p(x)^\frac{1}{2d},
641: \]
642: and thus we know that there exists a feasible solution of
643: \[
644: \left\{
645: \begin{array}{rl}
646: p(x) \, & \mbox{is SOS}\\
647: \alpha^{2d} p(x) - p(A_i x) \, & \geq 0 \qquad i=1,\ldots,m,
648: \end{array}
649: \right.
650: \]
651: for $\alpha = k(n,d) \rho(A_1,\ldots,A_m)$.
652:
653: %
654: %
655: %
656:
Despite these appealing results, notice that in general we cannot yet
conclude from this that the proposed SOS relaxation will always obtain
a solution that is within a factor of $k(n,d)$ of the true joint
spectral radius. The reason is that even though we can prove the existence of a
661: $p(x)$ that is SOS and for which $\alpha^{2d} p(x) - p(A_i x)$ are
662: nonnegative for all $i$, it is unclear whether the last $m$
663: expressions are actually SOS. We will show later in the paper that
664: this is indeed the case. Before doing this, we concentrate first on
665: two important cases of interest, where the described approach
666: guarantees a good quality of approximation.
667:
668: \paragraph{Planar systems.}
669: The first case corresponds to two-dimensional (planar) systems, i.e.,
670: when $n=2$. In this case, it always holds that nonnegative homogeneous
671: bivariate polynomials are SOS (e.g., \cite{Reznick}). Thus, we have
672: the following result:
673: \begin{theorem}
674: Let $\{A_1,\ldots,A_m\} \subset \R^{2 \times 2}$. Then, the SOS
675: relaxation~(\ref{eq:SOSrelax}) always produces a solution satisfying:
676: \[
677: {\textstyle \frac{1}{2}} \rho_{SOS,2d} \leq
678: (d+1)^{-\frac{1}{2d}} \,
679: \rho_{SOS,2d} \leq \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d}.
680: \]
681: This result is \emph{independent} of the number $m$ of matrices.
682: \end{theorem}
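
The argument behind this statement can be sketched as follows, using
only the ingredients already introduced. For $n=2$, the constant in
Theorem~\ref{thm:Barvinok} is
\[
k(2,d) = \binom{d+1}{d}^{\frac{1}{2d}} = (d+1)^{\frac{1}{2d}}.
\]
For any $\gamma > \rho(A_1,\ldots,A_m)$, the SOS polynomial $p(x)$
provided by Theorem~\ref{thm:Barvinok} satisfies
\[
\left( \gamma \, k(2,d) \right)^{2d} p(x) - p(A_i x) \geq 0,
\qquad i = 1,\ldots,m.
\]
These expressions are nonnegative homogeneous bivariate polynomials,
hence SOS, so $p(x)$ is feasible for~(\ref{eq:SOSrelax}) and
$\rho_{SOS,2d} \leq (d+1)^{\frac{1}{2d}} \gamma$. Letting $\gamma$
decrease to $\rho(A_1,\ldots,A_m)$ gives the middle inequality. The
leftmost inequality follows from $d+1 \leq 2^d$, which yields
$(d+1)^{\frac{1}{2d}} \leq \sqrt{2} \leq 2$ for all $d \geq 1$.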
683:
684: \paragraph{Quadratic Lyapunov functions.}
685: In the quadratic case (i.e., $2d=2$), it is also true that nonnegative
686: quadratic forms are sums of squares. Since
687: \[
688: {\binom{n+d-1}{d}}^\frac{1}{2d} =
689: \binom{n}{1}^\frac{1}{2} = \sqrt{n},
690: \]
691: the inequality
692: \begin{equation}
693: \frac{1}{\sqrt{n}} \; \rho_{SOS,2} \leq \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2}
694: \label{eq:quadlyapbound}
695: \end{equation}
696: follows. This bound exactly coincides with the results of Ando and
697: Shih \cite{Ando98} or Blondel, Nesterov and Theys \cite{BlNT04}. This
698: is perhaps not surprising, since in this case both Ando and Shih's
699: proof \cite{Ando98} and Barvinok's theorem rely on the use of John's
700: ellipsoid to approximate the same underlying convex set.
701:
702:
\paragraph{Level sets and convexity.}
An appealing feature of the SOS-based method is that, unlike the norms
that appear in Theorem~\ref{thm:RotaStrang}, the polynomials it uses
are not constrained to have convex level sets. This enables in
some cases much better bounds than what is promised by the theorems
above, as illustrated in the following example.
709:
710: \begin{figure}[t]
711: \centering
712: \includegraphics[width=0.5\columnwidth]{jsrepsilon2}
713: \caption{Level sets of the quartic homogeneous polynomial
714: $V(x_1,x_2)$. These define a Lyapunov function, under which both $A_1$
715: and $A_2$ are $(1+\epsilon)$-contractive. The value of $\epsilon$ is
716: here equal to $0.01$.}
717: \label{fig:jsr}
718: \end{figure}
719:
720: \begin{example}
721: This is based on a construction by Ando and Shih
722: \cite{Ando98}. Consider the problem of proving a bound on the joint
723: spectral radius of the following matrices:
724: \[
725: A_1 =
726: \left[\begin{array}{cc}
727: 1 & 0 \\ 1 & 0
728: \end{array}\right], \qquad
729: A_2 =
730: \left[\begin{array}{rr}
731: 0 & 1 \\ 0 & -1
732: \end{array}\right].
733: \]
734: For these matrices, it can be easily shown that
$\rho(A_1,A_2)=1$. Using a common quadratic Lyapunov function (i.e.,
the case $2d=2$), the upper bound on the joint spectral radius is equal
to $\sqrt{2}$. However, a simple quartic SOS Lyapunov function is
738: enough to prove an upper bound of $1+\epsilon$ for every $\epsilon
739: >0$, since the SOS polynomial
740: \[
741: V(x) = (x_1^2-x_2^2)^2 + \epsilon (x_1^2+x_2^2)^2
742: \]
743: satisfies
744: \begin{eqnarray*}
745: (1+\epsilon) V(x) - V(A_1 x) &=& ( x_2^2-x_1^2+\epsilon (x_1^2+x_2^2) )^2 \\
746: (1+\epsilon) V(x) - V(A_2 x) &=& ( x_1^2-x_2^2+\epsilon (x_1^2+x_2^2) )^2.
747: \end{eqnarray*}
748: The corresponding level sets of $V(x)$ are plotted in
749: Figure~\ref{fig:jsr}, and are clearly non-convex.
750: \label{ex:ando}
751: \end{example}
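
For completeness, the first of the two identities in the example above can be checked by direct expansion; the second one is analogous, with the roles of $x_1$ and $x_2$ exchanged. Since $A_1 x = [x_1, x_1]^T$, we have $V(A_1 x) = 4 \epsilon x_1^4$, and therefore
\begin{align*}
(1+\epsilon) V(x) - V(A_1 x)
&= (x_1^2-x_2^2)^2 + \epsilon \left[ (x_1^2-x_2^2)^2 + (x_1^2+x_2^2)^2 - 4 x_1^4 \right] + \epsilon^2 (x_1^2+x_2^2)^2 \\
&= (x_2^2-x_1^2)^2 + 2 \epsilon (x_2^2-x_1^2)(x_1^2+x_2^2) + \epsilon^2 (x_1^2+x_2^2)^2,
\end{align*}
which is exactly the square of $x_2^2-x_1^2+\epsilon (x_1^2+x_2^2)$.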
752:
753:
754:
755: \section{Symmetric algebra and induced matrices}
756: \label{sec:symmalgebra}
757:
758: We present next some further bounds on the quality of the SOS
759: relaxation~(\ref{eq:SOSrelax}), either by a more refined analysis of
760: the SOS polynomials in Barvinok's theorem or by explicitly producing
761: an SOS Lyapunov function of guaranteed suboptimality properties. These
762: constructions are quite natural, and parallel some lifting ideas as
763: well as the classical iteration used in the solution of discrete-time
764: Lyapunov inequalities. Before proceeding further, we briefly revisit
765: some classical notions from multilinear algebra.
766:
767: \paragraph{Symmetric algebra of a vector space}
768: Consider a vector $x \in \R^n$, and an integer $d \geq 1$. We define
its $d$-lift $x^{[d]}$ as a vector in $\R^N$, where $N :=
770: \binom{n+d-1}{d}$, with components $\{ \sqrt{\alpha !} \, x^\alpha
771: \}_\alpha$, where $\alpha = (\alpha_1,\ldots,\alpha_n)$, $|\alpha| :=
772: \sum_i \alpha_i = d$, and $\alpha !$ denotes the multinomial
773: coefficient $\alpha ! := \binom{d}{\alpha_1,\alpha_2,\ldots,\alpha_n}=
774: \frac{d!}{\alpha_1! \alpha_2! \ldots \alpha_n!}$. That is, the
775: components of the lifted vector are the monomials of degree $d$,
776: scaled by the square root of the corresponding multinomial
777: coefficients.
778: \begin{example}
779: Let $n=2$, and $x = [u,v]^T$. Then, we have
780: \[
781: \left[\begin{array}{c} u \\ v \end{array}\right]^{[1]} =
782: \left[\begin{array}{c} u \\ v \end{array}\right], \qquad
783: \left[\begin{array}{c} u \\ v \end{array}\right]^{[2]} =
784: \left[\begin{array}{c} u^2 \\ \sqrt{2} u v \\ v^2 \end{array}\right], \qquad
785: \left[\begin{array}{c} u \\ v \end{array}\right]^{[3]} =
786: \left[\begin{array}{c} u^3 \\ \sqrt{3} u^2 v \\
787: \sqrt{3} u v^2 \\ v^3 \end{array}\right].
788: \]
789: \end{example}
The main motivation for this specific scaling of the components is to
791: ensure that the lifting preserves some of the properties of the
792: underlying normed space. In particular, if $||\cdot||$ denotes the
793: standard Euclidean norm, it can be easily verified that $||x^{[d]}|| =
794: ||x||^d$. Thus, the lifting operation provides a norm-preserving (up
795: to power) embedding of $\R^n$ into $\R^N$. When the original space is
796: projective, this is the so-called \emph{Veronese} embedding.
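
As a quick check of the norm identity, consider $n=2$, $d=2$ and $x=[u,v]^T$. Then
\[
||x^{[2]}||^2 = u^4 + 2 u^2 v^2 + v^4 = (u^2+v^2)^2 = ||x||^4.
\]
The general case follows in the same way from the multinomial theorem, since
\[
||x^{[d]}||^2 = \sum_{|\alpha| = d} \alpha ! \, x^{2\alpha} = (x_1^2 + \cdots + x_n^2)^d = ||x||^{2d}.
\]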
797:
798: This concept can be directly extended from vectors to linear
799: transformations. Consider a linear map in $\R^n$, and the associated
800: $n \times n$ matrix $A$. Then, the lifting described above naturally
801: induces an associated map in $\R^N$, that makes the corresponding
802: diagram commute. The matrix representing this linear transformation
803: is the \emph{$d$-th induced matrix} of $A$, denoted by $A^{[d]}$,
804: which is the unique $N \times N$ matrix that satisfies
805: \[
806: A^{[d]} x^{[d]} = (A x)^{[d]}.
807: \]
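For instance, for $n=2$ and $d=2$ the induced matrix can be read off directly from this defining relation. Writing $x = [u,v]^T$ and expanding the components of $(Ax)^{[2]}$ in terms of $u^2$, $\sqrt{2} u v$ and $v^2$, we obtain
\[
\begin{bmatrix}
(a_{11} u + a_{12} v)^2 \\
\sqrt{2} (a_{11} u + a_{12} v)(a_{21} u + a_{22} v) \\
(a_{21} u + a_{22} v)^2
\end{bmatrix}
=
\begin{bmatrix}
a_{11}^2 & \sqrt{2} a_{11} a_{12} & a_{12}^2 \\
\sqrt{2} a_{11} a_{21} & a_{11} a_{22} + a_{12} a_{21} & \sqrt{2} a_{12} a_{22} \\
a_{21}^2 & \sqrt{2} a_{21} a_{22} & a_{22}^2
\end{bmatrix}
\begin{bmatrix} u^2 \\ \sqrt{2} u v \\ v^2 \end{bmatrix},
\]
so the $3 \times 3$ matrix on the right-hand side is $A^{[2]}$.
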
808: In systems and control, these classical constructions of multilinear
809: algebra have been used under different names in several works, among
810: them \cite{BrockettLie,Zelen} and (implicitly) \cite{BlNes05}.
Although not mentioned in the control literature, there exists a
812: simple explicit formula for the entries of these induced matrices; see
813: \cite{MarcusMultilinear,MarcusMinc}. The $d$-th induced matrix
814: $A^{[d]}$ has dimensions $N \times N$. Its entries are given by
815: \begin{equation}
816: (A^{[d]})_{\alpha \beta} = \frac{\mathrm{per}\, A(\alpha,\beta)}{\sqrt{\mu(\alpha) \mu(\beta)}},
817: \label{eq:perm}
818: \end{equation}
819: where the indices $\alpha,\beta$ are all the $d$-element multisets of
820: $\{1,\ldots,n\}$, the notation $\mathrm{per}$ indicates the
821: \emph{permanent}\footnote{The permanent of a matrix $A \in \R^{n
822: \times n}$ is defined as $\textrm{per}(A):=\sum_{\sigma \in \Pi_n}
823: \prod_{i=1}^n a_{i,\sigma(i)}$, where $\Pi_n$ is the set of all
824: permutations in $n$ elements.} of a square matrix, and $\mu(S)$ is the
825: product of the factorials of the multiplicities of the elements of the
826: multiset $S$.
827: \begin{example}
828: Consider the case $n=2$, $d=3$. The corresponding 3-element multisets
829: are $\{1,1,1\}$, $\{1,1,2\}$, $\{1,2,2\}$ and $\{2,2,2\}$. The third
830: induced matrix is then
831: \begin{align*}
832: A^{[3]} &=
833: \begin{bmatrix}
834: a_{11}^3& \sqrt{3} a_{11}^2 a_{12} & \sqrt{3} a_{11} a_{12}^2 & a_{12}^3 \\
835: \sqrt{3} a_{11}^2 a_{21}& a_{11} (a_{11} a_{22}+2 a_{21} a_{12})& a_{12} (2 a_{11} a_{22}+a_{21} a_{12}) & \sqrt{3} a_{12}^2 a_{22} \\
836: \sqrt{3} a_{11} a_{21}^2& a_{21} (2 a_{11} a_{22}+a_{21} a_{12})& a_{22} (a_{11} a_{22}+2 a_{21} a_{12}) & \sqrt{3} a_{12} a_{22}^2 \\
837: a_{21}^3& \sqrt{3} a_{21}^2 a_{22} & \sqrt{3} a_{21} a_{22}^2 & a_{22}^3 \\
838: \end{bmatrix}.
839: \end{align*}
853: \end{example}
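As a check of formula~(\ref{eq:perm}) on a single entry, take $\alpha = \beta = \{1,1,2\}$ in the case $n=2$, $d=3$. We assume here the usual convention that $A(\alpha,\beta)$ is the $d \times d$ matrix whose $(i,j)$ entry is $a_{\alpha_i \beta_j}$, with repeated indices giving repeated rows and columns. Then
\[
A(\alpha,\beta) =
\begin{bmatrix}
a_{11} & a_{11} & a_{12} \\
a_{11} & a_{11} & a_{12} \\
a_{21} & a_{21} & a_{22}
\end{bmatrix},
\qquad
\mathrm{per}\, A(\alpha,\beta) = 2 a_{11}^2 a_{22} + 4 a_{11} a_{12} a_{21},
\qquad
\mu(\alpha) = \mu(\beta) = 2! \, 1! = 2,
\]
and the formula gives $(A^{[3]})_{\alpha\beta} = a_{11} (a_{11} a_{22} + 2 a_{21} a_{12})$, which agrees with the corresponding diagonal entry of the third induced matrix for $n=2$.
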
854: It can be shown that these operations define an algebra homomorphism,
855: i.e., they respect the structure of matrix multiplication. In
856: particular, for any matrices $A,B$ of compatible dimensions, the
857: following identities hold:
858: \[
859: (A B)^{[d]} = A^{[d]} B^{[d]}, \qquad (A^{-1})^{[d]} = (A^{[d]})^{-1}.
860: \]
861: Furthermore, there is a simple and appealing relationship between the
862: eigenvalues of $A^{[d]}$ and those of $A$. Concretely, if
863: $\lambda_1,\ldots,\lambda_n$ are the eigenvalues of $A$, then the
eigenvalues of $A^{[d]}$ are given by $\prod_{j \in S} \lambda_j$,
where $S$ ranges over the $d$-element multisets of $\{1,\ldots,n\}$
(with elements counted according to multiplicity); there are exactly
$\binom{n+d-1}{d}$ such multisets. A similar relationship holds for
867: the corresponding eigenvectors. Essentially, as explained below in
868: more detail, the induced matrices are the symmetry-reduced
869: version of the $d$-fold Kronecker product.
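
As a simple consistency check of the eigenvalue relationship, let $n=2$ and $d=2$. A direct computation from the defining relation $A^{[2]} x^{[2]} = (Ax)^{[2]}$ shows that the diagonal entries of $A^{[2]}$ are $a_{11}^2$, $a_{11} a_{22} + a_{12} a_{21}$ and $a_{22}^2$, so that
\[
\mathrm{tr}\, A^{[2]} = (a_{11}+a_{22})^2 - (a_{11} a_{22} - a_{12} a_{21})
= (\lambda_1 + \lambda_2)^2 - \lambda_1 \lambda_2
= \lambda_1^2 + \lambda_1 \lambda_2 + \lambda_2^2,
\]
which is the sum of the products of eigenvalues of $A$ over the three multisets $\{1,1\}$, $\{1,2\}$ and $\{2,2\}$.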
870:
871: The symmetric algebra and associated induced matrices are classical
872: objects of multilinear algebra. Induced matrices, as defined above, as
873: well as the more usual \emph{compound matrices}, correspond to two
874: specific isotypic components of the decomposition of the $d$-fold
tensor product under the action of the symmetric group $S_d$ (i.e.,
876: the \emph{symmetric} and \emph{skew-symmetric} algebras). Compound
877: matrices are associated with the alternating character (hence their
878: relationship with determinants), while induced matrices correspond
879: instead to the trivial character, thus the connection with
880: permanents. Similar constructions can be given for any other character
881: of the symmetric group, by replacing the permanent in (\ref{eq:perm})
882: with the suitable immanants; see \cite{MarcusMultilinear} for
883: additional details.
884:
885:
886: \subsection{Bounds on the quality of $\rho_{SOS,2d}$}
887: \label{sec:goodbounds}
888:
889: In this section we present a bound on the approximation properties of
890: the SOS approximation, based on the ideas introduced above. As we will
891: see, the techniques based on the lifting described will exactly yield
892: the factor $k(n,d)^{-1}$ suggested by Barvinok's theorem.
893:
894: We first prove a preliminary result on the behavior of the joint
895: spectral radius under $d$-lifting. The scaling properties described
896: earlier can be applied to obtain the following:
897: \begin{lemma}
898: Given matrices $\{A_1,\ldots,A_m\} \subset \R^{n \times n}$ and an
899: integer $d \geq 1$, the following identity holds:
900: \[
901: \rho(A_1^{[d]},\ldots,A_m^{[d]}) = \rho(A_1,\ldots,A_m)^d.
902: \]
903: \label{lem:scalingjsr}
904: \end{lemma}
905: The proof follows directly from the definition~(\ref{eq:defjsr}) and
906: the two properties $(A B)^{[d]} = A^{[d]} B^{[d]}$, $||x^{[d]}|| =
907: ||x||^d$, and it is thus omitted.
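
As a sanity check, consider the case $m=1$, where the joint spectral radius reduces to the usual spectral radius. If $\lambda_1,\ldots,\lambda_n$ are the eigenvalues of $A$, the eigenvalues of $A^{[d]}$ are the products of $d$ of the $\lambda_j$ (with repetitions allowed), and the largest modulus among these products is attained by repeating $d$ times an eigenvalue of maximum modulus. Thus
\[
\rho(A^{[d]}) = \max_j |\lambda_j|^d = \rho(A)^d,
\]
in agreement with Lemma~\ref{lem:scalingjsr}.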
908:
909: Combining all these inequalities, we obtain the main result of this paper:
910: \begin{theorem}
911: The SOS relaxation (\ref{eq:SOSrelax}) satisfies:
912: \begin{equation}
913: {\textstyle\binom{n+d-1}{d}}^{-\frac{1}{2d}} \; \rho_{SOS,2d} \leq \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d}.
914: \label{eq:sos2dbound}
915: \end{equation}
916: \label{thm:sos2dbound}
917: \end{theorem}
918: \begin{proof}
919: Since the dimension of $A_i^{[d]}$ is $\binom{n+d-1}{d}$, from
920: Lemma~\ref{lem:scalingjsr} and inequality (\ref{eq:quadlyapbound}) it
921: follows that:
922: \[
923: {\textstyle\binom{n+d-1}{d}}^{-\frac{1}{2}}
924: \; \rho_{SOS,2}(A_1^{[d]},\ldots,A_m^{[d]})
925: \leq
926: \rho(A_1^{[d]},\ldots,A_m^{[d]}) = \rho(A_1,\ldots,A_m)^d.
927: \]
928: Combining this with (\ref{eq:trivbound}) and the inequality (proven
929: later in Theorem~\ref{thm:3bounds}),
930: \[
931: \rho_{SOS,2d}(A_1,\ldots,A_m)^d \leq \rho_{SOS,2}(A_1^{[d]},\ldots,A_m^{[d]}),
932: \]
933: the result follows.
934: \end{proof}
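
Spelled out, and writing $N := \binom{n+d-1}{d}$, the chain of inequalities used in the proof is
\[
\rho_{SOS,2d}(A_1,\ldots,A_m)^d
\leq \rho_{SOS,2}(A_1^{[d]},\ldots,A_m^{[d]})
\leq N^{\frac{1}{2}} \, \rho(A_1^{[d]},\ldots,A_m^{[d]})
= N^{\frac{1}{2}} \, \rho(A_1,\ldots,A_m)^d,
\]
and taking $d$-th roots gives $\rho_{SOS,2d} \leq N^{\frac{1}{2d}} \, \rho(A_1,\ldots,A_m)$, which is the left-hand inequality in~(\ref{eq:sos2dbound}).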
935:
936:
937: \section{Sum of squares Lyapunov iteration}
938: \label{sec:soslyap}
939:
940: We describe next an alternative approach to obtain bounds on the
941: quality of the SOS approximation. As opposed to the results in the
942: previous section, the bounds now explicitly depend on the number of
943: matrices, but will usually be tighter in the case of small $m$.
944:
945: Consider the iteration defined by
946: \begin{equation}
947: V_0(x) = 0, \qquad V_{k+1}(x) = Q(x) + \frac{1}{\beta} \sum_{i=1}^m V_k(A_i x),
948: \label{eq:iteration}
949: \end{equation}
950: where $Q(x)$ is a fixed $n$-variate homogeneous polynomial of degree
951: $2d$ and $\beta > 0$. The iteration defines an affine map in the
952: space of homogeneous polynomials of degree $2d$. As usual, the
iteration will converge under certain assumptions on the spectral
radius of the linear part of this map.
955: \begin{theorem}
956: The iteration defined in (\ref{eq:iteration}) converges for arbitrary
957: $Q(x)$ if $\rho(A_1^{[2d]} + \cdots + A_m^{[2d]}) < {\beta}$.
958: \label{thm:convergence}
959: \end{theorem}
960: \begin{proof}
The vector space of homogeneous polynomials $\R_{2d}[x_1,\ldots,x_n]$
962: is naturally isomorphic to the space of linear functionals on
963: $(\R^n)^{[2d]}$, via the identification $V_k(x) = \langle v_k ,
964: x^{[2d]} \rangle$, where $v_k \in \R^{\binom{n+2d-1}{2d}}$ is the
965: vector of (scaled) coefficients of $V_k(x)$. Then, since $V_k(A_i x) =
966: \langle v_k, (A_i x)^{[2d]} \rangle = \langle v_k, A_i^{[2d]} x^{[2d]}\rangle=
967: \langle (A_i^{[2d]})^T v_k, x^{[2d]}\rangle$, the iteration
968: (\ref{eq:iteration}) can be simply expressed as:
969: \[
970: v_{k+1} = q + \frac{1}{\beta} \left( \sum_{i=1}^m A_i^{[2d]} \right)^T v_{k},
971: \]
972: and it is well known that an affine iteration converges if the
973: spectral radius of the linear term is less than one.
974: \end{proof}
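
The scalar case gives a simple illustration of this result. Let $n=1$, so that $A_i = a_i \in \R$, $x^{[2d]} = x^{2d}$ and $A_i^{[2d]} = a_i^{2d}$, and take $Q(x) = x^{2d}$. The symbols $c_k$ and $s$ below are notation introduced only for this illustration. Writing $V_k(x) = c_k x^{2d}$ and $s := \sum_{i=1}^m a_i^{2d}$, the iteration~(\ref{eq:iteration}) becomes
\[
c_0 = 0, \qquad c_{k+1} = 1 + \frac{s}{\beta} \, c_k,
\qquad \mbox{so that} \qquad
c_k = \sum_{j=0}^{k-1} \left( \frac{s}{\beta} \right)^j .
\]
This geometric sum converges exactly when $s < \beta$, i.e., when $\rho(A_1^{[2d]} + \cdots + A_m^{[2d]}) < \beta$, and in that case the limit is $c_\infty = \frac{\beta}{\beta - s}$.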
975:
976: For simplicity of notation, we define the following quantity,
977: corresponding to the spectral radius of the sum of the $2d$-lifted
978: matrices:
979: \begin{equation}
980: \rho_{SR,2d} := \rho(A_1^{[2d]} + \cdots + A_m^{[2d]})^\frac{1}{2d}.
981: \label{eq:rhold}
982: \end{equation}
983:
984: \begin{theorem}
985: \label{thm:sosvsnesterov}
986: The following inequality holds:
987: \[
988: \rho_{SOS,2d} \leq \rho_{SR,2d}
989: \]
990: \end{theorem}
991: \begin{proof}
992: Choose a $Q(x)$ that is in the interior of the SOS cone, e.g., $Q(x)
993: := (\sum_{i=1}^n x_i^2)^d$, and let $\beta = \rho(A_1^{[2d]} + \cdots
994: + A_m^{[2d]})+\epsilon$. The iteration~(\ref{eq:iteration}) guarantees
995: that $V_{k+1}$ is SOS if $V_{k}$ is. By induction, all the iterates
996: $V_k$ are SOS. By the choice of $\beta$ and
997: Theorem~\ref{thm:convergence}, the $V_k$ converge to some homogeneous
998: polynomial $V_\infty(x)$. By the closedness of the cone of SOS
999: polynomials, the limit $V_\infty$ is also SOS. Furthermore, we have
1000: \[
1001: \beta V_\infty(x) - V_\infty(A_i x) = \beta Q(x) + \sum_{j \not = i} V_\infty (A_j x)
1002: \]
1003: and therefore the expression on the left-hand side is SOS. This
1004: implies that $p(x):=V_\infty(x)$ is a feasible solution of the SOS
1005: relaxation (\ref{eq:SOSrelax}). Taking $\epsilon \rightarrow 0$, the
1006: result follows.
1007: \end{proof}
1008: Notice that if the spectral radius condition in
1009: Theorem~\ref{thm:convergence} is satisfied, then for any fixed $Q(x)$
1010: the corresponding limit $V_\infty(x) = \langle v_\infty,
1011: x^{[2d]}\rangle$ can be simply obtained by solving the nonsingular
1012: system of linear equations
1013: \[
1014: \left(I- \frac{1}{\beta}\sum_{i=1}^m A_i^{[2d]} \right)^T v_\infty = q,
1015: \]
1016: thus generalizing the standard Lyapunov equation. The iteration
1017: argument is only used to prove that the solution of this linear system
1018: yields a strictly positive SOS polynomial. A slightly different
1019: approach here is via the finite-dimensional version of the
1020: Krein-Rutman theorem (or generalized Perron-Frobenius); see for
1021: instance \cite{Protasov1} or \cite{ParriloKhatri}.
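
To make the connection with the standard Lyapunov equation explicit, consider the case of a single matrix $A$ and $2d=2$. Writing $Q(x) = x^T \bar{Q} x$ and $V_\infty(x) = x^T P x$ for symmetric matrices $\bar{Q}$ and $P$ (notation used only in this remark), the fixed point of the iteration~(\ref{eq:iteration}) satisfies
\[
P - \frac{1}{\beta} A^T P A = \bar{Q},
\qquad
P = \sum_{k=0}^\infty \beta^{-k} (A^T)^k \bar{Q} A^k,
\]
which is the discrete-time Lyapunov equation for the scaled matrix $A / \sqrt{\beta}$. The series converges when $\rho(A)^2 < \beta$, which is consistent with the condition of Theorem~\ref{thm:convergence} since $\rho(A^{[2]}) = \rho(A)^2$.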
1022:
1023: \begin{theorem}
1024: The SOS relaxation (\ref{eq:SOSrelax}) satisfies:
1025: \[
1026: m^{-\frac{1}{2d}} \, \rho_{SOS,2d} \leq \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d}.
1027: \]
1028: \label{thm:msos2dbound}
1029: \end{theorem}
1030: \begin{proof}
1031: This follows directly from inequality~(\ref{eq:trivbound}), and the fact that
\begin{align*}
\rho_{SOS,2d} &\leq \rho\left(\sum_{i=1}^m A_i^{[2d]}\right)^\frac{1}{2d} \\
&\leq m^\frac{1}{2d} \cdot \rho (A_1^{[2d]}, \ldots, A_m^{[2d]})^\frac{1}{2d} \\
&= m^\frac{1}{2d} \cdot \rho \left(A_1, \ldots , A_m \right),
\end{align*}
where the first inequality is Theorem~\ref{thm:sosvsnesterov}, the
second one follows from the general fact that $\rho(A_1+\cdots+A_m)
\leq m \rho(A_1,\ldots,A_m)$ (see e.g., Corollary 1 in \cite{BlNes05}),
applied to the lifted matrices $A_i^{[2d]}$, and
the final equality from Lemma~\ref{lem:scalingjsr}.
1041: \end{proof}
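This bound can be used to estimate the degree required for a prescribed relative accuracy. For $0 < \epsilon < 1$ and $m \geq 2$, the factor $m^{-\frac{1}{2d}}$ is at least $1-\epsilon$ if and only if
\[
d \geq \frac{\log m}{2 \log \frac{1}{1-\epsilon}} \approx \frac{\log m}{2 \epsilon},
\]
where the approximation is valid for small $\epsilon$. As a rough numerical illustration, for $m=2$ and $\epsilon = 0.05$ the right-hand side is approximately $6.76$, so $d=7$ suffices, and indeed $2^{-\frac{1}{14}} \approx 0.9517$.
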
1042: The iteration~(\ref{eq:iteration}) is the natural generalization of
1043: the Lyapunov recursion for the single matrix case, and of the
1044: construction by Ando and Shih in \cite{Ando98} for the quadratic
1045: case. By the remarks in Section~\ref{sec:symmalgebra} above, and as
1046: described in more detail in the next section, it can be shown that the
1047: quantity $\rho_{SR,2d}$ is essentially equal to those defined by
1048: Protasov in \cite[\S 4]{Protasov1} and Blondel and Nesterov in
1049: \cite{BlNes05}. As a consequence of Theorem~\ref{thm:sosvsnesterov},
1050: the SOS-based approach will \emph{always} produce estimates at least
1051: as good as the ones given by these procedures.
1052:
1053: \section{Comparison with earlier techniques}
1054: \label{sec:comparison}
1055:
1056: In this section we compare the $\rho_{SOS,2d}$ approach with some
earlier bounds from the literature. We show that our bound is never
weaker than those obtained by any of the other procedures.
1059:
1060: \subsection{Methods of Protasov and Blondel-Nesterov}
1061:
1062: Protasov \cite{Protasov1} has shown that an upper bound on the
1063: ``standard'' joint spectral radius can be computed via the so-called
1064: joint $p$-radius, a generalization of the definition~(\ref{eq:defjsr})
1065: involving $p$-norms. Furthermore, he has shown that in the case of
1066: even integer $p$, the value of the $p$-radius of an irreducible finite
1067: set of matrices exactly corresponds to the spectral radius of a single
1068: operator, that can in principle be constructed based on the matrices
1069: $A_i$.
1070:
1071: Independently, Blondel and Nesterov \cite{BlNes05} developed a
1072: technique based on the calculation of the spectral radius of
1073: ``lifted'' matrices. In fact, they present two different lifting
1074: procedures (``Kronecker'' and ``semidefinite'' liftings), and in
1075: Section~5 of their paper, they describe a family of bounds obtained by
1076: arbitrary combinations of these two liftings.
1077:
1078: Both of these methods are in fact equivalent to our construction of
1079: $\rho_{SR,2d}$ in Section~\ref{sec:soslyap}, in the sense that they
1080: all yield exactly the same numerical value. By
Theorem~\ref{thm:sosvsnesterov}, they are thus never tighter than the
1082: SOS-based construction. The bound defined by $\rho_{SR,2d}$
1083: in~(\ref{eq:rhold}) relies on a single canonically defined lifting,
1084: and requires much less numerical effort than the Blondel-Nesterov
1085: construction. Furthermore, instead of the somewhat more complicated
construction of Protasov, the entries of the lifted
matrices are given by the simple formula~(\ref{eq:perm}), making a
1088: computer implementation straightforward, with no irreducibility
1089: assumptions being required.
1090:
1091: It can be shown that our construction (or Protasov's) exactly
1092: corresponds to a fully symmetry-reduced version of the
1093: Blondel-Nesterov procedure, thus yielding equivalent bounds, but at a
1094: much smaller computational cost since the corresponding matrices are
1095: exponentially smaller (for fixed $n$, the size grows as $O(d^{n-1})$
1096: as opposed to $O(n^{2d})$). Therefore, even if no SDPs are to be
1097: solved (as would be required by the tighter bound $\rho_{SOS,2d}$),
1098: the formulation in terms of the matrices $A_i^{[2d]}$ still has many
1099: advantages.
1100:
1101: \begin{table}[t]
1102: \begin{center}
1103: \begin{tabular}{|c|c || c|c || c|c || c|c|}
1104: \hline
1105: & & \multicolumn{2}{c||}{ \cite{BlNes05}, Kronecker } &
1106: \multicolumn{2}{c||}{ \cite{BlNes05}, semidefinite } &
1107: \multicolumn{2}{c|}{This paper} \\
1108: \cline{3-8}
1109: Steps / $2d$ &Accuracy&$n=2$& $n=10$ &$n=2$ & $n=10$ & $n=2$ & $n=10$ \\
1110: \hline\hline
1111: 1 / 2 & 0.707 & 4 & 100 & 3 & 55 & 3 & 55 \\
1112: 2 / 4 & 0.840 & 16 & 10000 & 6 & 1540 & 5 & 715 \\
1113: 3 / 8 & 0.917 & 256 & $10^8$ & 21 & 1186570 & 9 & 24310 \\
1114: 4 / 16 & 0.957 & 65536 & $10^{16}$ & 231 & $7.04 \times 10^{11}$ & 17 & 2042975 \\
1115: 5 / 32 & 0.978 & $4.29\times 10^9$ & $10^{32}$ & 26796 & $2.48 \times 10^{23}$& 33 & $3.5 \times 10^8$ \\
1116: \hline
1117: \end{tabular}
1118: \end{center}
1119: \caption{Comparison of matrix sizes for the different lifting
1120: procedures to compute $\rho_{SR,2d}$. The matrix size for the
Kronecker lifting is $n^{2d}$, while the size for the recursive semidefinite
lifting is given by the recursion $s_{2k} = \binom{s_k+1}{2}$
with $s_1=n$, and the size for the symmetric algebra approach is
1124: $\binom{n+2d-1}{2d}$. The accuracy estimates correspond to the case of
1125: two matrices, i.e., $m=2$.}
1126: \label{tab:BNtwo}
1127: \end{table}
1128: As an illustrative comparison of the advantages of this reduced
1129: formulation, in Table~\ref{tab:BNtwo} we present the sizes of the
1130: matrices required by the method in~\cite{BlNes05} (using the
1131: ``Kronecker'' and ``recursive semidefinite'' liftings) and our
1132: approach to $\rho_{SR,2d}$ via the symmetric algebra. The data in
1133: Table~\ref{tab:BNtwo} corresponds to that in~\cite[p.~266]{BlNes05}
1134: (with a minor misprint corrected).
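
As an illustration of how the entries of Table~\ref{tab:BNtwo} are obtained, consider the row with $2d=8$ and the columns with $n=10$. The Kronecker lifting has size $10^8$. For the recursive semidefinite lifting, three applications of the recursion give
\[
s_1 = 10, \qquad s_2 = \binom{11}{2} = 55, \qquad s_4 = \binom{56}{2} = 1540, \qquad s_8 = \binom{1541}{2} = 1186570,
\]
while the symmetric algebra approach requires matrices of size $\binom{17}{8} = 24310$. The accuracy entry in that row is $m^{-\frac{1}{2d}} = 2^{-\frac{1}{8}} \approx 0.917$.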
1135:
1136:
1137: \subsection{Common quadratic Lyapunov functions}
1138:
1139: This method corresponds to finding a common quadratic Lyapunov
1140: function, either directly for the matrices $A_i$, or for the lifted
1141: matrices $A_i^{[d]}$. Specifically, let
1142: \[
1143: \rho_{CQ,2d} := \inf \, \left \{ \; \gamma \; \; | \; \; \gamma^{2d} P -
1144: (A_i^{[d]})^T P A_i^{[d]} \succeq 0, \quad P \succ 0 \right \}.
1145: \]
1146: This is essentially equivalent to what is discussed in Corollary 3 of
1147: \cite{BlNes05}, except that the matrices involved in our approach are
1148: exponentially smaller (of size $\binom{n+d-1}{d}$ rather than $n^d$),
1149: as all the symmetries have been taken out\footnote{There seems to be a
1150: typo in equation (7.4) of \cite{BlNes05}, as all the terms $A_i^k$
1151: should likely read $A_i^{\otimes k}$.}. Notice also that, as a
1152: consequence of their definitions, we have
1153: \[
1154: \rho_{CQ,2d}(A_1,\ldots,A_m)^d = \rho_{SOS,2}(A_1^{[d]},\ldots,A_m^{[d]}).
1155: \]
1156:
1157: We can then collect most of these results in a single theorem:
1158: \begin{theorem}
1159: The following inequalities between all the bounds hold:
1160: \begin{equation}
1161: \rho(A_1,\ldots,A_m) \leq \rho_{SOS,2d} \leq \rho_{CQ,2d} \leq
1162: \rho_{SR,2d}.
1163: \label{eq:ineqs}
1164: \end{equation}
1165: \label{thm:3bounds}
1166: \end{theorem}
1167: \begin{proof}
1168: The left-most inequality is~(\ref{eq:trivbound}). The right-most
1169: inequality follows from a similar (but stronger) argument to the one
1170: given in Theorem~\ref{thm:sosvsnesterov} above, since the spectral
1171: radius condition $\rho(A_1^{[2d]}+\cdots + A_m^{[2d]})< \beta$
1172: actually implies the convergence of the matrix iteration in
1173: $\mathcal{S}^{N}$ given by
1174: \[
1175: P_{k+1} = Q + \frac{1}{\beta} \sum_{i=1}^m (A_i^{[d]})^T P_k A_i^{[d]}, \qquad P_0 = I.
1176: \]
1177:
1178: For the middle inequality, let $p(x):= (x^{[d]})^T P
1179: x^{[d]}$. Since $P \succ 0$, it follows that $p(x)$ is SOS. From
1180: $\gamma^{2d} P - (A_i^{[d]})^T P A_i^{[d]} \succeq 0$, left- and
1181: right-multiplying by $x^{[d]}$, we have that $\gamma^{2d} p(x) - p(A_i
1182: x)$ is also SOS, and thus $p(x)$ is a feasible solution
1183: of~(\ref{eq:SOSrelax}), from where the result directly follows.
1184: \end{proof}
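
In more detail, the two SOS claims in the argument for the middle inequality can be seen as follows. Since $P \succ 0$, it admits a factorization $P = L^T L$ (the factor $L$ is introduced only for this explanation), and thus $p(x) = ||L x^{[d]}||^2$ is a sum of squares of forms of degree $d$. Using $(A_i x)^{[d]} = A_i^{[d]} x^{[d]}$, we also have
\[
\gamma^{2d} p(x) - p(A_i x) =
(x^{[d]})^T \left[ \gamma^{2d} P - (A_i^{[d]})^T P A_i^{[d]} \right] x^{[d]},
\]
and a form of this type with a positive semidefinite matrix in the middle is a sum of squares by the same factorization argument.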
1185:
1186: \begin{remark}
1187: We always have $\rho_{SOS,2} = \rho_{CQ,2}$, since both correspond
1188: to the case of a common quadratic Lyapunov function for the matrices $A_i$.
1189: \end{remark}
1190:
1191: \subsection{Computational cost}
1192: In this section we quantify the computational cost of the bound
1193: $\rho_{SOS,2d}$. In the following calculations we keep $d$ fixed, and
1194: study the scaling behavior as a function of the dimension $n$.
1195:
1196: As mentioned in Section~\ref{sec:sosnorms}, solving a semidefinite
1197: programming problem typically requires several Newton iterations, with
1198: the cost of each iteration being dominated by the construction of the
1199: Hessian and solution of the corresponding linear system. For the SOS
1200: bound $\rho_{SOS,2d}$, the underlying SDP problem has $m+1$ matrix
1201: inequalities corresponding to the SOS constraints
1202: in~(\ref{eq:SOSrelax}), each of dimension $\binom{n+d-1}{d} \approx
1203: \frac{1}{d!} \cdot n^d$, which is $O(n^d)$ for fixed $d$. The number of
1204: decision variables is approximately $m \cdot \binom{n+2d-1}{2d}
1205: \approx m \cdot n^{2d}$. Thus, using a simple bisection method for
1206: $\gamma$, exploiting the block-diagonal structure, and the fact that
1207: the number of Newton iterations is essentially constant, we obtain
1208: that the approximate cost of obtaining an $\epsilon$-approximate
1209: solution of $\rho_{SOS,2d}$ is $O(m \cdot n^{6d} \cdot \log
1210: \frac{1}{\epsilon})$, where $d$ is chosen such that $\epsilon \approx
1211: \frac{n}{2} \frac{\log d}{d}$ or $\epsilon \approx m^{-\frac{1}{2d}}$,
1212: depending on whether we use bounds that depend on the number of
1213: matrices (Theorem~\ref{thm:msos2dbound}) or not (Theorem
1214: \ref{thm:sos2dbound}).
1215:
1216: We remark that these quantities are a relatively coarse estimate of
1217: the best possible algorithmic complexity, since very little structure
1218: of the corresponding SDP problem is being exploited. It is known that
1219: for structured problems such as the ones appearing here much more
1220: efficient SDP-based algorithms can be developed. In particular, in the
1221: context of sum of squares problems several techniques are known to
1222: exploit some of the available structure for more efficient
1223: computation; see \cite{GHNV,LofbergParrilo,RohVandenberghe}.
1224:
1225: \subsection{Examples}
1226: We present next two numerical examples that compare the described
1227: techniques. In particular, we show that the bounds in
1228: Theorem~\ref{thm:3bounds} can all be strict.
1229:
1230: \begin{example}
1231: Here we revisit the construction presented earlier in
1232: Example~\ref{ex:ando}. For the matrices given there we have:
1233: \begin{align*}
1234: \rho_{SOS,2} &= \sqrt{2}, &
1235: \rho_{CQ,2}&= \sqrt{2}, &
1236: \rho_{SR,2d} &= \sqrt[2d]{2},
1237: \\
1238: \rho_{SOS,4} &= 1, &
1239: \rho_{CQ,4}&= 1. &
1240: %
1241: \end{align*}
1242: \end{example}
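
The value of $\rho_{SR,2d}$ reported above can be verified by hand in the case $2d=2$. For the matrices of Example~\ref{ex:ando} we have $A_1 x = x_1 [1,1]^T$ and $A_2 x = x_2 [1,-1]^T$, so that $(A_1 x)^{[2]} = x_1^2 \, [1,\sqrt{2},1]^T$ and $(A_2 x)^{[2]} = x_2^2 \, [1,-\sqrt{2},1]^T$. Therefore
\[
A_1^{[2]} + A_2^{[2]} =
\begin{bmatrix}
1 & 0 & 1 \\
\sqrt{2} & 0 & -\sqrt{2} \\
1 & 0 & 1
\end{bmatrix},
\]
whose characteristic polynomial is $\lambda^2 (\lambda - 2)$. The spectral radius of the sum is thus equal to $2$, and $\rho_{SR,2} = \sqrt{2}$.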
1243:
1244: \begin{example}
1245: \label{ex:threemats}
1246: Consider the three $4 \times 4 $ matrices (randomly generated) given by:
1247: \[
1248: A_1 =
1249: \left[
1250: \begin{array}{rrrr}
1251: 0 & 1 & 7 & 4 \\
1252: 1 & 6 & -2 & -3 \\
1253: -1 & -1 & -2 & -6 \\
1254: 3 & 0 & 9 & 1
1255: \end{array}
1256: \right],
1257: \quad
1258: A_2 =
1259: \left[
1260: \begin{array}{rrrr}
1261: -3 & 3 & 0 & -2\\
1262: -2 & 1 & 4 & 9\\
1263: 4 & -3 & 1 & 1\\
1264: 1 & -5 & -1 & -2
1265: \end{array}
1266: \right],
1267: \quad
1268: A_3 =
1269: \left[
1270: \begin{array}{rrrr}
1271: 1 & 4 & 5 & 10 \\
1272: 0 & 5 & 1 & -4 \\
1273: 0 & -1 & 4 & 6 \\
1274: -1 & 5 & 0 & 1
1275: \end{array}
1276: \right].
1277: \]
The values of the different approximations are presented in
1279: Table~\ref{tab:comparison}. A lower bound is $\rho(A_1
1280: A_3)^\frac{1}{2} \approx 8.9149$, which is extremely close (and
1281: perhaps exactly equal) to the upper bound $\rho_{SOS,4}$. Notice from
1282: the $d=2$ entry of Table~\ref{tab:comparison} that all the
1283: inequalities~(\ref{eq:ineqs}) can be strict.
1284:
1285:
1286: \begin{table}[t]
1287: \begin{center}
1288: \begin{tabular}{|c|cc|ccc|}
\hline
$d$ & $\dim A_i^{[d]}$ & $\dim A_i^{[2d]}$ & $\rho_{SOS,2d}$ & $\rho_{CQ,2d}$ & $\rho_{SR,2d}$ \\
\hline
1 & 4 & 10 & 9.761 & 9.761 & 12.519 \\
2 & 10 & 35 & 8.92 & 9.01 & 9.887 \\
3 & 20 & 84 & 8.92 & 8.92 & 9.3133 \\
\hline
1293: \end{tabular}
1294: \end{center}
1295: \caption{Comparison of the different approximations for Example~\ref{ex:threemats}.}
1296: \label{tab:comparison}
1297: \end{table}
1298: \end{example}
1299:
1300: \section{Conclusions}
1301: \label{sec:conclusions}
1302:
1303: We introduced a novel scheme for the approximation of the joint
1304: spectral radius of a set of matrices using sum of squares
1305: programming. The method is based on the use of a multivariate
1306: polynomial to provide a norm-like quantity under which all matrices
1307: are contractive. We provided an asymptotically tight estimate for the
1308: quality of the bound, which is independent of the number of
1309: matrices. We also proposed an alternative bound, that depends on the
1310: number $m$ of matrices, based on a generalization of a Lyapunov
1311: iteration.
1312:
1313: Our results can be alternatively interpreted in a simpler way as
1314: providing a trajectory-preserving lifting to a higher dimensional
1315: space, and proving contractiveness with respect to an ellipsoidal norm
1316: in that space. In this case, a weaker estimate can be obtained by
1317: computing the spectral radius of a fixed matrix. These results
1318: generalize earlier work of Ando and Shih~\cite{Ando98}, Blondel,
1319: Nesterov and Theys~\cite{BlNT04}, and provide an improvement over the
1320: lifting procedure of Blondel and Nesterov~\cite{BlNes05}. The good
1321: performance of our procedure was also verified using numerical
1322: examples.
1323:
1324: \paragraph{Acknowledgement}
1325: We thank the referees for their careful reading of the manuscript, and
1326: their many useful suggestions.
1327:
1328: %
1329: %
1330:
1331: \bibliographystyle{alpha}
1332: \bibliography{jsr}
1333:
1334: %
1335: %
1336: \end{document}
1337: