1: \documentclass[10pt, conference]{ieeeconf}
2: %\documentclass[a4paper, 10pt, conference]{ieeeconf} % Use this line for a4
3: % paper
4:
5: \IEEEoverridecommandlockouts % This command is only
6: % needed if you want to
7: % use the \thanks command
8: \overrideIEEEmargins
9: % See the \addtolength command later in the file to balance the column lengths
10: % on the last page of the document
11:
12: % The following packages can be found on http://www.ctan.org
13: \usepackage{graphics} % for pdf, bitmapped graphics files
14: \usepackage{epsfig} % for postscript graphics files
15: \usepackage{mathptmx} % assumes new font selection scheme installed
16: \usepackage{times} % assumes new font selection scheme installed
17: \usepackage{amsmath} % assumes amsmath package installed
18: \usepackage{amssymb} % assumes amsmath package installed
19:
20: %\newtheorem{theorem}{Theorem}
21: %\newtheorem{lemma}{Lemma}
22: %\title{\LARGE \bf
23: %Tour de Finance }
24:
25: %\author{
26: %C. Dougal, D. Merriman, J. Humpherys, and S. Warnick
27: %\\
28: %\\
29: %Information and Decision Algorithms Laboratories, Brigham Young University, UT\\
30: %http://idealabs.byu.edu
31: %\documentclass[12pt,draftcls,onecolumn]{tac/IEEEtran}
32: %{article}
33:
34: % The following packages can be found on http:\\www.ctan.org
35: %\usepackage{graphics} % for pdf, bitmapped graphics files
36: %\usepackage{epsfig} % for postscript graphics files
37: %\usepackage{mathptmx} % assumes new font selection scheme installed
38: %\usepackage{times} % assumes new font selection scheme installed
39: %\usepackage{amsmath} % assumes amsmath package installed
40: %\usepackage{amssymb} % assumes amsmath package installed
41: \usepackage{psfrag}
42: \usepackage{subfigure}
43:
44: %\input{../../../jorge/latex/head_2}
45:
46: %\setstretch{1.5} % 1 1/2 spacing
47:
48: %\def\ssp{\def\baselinestretch{1.0}\large\normalsize}
49: %\def\dsp{\def\baselinestretch{1.5}\large\normalsize}
50:
51: %\topmargin=-21mm
52: %\setlength{\textwidth}{6.5in}
53: %\setlength{\textheight}{9.52in}
54: %\setlength{\evensidemargin}{-0.3cm}
55: %\setlength{\oddsidemargin}{-0.1cm}
56:
57: %So that figures, tables and text can be put in the same column
58: \renewcommand{\textfraction}{0.0}
59: \renewcommand{\floatpagefraction}{1.0}
60: \renewcommand{\topfraction}{1.0}
61: \renewcommand{\bottomfraction}{1.0}
62: % So that text and floats are close together
63: \addtolength{\textfloatsep}{-0.2cm}
64: \setlength{\intextsep}{0.1in}
65: \setlength{\floatsep}{0.1in}
66:
67: \def\reals{\hbox{I\kern -.19em R}}
68: \newtheorem{proposition}{Proposition}
69: \newtheorem{theorem}{Theorem}
70: \newtheorem{definition}{Definition}
71: \newtheorem{corollary}{Corollary}
72: \newtheorem{lemma}{Lemma}
73: \newcommand{\myBox}{\hfill\rule{2mm}{2mm}}
74:
75: \newcommand{\todo}[1]{\vspace{5 mm}\par \noindent
76: \framebox{\begin{minipage}[c]{8.3cm} \tt #1
77: \end{minipage}}\vspace{5 mm}\par}
78:
79: %\addtolength{\topmargin}{-0.875in}
80: %\addtolength{\oddsidemargin}{-0.875in}
81:
82:
83: %\newcommand{\myBox}{\hfill\rule{2mm}{2mm}}
84:
85: % ====== title page information ====
86: \title{ \vspace{-10mm}\Huge Dynamical Structure Functions for the Estimation of LTI Networks with Limited Information\vspace{.5cm}}
87: \author{\hspace{.75in}\begin{tabular}{c}Jorge Gon\c{c}alves\\Control Group\\Department of Engineering\\Cambridge University\\Cambridge CB2 1PZ, UK\\{\tt jmg77@cam.ac.uk}\end{tabular}\hspace{1in}\begin{tabular}{c}Sean Warnick\\Information and Decision Algorithm Laboratories\\Department of Computer Science\\Brigham Young University\\Provo, UT 84602, USA\\{\tt sean@cs.byu.edu}\end{tabular}}
88: %\thanks{S. Warnick is with the Department of Computer Science,
89: % Brigham Young University, 3361 TMCB PO Box 26576, Provo, UT 84602, USA
90: % {\tt\small sean@cs.byu.edu}}%
91: %\thanks{J. Goncalves is with the Department of Engineering, Cambridge University,
92: % Cambridge, UK
93: % {\tt\small jmg77@cam.ac.uk}}%
94: %}
95: % \\
96: %{\tt \small http://www.cds.caltech.edu/$\sim$jmg/}
97:
98: %\date{}
99:
100:
101: \begin{document}
102: \bibliographystyle{plain}
103:
104: \maketitle
105:
106: \vspace{-1.65cm}
107: %\hspace{11cm}
108: %{\LARGE ACC00-IEEE1266}
109: %{\LARGE Regular paper}
110: %\vspace{8cm}
111: \thispagestyle{empty}%\pagestyle{empty}
112:
113: \begin{abstract}
114: This research explores the role and representation of network
115: structure for LTI Systems. We demonstrate that transfer functions
116: contain no structural information without more assumptions being
117: made about the system, assumptions that we believe are unreasonable
118: when dealing with truly complex systems. We then introduce
119: Dynamical Structure Functions as an alternative, graphical-model-based
120: representation of LTI systems that contains both dynamical and
121: structural information about the system. We use Dynamical Structure to
122: prove necessary and sufficient conditions for estimating structure
123: from data, and demonstrate, for example, the danger of attempting to
124: use steady-state information to estimate network structure.
125:
126: % \todo{Instead of focusing on the steady-state issue, I would just say: ``and demonstrate with simple examples how erroneous network structures can be obtained if those conditions are not satisfied.'' or something like that.}
127: \end{abstract}
128:
129:
130: %\vspace{-0.6cm}
131: \section{Introduction}
132:
133: One of the fundamental issues for modeling, identifying, and
134: controlling complex networked systems is inferring system structure
135: from input-output data. Structure is often the key for understanding
136: a variety of complex systems because it enables a decomposition of the
137: complete system into an interconnection of subsystems. When analysis
138: of the subsystems is comparatively simple, and the interconnection
139: structure is well understood, then the behavior of the complex system
140: can be deduced from an understanding of its components. Moreover,
141: exploiting structural information can tremendously reduce the
142: conservatism of robust solutions designed to compensate for system
143: uncertainty. This impact on complexity and uncertainty makes
144: structural information extremely important in the analysis of complex
145: networked systems.
146:
147: Examples of scientists working on identifying or exploiting network
148: structure arise in a variety of disciplines. Social scientists have
149: developed a rich literature on the use of network models to describe
150: interpersonal associations, perhaps one of the most famous works being
151: Milgram's ``small world'' experiment in the 1960s, in which letters
152: passed from person to person were able to reach a particular target
153: individual in only about six steps \cite{Milgram}. More recently,
154: attention has focused on networks of business communities
155: \cite{Galaskiewicz,Mariolis, Mizruchi}, internet-enabled virtual
156: communities \cite{Holme}, citation networks in scientific communities
157: \cite{Redner, Seglen}, preference networks for product recommender
158: systems \cite{Goldberg, Resnick}, distribution networks \cite{Amaral},
159: and the detection and destabilization of terrorist networks
160: \cite{carley}. Epidemiologists have developed models for the dynamics
161: of both epidemic and endemic diseases spreading through population
162: networks \cite{Hethcote}, computer scientists have developed
163: algorithms for searching over networks that are deployed in a number
164: of popular applications \cite{Brin}, and biologists use microarray and
165: other data sources to infer the regulation structure in genomic,
166: proteomic, and metabolic networks \cite{Guelzim, Maslov, Stelling,
167: Uetz}.
168:
169: Discovering structure from data, however, can be difficult. Typical
170: identification methods do not emphasize structure estimation, but
171: rather focus on behavior generalization by selecting a model that
172: accurately predicts system outputs for unobserved inputs. As long as
173: the dynamic behavior of the system is accurately described, the
174: question of structure is often avoided altogether. For many
175: applications, various model structures for the same input-output map
176: are equally useful for forecasting and control. Nevertheless,
177: sometimes it is important not only to describe the system dynamics
178: accurately, but to do so with a model that correctly represents the
179: structure of the original system.
180:
181: In contrast with these identification methods that emphasize system
182: dynamics over structure, inference methods have been developed that
183: emphasize structure over dynamics. These methods employ graphical
184: models to describe network structure. Nodes represent system states,
185: understood to be random variables, and edges indicate conditional
186: dependence between variables. Using Bayes' rule, measurements can then
187: be used to update prior distributions believed to characterize
188: relationships throughout the network. A rich literature has grown in
189: this area, and even issues of inferring causality from correlation
190: have been addressed at some level \cite{Jensen, Jordan, Pearl}.
191:
192: Nevertheless, although these Bayesian Networks provide an efficient
193: way to parameterize the joint probability distribution characterizing
194: the entire system, conditional probabilities do not capture system
195: dynamics, and the most successful inference techniques only work on
196: directed acyclic graphs \cite{Cowell}. For some applications, such as
197: modeling the citation network for a particular body of research,
198: assuming the network is acyclic is reasonable since papers generally
199: only cite previously published work. There are many applications,
200: however, such as modeling biological or social or economic networks,
201: where such an assumption insisting on the absence of feedback
202: dependencies between system states would be entirely unreasonable.
203: Moreover, often an accurate representation of system dynamics is as
204: important as that of system structure. In these situations, new
205: methods are needed.
206:
207: This paper introduces Dynamical Structure Functions as a structurally
208: accurate representation of complex LTI systems that does not ignore
209: system dynamics. We begin in the next section by demonstrating that
210: transfer functions contain no structural information without more
211: assumptions being made about the system, assumptions that we believe
212: are unreasonable when dealing with truly complex systems. We also
213: highlight some common pitfalls when estimating structure from data.
214: We then introduce in Section \ref{se:net} the Dynamical Structure
215: Function of an LTI system and discuss its properties.
216: Section~\ref{se:main} then uses Dynamical Structure to provide
217: necessary and sufficient conditions for estimating structure from
218: data, and an example is provided illustrating the danger of using only
219: steady-state information to estimate structure. Section~\ref{se:conc}
220: then concludes with a discussion of future work.
221:
222: \section{Background: Structure Estimation and Dynamic Systems}\label{se:mpf}
223:
224: Consider the network characterized by the linear system
225: \begin{equation}\label{eq:LTI}
226: \left\{\begin{array}{lll}
227: \dot x & = & Ax + Bu \\
228: y & = & Cx
229: \end{array}\right .
230: \end{equation}
231: where $x\in R^n$, $u\in R^m$, $y\in R^p$, and $C=[I \ \ 0]$. We are
232: interested in inferring the causal dependencies between the $p$
233: measured states, $y$, from limited data. Typically, $m<n$, $p<n$, and $n$
234: itself is unknown.
235:
236: In this work we do not assume that the system (\ref{eq:LTI}) is both controllable and observable from the particular inputs and outputs specified by $u$ and $y$. In the complex systems context, such an assumption would be unreasonable to impose, since the number of inputs and outputs is assumed to be very small compared to the total number of states. Thus, assuming controllability and observability would restrict our attention to networks with very special structure. As a result, we cannot assume that (\ref{eq:LTI}) is a minimal realization of the corresponding input-output transfer
237: function, $G$, given by
238: \begin{equation}\label{eq:G}
239: G(s) = C\left ( sI -A \right ) ^{-1}B
240: \end{equation}
241:
242: In this work we also do not assume knowledge of the system's order. Thus, the true system, (\ref{eq:LTI}), has a particular causal structure and complexity that we can only detect through our interaction with the system at $u$ and $y$. Nevertheless, we do assume
243: throughout this work that the transfer function, $G$, can be obtained
244: from the available data, $u$ and $y$, using standard identification
245: methods.
246:
247: Notice that the transfer function does not directly reveal structural
248: information of the system. For example, consider the system
249: \begin{equation}
250: \label{eq:TFexample}
251: A=\left[\begin{array}{cccc}-1&0&0&1\\ .25&-1&0&0\\
252: 0&1&-1&0\\0&0&.25&-1\end{array}\right]\ \ \ \ \
253: B=\left[\begin{array}{ccc}1&0&0\\0&1&0\\0&0&1\\0&0&0\end{array}\right]
254: \end{equation}
255: \[
256: C=\left[\begin{array}{cccc}1&0&0&0\\0&1&0&0\\0&0&1&0\end{array}\right]
257: \]
258: Note that this system has a very definite ring structure, where $x_1\longrightarrow x_2 \longrightarrow x_3 \longrightarrow x_4 \longrightarrow x_1$. Nevertheless, the associated transfer function, $G$, is given by
259: \[\small
260: \left[\begin{array}{ccc} s^3+3s^2+3s+1 &.25 & .25s+.25 \\
261: .25s^2+.5s+.25 & s^3+3s^2+3s+1 & .0625 \\
262: .25s+.25 & s^2+2s+1 & s^3+3s^2+3s+1\end{array}\right] \frac 1 {p(s)}
263: \]
264: where $p(s)=s^4+4s^3+6s^2+4s+.9375$, which reveals nothing about the
265: ring structure of the system. Although the structure is easy to read
266: from the actual state realization of the system, a transfer function
267: identified from input-output data--{\it even if identified
268: perfectly}--does not directly yield any structural information about
269: the system.
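
As a quick check of the denominator (our own calculation, not the result of an identification experiment), write $a=s+1$. Because the only cycle in $A$ is the ring itself, with edge weights $.25$, $1$, $.25$ and $1$, the characteristic polynomial reduces to
\[
p(s) = \det(sI-A) = a^4 - (.25)(1)(.25)(1) = (s+1)^4 - .0625 = s^4+4s^3+6s^2+4s+.9375 .
\]
Every entry of $G$ is nonzero and shares this denominator, so the zero pattern of $A$ that encodes the ring is not visible in $G$.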
270:
271: Given this difficulty using the transfer function to obtain structural
272: information, one may ask why not identify the state space realization
273: directly. Nevertheless, it is difficult to identify a realization of
274: the system without knowing the order of the system. In this work, we
275: assume that structural information must be obtained from limited data,
276: that is, with measurements that constitute only part of the complete
277: state vector. Moreover, we do not assume knowledge of the full system complexity, or true system order. Later, we demonstrate how incorrectly assuming knowledge of
278: the system order can lead to erroneous structural estimates.
279:
280: Thus, transfer functions are generally obtainable from input-output data, but they contain no structural information. At the other extreme, state space realizations contain all information about the system, but they are difficult to obtain from limited information. We are interested in something in between, a representation that may still be obtainable from input-output data, but that also contains information about both the dynamics and the structure of the system.
281:
282: Structure is typically represented by a graph. Nodes represent system
283: variables, and edges represent interaction between variables.
284: Directed edges capture notions of directed influence, often quantified
285: by conditional probabilities. We will consider a directed edge to
286: indicate a causal relationship between variables. Regardless of how
287: the notion of directed influence is represented, however, the absence
288: of an edge between variables indicates a kind of independence between
289: those variables; $z_1 \longrightarrow z_2 $ instead of $z_1
290: \rightleftharpoons z_2$ means that $z_1$ does not depend directly on
291: $z_2$. That is, any influence $z_2$ may have on $z_1$ may only occur
292: indirectly through $z_2$'s influence on other {\it explicit} variables
293: (nodes) in the network, and their direct influence, in turn, on $z_1$.
294: In particular, it is critical to note that $z_1 \rightarrow z_2$ means
295: there may {\it not} be some hidden variable, $z_i$, {\it that has not
296: been represented in the graphical network model} through which $z_2$
297: influences $z_1$. For structure to have meaning, even hidden,
298: unmodeled variables should respect the graph defining the network and
299: only operate within edges. This has important implications for
300: dealing with uncertainty.
301:
302: In its simplest form, then, structure is simply a square binary matrix
303: $S$ with $s_{ij}=1$ indicating the presence of an edge directed from
304: $z_j$ to $z_i$. For the system $T$ given in (\ref{eq:LTI}) we would
305: define our explicitly modeled variables to be $z = [z_1;\;z_2] = [y;\;u]$. Simple structure, $S$, would then be a $p+m$ by $p+m$
306: binary matrix; for the example (\ref{eq:TFexample}) we would have:
307: \begin{equation}
308: \label{eq:structure}
309: S = \left[\begin{array}{cc}Q_T&P_T\\P_F&Q_F\end{array}\right]=\left[\begin{array}{ccc|ccc}1&0&1&1&0&0\\ 1&1&0&0&1&0\\ 0&1&1&0&0&1\\ \hline 0&0&0&1&0&0
310: \\0&0&0&0&1&0\\ 0&0&0&0&0&1\end{array}\right].
311: \end{equation}
312:
313: Note that variables may automatically influence themselves, since they may be recursively generated; thus the diagonal of $S$ is the identity. The blocks $Q_T$, $P_T$, $Q_F$, and $P_F$ correspond to the partition of $z$ as inputs and outputs of the system $T$. The {\it input structure}, $P_T$, describes how inputs, $u$, influence the measured variables, $y$. The {\it output structure}, $Q_T$, describes how the measured variables, $y$, influence each other. Under the interpretation that $y$ corresponds to part of the system state vector, the output structure $Q_T$ may also be called the {\it internal structure} of the system $T$ (with $P_T$ then being called the {\it control structure} of $T$). The remaining blocks, $Q_F$ and $P_F$, describe the feedback environment of $T$. In general, when $T$ is in feedback with an operator $F$, $P_F$ is the input or control network of the feedback operator $F$, while $Q_F$ is $F$'s internal or output structure. When no feedback operator is defined and the inputs $u$ are truly considered free variables, then $P_F$ is zero (since $u$ does not depend on $y$), and $Q_F$ is the identity (since the inputs depend only on themselves and do not influence each other).
314:
315: Just as a transfer function description of a system grows or shrinks with the number of inputs and outputs of the system, the structure matrix $S$ also grows or shrinks with the number of system inputs and outputs. We call the number of outputs, $p$, the {\it $p^{th}$-order resolution} of the structural representation. Thus, when three of the states of a fourth order system are measured, the resolution of the structural representation is three, and $Q_T$ will be $3\times 3$. The fourth state is hidden and does not appear in the third-order resolution of $S$ in any way. Nevertheless, correct structural representations are {\it consistent}, in that zeros appear in lower-order (coarser) resolutions only if there are no hidden states from higher-order (finer) resolutions that could enable the interaction.
316:
317: Structure estimation for dynamic systems seeks to find $S$ corresponding to a particular realization of a dynamic system, $T$, using only input-output data. Before discussing how to solve this problem, however, we first outline two flawed approaches that appear from time to time in the literature. First, one may assume knowledge of the system order, $n$, and then attempt to infer information about structure in light of this assumption. Second, one may estimate a particular realization of $T$ and then attempt to reconstruct $S$ from the state space model. These approaches are not entirely unrelated, but we show next that either approach can easily lead to incorrect conclusions.
318:
319: \subsection{Example: Erroneous System Order Assumption}\label{se:example1}
320:
321: Although there are some reasonable techniques for estimating order from time-series data, there is no foolproof method available. In some applications, the most common technique for order estimation continues to be to assume that the measured outputs constitute the entire state vector, that is, that $n=p$. The following example demonstrates that making this assumption incorrectly may lead to completely erroneous structure estimates.
322:
323: Consider the network in Figure~\ref{fig:example1a} with three state
324: variables structured in a chain, with the single input $u$ driving
325: $x_1$, $x_1$ in feedback with $x_3$, and $x_3$ driving $x_2$,
326: characterized by the equations
327: \begin{equation}\label{eq:example1}
328: \left[\begin{array}{c}\dot{x}_1\\\dot{x}_2\\\dot{x}_3\end{array}\right] =
329: \left[\begin{array}{rrr}
330: -1&0&-5\\ 0&-4&1\\ 5&0&-1
331: \end{array}\right]\left[
332: \begin{array}{c}x_1\\ x_2\\x_3\end{array}\right] +
333: \left[\begin{array}{c}1\\0\\0\end{array}\right]u,
334: \end{equation}
335: $$
336: \left[\begin{array}{c}y_1\\ y_2\end{array}\right] = \left[\begin{array}{rrr}1&0&0\\
337: 0&1&0\end{array}\right]\left[\begin{array}{c}x_1\\ x_2\\
338: x_3\end{array}\right].$$
339: From $x_1$ and $x_2$, we would like to be able to infer the structure
340: $u\longrightarrow x_1 \longrightarrow x_2$, in spite of the fact that
341: we may have no knowledge of the not-directly-observed, yet
342: (indirectly) observable state $x_3$.
343:
344: \begin{figure}[h] % figure placement: here, top, bottom, or page
345: \centering
346: \subfigure[Network]{\label{fig:example1a}
347: \psfrag{u}[rt][rt]{$u$}
348: \psfrag{x1}[rt][rt]{$x_1$}
349: \psfrag{x2}[rt][rt]{$x_2$}
350: \psfrag{x3}[rt][rt]{$x_3$}
351: \includegraphics[width=3cm]{figures/example1.eps}}
352: \subfigure[Step response of $x_1$ and $x_2$]{\label{fig:example1b}
353: \includegraphics[width=5cm]{figures/example1a.eps}}
354: \caption{Example of a simple three-state network}\label{fig:example1}
355: \end{figure}
356:
357: By assuming knowledge of the system order, one may then attempt to fit a state space realization
358: directly from the data. In this case, any attempt to identify a second order system given the oscillating data shown above will result in an $A$ matrix with complex eigenvalues. This implies that any real-valued $A$ matrix that reasonably fits the data will have non-zero terms in its off-diagonal positions, leading incorrectly to a fully connected network structure estimate instead of the correct chain structure.
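
To see where the oscillation comes from, note (our own calculation from~(\ref{eq:example1})) that
\[
\det(sI-A) = (s+4)\left((s+1)^2+25\right),
\]
so the true system has eigenvalues $-4$ and $-1\pm 5j$, the complex pair being generated by the loop between $x_1$ and the hidden state $x_3$. A real second-order model with state matrix
\[
\hat A = \left[\begin{array}{cc}a&b\\c&d\end{array}\right]
\]
(a hypothetical fit; the symbols $\hat A$, $a$, $b$, $c$, $d$ are introduced here only for illustration) has eigenvalues $\frac{1}{2}\left(a+d \pm \sqrt{(a-d)^2+4bc}\,\right)$, which are complex only if $(a-d)^2+4bc<0$. This forces $bc<0$, so both off-diagonal entries are nonzero and the model necessarily contains edges in both directions between $y_1$ and $y_2$.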
359: \myBox
360:
361: \subsection{Example: Erroneous Structure from Realizations}\label{se:exam}
362:
363: Suppose that after a sequence of experiments, one was able to identify
364: the transfer function
365: \begin{equation}\label{eq:examp_G}
366: G(s) = \left[\begin{array}{c}\frac{1}{s+1}\\\frac{1}{(s+1)(s+2)}\end{array}\right].
367: \end{equation}
368: It can be shown that this transfer function is consistent with two
369: systems with very different structures, given by
370: $$A_1 = \left[\begin{array}{rrr}-1&0&0\\0&-2&1\\0&0&-1\end{array}\right]\ \
371: B_1=\left[\begin{array}{c}1\\0\\1\end{array}\right] \ \
372: C_1 = \left[\begin{array}{ccc}1&0&0\\0&1&0\end{array}\right]$$
373: and
374: $$A_2 = \left[\begin{array}{rr}-1&0\\1&-2\end{array}\right]\ \
375: B_2=\left[\begin{array}{c}1\\0\end{array}\right] \ \ C_2 = \left[\begin{array}{cc}1&0\\0&1\end{array}\right]$$
376: The networks in Figure~\ref{fig:ex21} correspond to each of the
377: possible realizations of $G$. Note that without more information
378: about the system, such as that it is minimal or that it has order three,
379: we cannot say anything about structure from the transfer
380: function alone.
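
This can be checked directly at the signal level (a short verification of our own). For the first realization,
\[
Y_1 = \frac{1}{s+1}U, \qquad X_3 = \frac{1}{s+1}U, \qquad Y_2 = \frac{1}{s+2}X_3 = \frac{1}{(s+1)(s+2)}U,
\]
while for the second,
\[
Y_1 = \frac{1}{s+1}U, \qquad Y_2 = \frac{1}{s+2}Y_1 = \frac{1}{(s+1)(s+2)}U .
\]
Both give~(\ref{eq:examp_G}), although in the first case $y_2$ is driven by $u$ through the hidden state $x_3$ and in the second it is driven by $y_1$.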
381:
382: \begin{figure}[h]
383: \centering \subfigure[Possible network 1.]{
384: \label{fig:ex21a}
385: \psfrag{u}[rt][rt]{$u$}
386: \psfrag{x1}[rt][rt]{$y_1$}
387: \psfrag{x2}[rt][rt]{$y_2$}
388: \psfrag{x3}[rt][rt]{$x_3$}
389: \includegraphics[width=3cm]{figures/ex21a.eps}}
390: \subfigure[Possible network 2.]{
391: \label{fig:ex21b}
392: \psfrag{u}[rt][rt]{$u$}
393: \psfrag{x1}[rt][rt]{$y_1$}
394: \psfrag{x2}[rt][rt]{$y_2$}
395: \includegraphics[width=3cm]{figures/ex21b.eps}}
396: \caption{Two possible networks given the data.}\label{fig:ex21}
397: \end{figure}
398:
399: These examples demonstrate the difficulty of estimating network
400: structure from data. Nevertheless, ideally one would estimate both
401: the network structure and the system dynamics from data. In the next
402: section, we introduce Dynamical Structure Functions as a mechanism for
403: representing both system dynamics and structure.
404:
405: %explore the degree to which partial
406: %state information disrupts network reconstruction and provide
407: %necessary and sufficient conditions for full recovery.
408:
409:
410: \section{Dynamical Structure}\label{se:net}
411:
412: Consider the system given by (\ref{eq:LTI}). Given the special structure on $C$, we note that the first $p$ state variables are actually the measured variables $y$. Defining $x_h$ to be the remaining $n-p$ ``hidden'' states, the system becomes
413: \begin{equation}\label{eq:LTIhiddenstates}
414: \left\{\begin{array}{lll}
415: \left[\begin{array}{c}\dot{y}\\\dot{x}_h \end{array}\right]& = & \left[\begin{array}{cc}A_{11}&A_{12}\\A_{21}&A_{22}\end{array}\right]\left[\begin{array}{c}y\\x_h\end{array}\right] +\left[\begin{array}{c} B_{1}\\B_2\end{array}\right]u \\
416: y & = & \left[\begin{array}{cc}I&0\end{array}\right]\left[\begin{array}{c}y\\x_h\end{array}\right]
417: \end{array}\right .
418: \end{equation}
419: Taking Laplace Transforms of the signals, we then obtain
420: \begin{equation}\label{eq:LTIlaplace}
421: \begin{array}{lll}
422: \left[\begin{array}{c}sY\\sX_h \end{array}\right]& = & \left[\begin{array}{cc}A_{11}&A_{12}\\A_{21}&A_{22}\end{array}\right]\left[\begin{array}{c}Y\\X_h\end{array}\right] +\left[\begin{array}{c} B_{1}\\B_2\end{array}\right]U
423: \end{array}
424: \end{equation}
425:
426: From this equation it is easy to construct the transfer functions from the manifest variables $z = [Y\;;\;U]$ to themselves. Solving for $X_h$, we have
427: $$X_h=\left ( sI - A_{22} \right )^{-1} A_{21} Y + \left ( sI -
428: A_{22} \right )^{-1} B_2 U$$
429: Substituting into~(\ref{eq:LTIlaplace}) then yields
430: $$sY = W Y + V U$$
431: %
432: where $W=A_{11} + A_{12}\left ( sI - A_{22} \right )^{-1} A_{21}$ and
433: $V=A_{12}\left ( sI - A_{22} \right )^{-1} B_2 +B_1$. Let $D$ be the
434: diagonal matrix formed from the diagonal entries of $W$, i.e., $D=\mbox{diag}(W_{11}, W_{22}, \ldots,
435: W_{pp})$. Then, subtracting $DY$ from both sides,
436: $$\left ( sI - D \right ) Y = \left ( W-D \right ) Y + V U$$
437: Note that $W-D$ is a matrix with zeros on its diagonal. We then have
438: \begin{equation}
439: \label{eq:PQ}
440: Y = QY + PU
441: \end{equation}
442: where
443: \begin{equation}\label{eq:Q}
444: Q = \left ( sI - D \right )^{-1} \left ( W-D \right )
445: \end{equation}
446: and
447: \begin{equation}\label{eq:P}
448: P=\left ( sI- D \right )^{-1} V
449: \end{equation}
450: The matrix $Q$ is a matrix of transfer functions whose entry $Q_{ij}$,
451: $i \neq j$, maps $Y_j$ to $Y_i$; it relates each measured signal to all {\it other}
452: measured signals (recall that $Q$ is zero on the diagonal). The full
453: transfer matrix from $Y$ to $Y$ thus becomes $Q_T = I+Q$. Likewise,
454: the transfer matrix from $U$ to $Y$ is $P$.
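
As an illustration (our own worked example, applying~(\ref{eq:Q}) and~(\ref{eq:P}) to the system~(\ref{eq:example1}) with $y=(x_1,x_2)$ and $x_h=x_3$), we have $A_{11}=\mbox{diag}(-1,-4)$, $A_{12}=[-5\ \ 1]^T$, $A_{21}=[5\ \ 0]$, $A_{22}=-1$, $B_1=[1\ \ 0]^T$ and $B_2=0$, so that
\[
W = \left[\begin{array}{cc} -1-\frac{25}{s+1} & 0\\ \frac{5}{s+1} & -4\end{array}\right], \qquad V = \left[\begin{array}{c}1\\0\end{array}\right],
\]
and therefore
\[
Q = \left[\begin{array}{cc} 0 & 0\\ \frac{5}{(s+1)(s+4)} & 0\end{array}\right], \qquad
P = \left[\begin{array}{c}\frac{s+1}{(s+1)^2+25}\\0\end{array}\right].
\]
The zero pattern of $Q$ and $P$ shows $u$ acting only on $y_1$ and $y_1$ acting on $y_2$, with no edge from $y_2$ back to $y_1$.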
455:
456: We can thus consider the transfer matrix, $N$, relating all manifest variables, $z$, to themselves. This matrix is reminiscent of the structure matrix $S$ given in (\ref{eq:structure}), except that the entries are transfer functions relating variables instead of binary values. This gives us the following definition.
457:
458: \begin{definition} Given the system (\ref{eq:LTI}), we define the {\it Dynamical Structure Function} or {\it Network}, $N$, of the system to be
459: \begin{equation}
460: \label{eq:fulldynamicstructure}
461: N = \left[\begin{array}{cc}I+Q&P\\0&I\end{array}\right],
462: \end{equation}
463: where $Q$ and $P$ are as given in~(\ref{eq:Q}) and~(\ref{eq:P}).
464: \end{definition}
465:
466: When this function is completely characterized by $P$ and $Q$ (when
467: the system is open, that is, $u$ represents completely free inputs
468: unrelated to the measurements $y$), we refer to $(P,Q)$ as the {\it
469: Dynamical Structure} of the system. There are a number of
470: properties of the Dynamical Structure Function that make it useful
471: for the structural analysis of linear systems:
472:
473: \begin{proposition} Given the original realization~(\ref{eq:LTI}),
474: every off-diagonal entry $N_{ij}$, $i \neq j$, is a strictly proper transfer function and is unique.
475: \end{proposition}
476: Strict properness follows from the fact that $(sI-D)^{-1}$ (which is
477: strictly proper) is multiplying transfer functions that are at most
478: proper (never {\it improper}). This fact is important for the
479: interpretation of $N$ as network {\it structure}. The directed edges
480: associated with non-zero entries of this matrix imply {\it causal}
481: relations; strict properness of the transfer functions preserves this
482: interpretation. Uniqueness follows by construction of both $Q$ and
483: $P$.
484:
485: \begin{proposition} The transfer function, $G$, of the system (\ref{eq:LTI}) is related to Dynamical Structure by
486: \begin{equation}
487: G = \left(I-Q\right)^{-1}P.
488: \end{equation}
489: This fact follows directly from~(\ref{eq:PQ}) and $Y=GU$ and
490: demonstrates that Dynamical Structure is a factorization of a transfer
491: function into two parts, the {\it output} or {\it internal} structure,
492: $Q$, and the {\it input} or {\it control} structure, $P$.
493: \end{proposition}
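
As a check of this factorization (again our own calculation), consider the two realizations of~(\ref{eq:examp_G}) given in Section~\ref{se:exam}; the superscripts below are introduced only to label them. For $(A_1,B_1,C_1)$, with $x_3$ hidden, the block coupling the measured states into $x_3$ is zero, which gives $W=\mbox{diag}(-1,-2)$ and $V=[1\ \ \frac{1}{s+1}]^T$, so
\[
Q^{(1)} = \left[\begin{array}{cc}0&0\\0&0\end{array}\right], \qquad
P^{(1)} = \left[\begin{array}{c}\frac{1}{s+1}\\ \frac{1}{(s+1)(s+2)}\end{array}\right].
\]
For $(A_2,B_2,C_2)$ there are no hidden states, so $W$ is the state matrix $A_2$ and $V$ is the input matrix of that realization, giving
\[
Q^{(2)} = \left[\begin{array}{cc}0&0\\ \frac{1}{s+2}&0\end{array}\right], \qquad
P^{(2)} = \left[\begin{array}{c}\frac{1}{s+1}\\ 0\end{array}\right].
\]
In both cases $(I-Q)^{-1}P$ equals $G$ in~(\ref{eq:examp_G}), yet the two pairs $(P,Q)$ differ.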
494:
495: %\todo{Wouldn't it be better to have a single theorem?}
496:
497: %In restoring the network structure of a certain system, we typically
498: %do not measure every single state in the system. Biologists are often
499: %very interested in discovering what affect what? What causes what? For
500: %instance, in a simple loop (Figure~\ref{fig:loop3a}) where $x_1$ is
501: %transformed into $x_2$ which then activates the production of $x_3$
502: %which in turn inhibits the production of $x_1$, all of the states
503: %affect each other. So, we could easily have the system in
504: %Figure~\ref{fig:loop3b} since all species affect each other. However,
505: %in reality, $x_2$ only affects $x_1$ through $x_3$ and not directly
506: %and this is important information that we want to infer from the data.
507: %Thus, in fact what we want to recover is the network in
508: %Figure~\ref{fig:loop3a}. The question is how? And especially when most
509: %states are not measured.
510:
511: %\begin{figure}[h]
512: % \centering
513: % \subfigure[Simple loop Dynamical Structure.]{
514: % \label{fig:loop3a}
515: % \psfrag{x1}[rt][rt]{$x_1$}
516: % \psfrag{x2}[rt][rt]{$x_2$}
517: % \psfrag{x3}[rt][rt]{$x_3$}
518: % \includegraphics[width=3cm]{figures/loop3a.eps}}
519: % \subfigure[Fully connected network.]{
520: % \label{fig:loop3b}
521: % \psfrag{x1}[rt][rt]{$x_1$}
522: % \psfrag{x2}[rt][rt]{$x_2$}
523: % \psfrag{x3}[rt][rt]{$x_3$}
524: % \includegraphics[width=3cm]{figures/loop3b.eps}}
525: %% \subfigure[Small network.]{
526: %% \label{fig:loop3c}
527: %% \psfrag{x1}[rt][rt]{$x_1$}
528: %% \psfrag{x2}[rt][rt]{$x_2$}
529: %% \psfrag{x3}[rt][rt]{$x_3$}
530: %% \includegraphics[width=3cm]{figures/loop3c.eps}}
531: %% \subfigure[Reduced network with measured species only.]{
532: %% \label{fig:loop3d}
533: %% \psfrag{x1}[rt][rt]{$x_1$}
534: %% \psfrag{x2}[rt][rt]{$x_2$}
535: %% \psfrag{x3}[rt][rt]{$x_3$}
536: %% \includegraphics[width=3cm]{figures/loop3d.eps}}
537: %\caption{Example of simple 3-species loop networks.}\label{fig:loop3}
538: %\end{figure}
539: %
540: %This question will be answered in the next section. First, here we
541: %must set up mathematically the Dynamical Structure. The above example
542: %is simple in the sense that all states are measured. What if some
543: %states are not? For example, consider again the example in
544: %Figure~\ref{fig:loop3a} but now there is no connection from $x_1$ to
545: %$x_2$, i.e. $x_2 \longrightarrow x_3 \longrightarrow x_1$. If $x_3$
546: %was not measured, we would like to obtain a smaller network based on
547: %the measured species. In this case, the internal Dynamical Structure
548: %consists of a single connection from $x_2$ to $x_1$, i.e. $x_2
549: %\longrightarrow x_1$, even if there is no real direct connection from
550: %$x_2$ to $x_1$. In this case, $x_1$ is a function of $x_3$ which in
551: %turn is a function of $x_2$. Because we do not measure $x_3$, we
552: %simply write $x_1=f(x_3) = f(x_3(x_2)=f(x_2)$. Because this are
553: %dynamic linear time-invariant systems, it is best to represent these
554: %relations as transfer functions.
555:
556: %For example, he relationship between state $x_1$ and other measurable
557: %states can be represented
558: %\begin{eqnarray*}
559: %X_1(s) & = & Q_{12}(s)X_2(s) + \cdots + Q_{1p}(s)X_p(s) \\
560: %& & \ \ \ \ \ \ \ + P_{11}(s)U_1(s) + \cdots + P_{1m}(s)U_1(s)
561: %\end{eqnarray*}
562: %where $y=(x_1, ..., x_p)$ are the measurable states, $u=(u_1, ...,
563: %u_m)$ are the inputs to the system and $Q_{ij}$ and $P_{ij}$ represent
564: %the direct effects in $x_1$ from the other measurable states $x_j$
565: %and inputs $u_j$, respectively. By direct effects we mean that
566: %$Q_{ij}$ and $P_{ij}$ are the transfer functions from $x_j$ to $x_i$
567: %and from $u_j$ to $x_i$, respectably, that not contain any mode from
568: %other measurable states. Note that if it would, then it would not be
569: %a direct effect since it would have come first from another
570: %measurable state.
571:
572: %In general, we can represent the Dynamical Structure as
573: %$$Y= Q Y +P U$$
574: %where $Q$ has zeros along its diagonal
575: %and $Q_{ij}$ on the off diagonal. Alternatively, we can write
576: %\begin{equation}\label{eq:qxpu}
577: %(I-Q)Y=P U
578: %\end{equation}
579:
580: %\begin{theorem}\label{the:dns}
581: % Given the original realisation~(\ref{eq:LTI}), the Dynamical
582: % Structure $Q$ and $P$ is unique and strictly proper.
583: %\end{theorem}
584:
585: %\proof To prove this result, we must construct $Q$ and $P$ from the
586: %original realisation~(\ref{eq:LTI}). First, structure the matrices
587: %$A$, $B$ and $x$ as
588: %$$A=\left[\begin{array}{cc}A_{11}&A_{12}\\
589: % A_{21}&A_{22}\end{array}\right] , \
590: % B=\left[\begin{array}{c}B_{1} \\ B_{2}\end{array}\right] \ \
591: % \mbox{and} \ \ x=\left[\begin{array}{c}y \\
592: % x_{h}\end{array}\right]$$
593: %Taking the Laplace transforms in~(\ref{eq:LTI}) and solving for $X_h$
594: %yields
595: %$$X_h=\left ( sI - A_{22} \right )^{-1} A_{21} Y + \left ( sI -
596: % A_{22} \right )^{-1} B_2 U$$
597: %Then, solving for $Y$ in~(\ref{eq:LTI}) gives
598: %$$Y = W Y + V U$$
599: %
600: %where $W=A_{11} + A_{12}\left ( sI - A_{22} \right )^{-1} A_{21}$ and
601: %$V=A_{12}\left ( sI - A_{22} \right )^{-1} B_2 +B_1$. Let $D$ be a
602: %matrix with the diagonal term of $W$, i.e. $D=\mbox{diag}(W_{11}, W_{22}, ...,
603: %W_{pp})$. Then,
604: %$$\left ( sI - D \right ) Y = \left ( W-D \right ) Y + V U$$
605: %Note that $W-D$ is a matrix with zeros on its diagonal. Equivalently
606: %$$Y = \left ( sI - D \right )^{-1} \left ( W-D \right ) Y + \left ( sI
607: %- D \right )^{-1} V U$$
608: %Finally,
609: %$$Q = \left ( sI - D \right )^{-1}\left ( A_{11} + A_{12}\left ( sI -
610: %A_{22} \right )^{-1} A_{21} -D \right ) $$
611: %and
612: %$$P = \left ( sI - D \right )^{-1} \left ( A_{12}\left ( sI - A_{22}
613: % \right )^{-1} B_2 +B_1 \right )$$
614: %Thus, by construction, both $Q$ an d $P$ are unique. Strictly
615: %properness follows from the fact that $(sI-D)^{-1}$ (which is striclty
616: %proper) is multiplying transfer functions that are in the worst
617: %scenario proper. \myBox
618:
619:
It is now easy to see that $N_{ij}=0$ if and only if there is no direct
{\it or hidden} connection from $z_j$ to $z_i$. The question is then
how to determine the $p^2-p$ and $pm$ transfer functions in $Q$ and
$P$, respectively, and hence the Dynamical Structure, from data.
This structure estimation, or reconstruction, problem is addressed
next.
626:
627: \section{Dynamical Structure Reconstruction}\label{se:main}
628:
Assume data are collected from the original system~(\ref{eq:LTI}),
leading to the transfer function in~(\ref{eq:G}), so that $Y = GU$.
Here we assume without loss of generality that $G$ is full rank.
Otherwise, there would be redundant inputs that could be removed to
get a full rank $G$. Replacing $Y = GU$ in equation~(\ref{eq:PQ}) and
noting that the vector $U$ is arbitrary yields
635: \begin{equation}\label{eq:main}
636: (I-Q)G=P
637: \end{equation}
638:
This equation shows that there are more unknowns than equations and
that, in general, the Dynamical Structure of the $p$ measurable states
cannot be obtained from the $m$ inputs. There are $p^2-p$ unknowns in
$Q$, corresponding to all of the $Q_{ij}$, which represent the
internal Dynamical Structure. Then there are $pm$ unknowns in $P$,
which represent the control Dynamical Structure on each measurable
state. Thus, altogether there are $p^2-p+pm$ unknowns but
only $pm$ equations, so the problem is underdetermined with
$p^2-p$ degrees of freedom. For instance, setting all
$Q_{ij}=0$ (which means no connection between measured states) and
$P=G$ is a solution of~(\ref{eq:main}), but probably the wrong one.
650:
This clearly shows that the Dynamical Structure has {\em more}
information than $G$ and {\em less} than the original
system~(\ref{eq:LTI}). Thus, to find the Dynamical Structure from $G$
we need {\em more} information, either about the internal Dynamical
Structure (if we know some $Q_{ij}=0$) or about how the control
affects the measurable states (if some $P_{ij}=0$). Next we assume we
have no information on the internal Dynamical Structure (i.e. no
information on $Q$) and consider the cases where $m<p$ (there are
fewer inputs than measured states), $m=p$ and $m>p$. Before that, we
need the following technical result.
661:
662: \begin{lemma}\label{lemma:rank}
663: Rank$(P) =$ rank$(G)$.
664: \end{lemma}
665:
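\proof Since $(I-Q)G=P$, it suffices to show that rank$(I-Q)=p$.
Substituting the expression for $Q$ and noting that left
multiplication by the nonsingular matrix $sI-D$ preserves rank, it
follows that rank$(I-Q)$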
668: $$= \ \ \mbox{rank} \left \{ I- \left ( sI - D \right )^{-1}\left (
669: A_{11} + A_{12}\left ( sI - A_{22} \right )^{-1} A_{21} -D \right )
670: \right \}$$
671: $$ \hspace{-8mm} = \ \ \mbox{rank} \left \{ sI - D - \left ( A_{11} +
672: A_{12}\left ( sI - A_{22} \right )^{-1} A_{21} -D \right )\right \} $$
673: $$ \hspace{-20.5mm} = \ \ \mbox{rank} \left \{ sI - \left ( A_{11} +
674: A_{12}\left ( sI - A_{22} \right )^{-1} A_{21} \right ) \right \}$$
675: which has rank $= p$. \myBox
676:
\subsection{$m<p$: Fewer Inputs than Measured States}

If $m<p$, i.e. there are fewer inputs than measured states, and we have
no information on the internal Dynamical Structure, then the Dynamical
Structure cannot be recovered. To see this, note that in the best
case scenario $m=p-1$ and we would have $mp=p^2-p$ equations. Since
there are $p^2-p$ unknowns from $Q$, we would need to know $P$
precisely.
685:
The example from section~\ref{se:exam} shows how different networks
satisfy~(\ref{eq:main}) if $m<p$. There we had two measurable states
($p=2$), a single input ($m=1$) and $G=(G_{11},G_{21})$ given
by~(\ref{eq:examp_G}). In this case, equation~(\ref{eq:main}) has
two equations and four unknowns
691: \begin{equation}\label{eq:ex_21}
692: \left\{\begin{array}{lll}
693: G_{11} - Q_{12} G_{21} & = & P_{11} \\
694: G_{21} - Q_{21} G_{11} & = & P_{21}
695: \end{array}\right .
696: \end{equation}
We must solve for the internal Dynamical Structure ($Q_{12}$ and
$Q_{21}$) and the control Dynamical Structure ($P_{11}$ and $P_{21}$).
Since there are only two equations, there are two degrees of freedom.
A possible solution is to set $Q_{12} = Q_{21} = 0$, i.e. no internal
connection between $y_1$ and $y_2$. In that case, $P_{11} =G_{11}$
and $P_{21} =G_{21}$ (Figure~\ref{fig:ex21a}). Note that $x_3$ is
playing the role of a hidden state (as $P_{21}$ is second-order) and
the system is not controllable (due to the common pole at $-1$), which
explains why $G$ is second-order although there are three states. An
alternative is to have $P_{21}=0$ (which fixes $Q_{21}=G_{21}/G_{11}$)
and $Q_{12}=0$ (which fixes $P_{11}=G_{11}$), which can be seen in
Figure~\ref{fig:ex21b}. Note that in this case we cannot choose
$P_{11}=0$ instead, as that would result in a non-proper $Q_{12}$.
These two networks are different and obviously only one can be correct.
711:
712:
\subsection{$m=p$: Same Number of Inputs as Measured States}
714:
715: \begin{theorem}\label{the:main}
  If $m=p$ and we have no information on the internal Dynamical
  Structure, then the Dynamical Structure can be reconstructed if and
  only if each input controls a measured state independently,
  i.e. $P_{ij}=0$ for $i\not = j$. In this case, the zeros of the
  off-diagonal entries of $H=G^{-1}$ define the internal Dynamical
  Structure and
722: $$Q_{ij} = -\frac{H_{ij}}{H_{ii}} \ \ \mbox{and} \ \ P_{ii} = \frac
723: 1 {H_{ii}}$$
724: \end{theorem}
725:
726:
\proof The ``if'' part of the proof follows from the fact that there
are $p^2 -p +p=p^2$ unknowns and $p^2$ equations, so a linear set of
equations can be solved for $Q$ and $P$. Multiplying $(I-Q)G=P$ on
the right by $H=G^{-1}$ yields $I-Q = PH$. Since $Q$ has zeros on its
diagonal and $P$ is diagonal, the diagonal entries give
$1=P_{ii}H_{ii}$, or $P_{ii}=1/H_{ii}$. The off-diagonal entries of
$Q=I-PH$ then give $Q_{ij}=-P_{ii}H_{ij}=-H_{ij}/H_{ii}$, and the
result follows.
734:
735: For the ``only if'', assume the Dynamical Structure can be
736: reconstructed, i.e. we can solve for $Q$ and $P$ in~(\ref{eq:main})
737: uniquely and they are all strictly proper. Since rank$(G)=p$, by
738: lemma~\ref{lemma:rank} rank$(P)=p$. Thus, there are at least $p$
739: nonzero entries in $P$.
740:
To show that there are at most $p$ unknowns in $P$, assume there
are additional unknowns in $P$, and there is a Dynamical Structure
with strictly proper $Q^*$ and $P^*$ that satisfy~(\ref{eq:main}). We
want to show that another Dynamical Structure with strictly proper
$Q\not = Q^*$ and $P\not = P^*$ can be constructed. Consider a vector
$X$ stacked with all the unknown parameters, i.e. with all unknown
$Q_{ij}$ and $P_{ij}$. Equation~(\ref{eq:main}) can then be written
as ${\cal A} X ={\cal B}$, where both ${\cal A}$ and ${\cal B}$ are
functions of the elements of $G$. Because there are $p^2$ equations
but more than $p^2$ unknowns (the extra ones being elements of $P$),
this system of equations is underdetermined and ${\cal A}$ has a
nontrivial null space. Let $\bar X \not = 0$ be
an element of the null space of ${\cal A}$ and let $X^*$ contain the
elements of $Q^*$ and $P^*$, which satisfy ${\cal A} X^* ={\cal B}$.
Then, there exists a large enough positive integer $n$ such that
$$X= X^* + \bar X \frac 1 {(s+1)^n}$$
is also a solution of ${\cal A}
X ={\cal B}$ and all the elements in $X$ are strictly proper. We have
then found another Dynamical Structure, which contradicts the
assumption. Thus, there are at most $p$ unknowns in $P$.
760:
761: Finally, there must then be exactly $p$ unknown and nonzero entries in
762: $P$. Since $P$ is full rank, each row and column must have exactly one
763: of these entries. Without loss of generality the inputs can be
764: renamed and reordered so that the diagonal of $P$ contains the unknown
765: and nonzero entries. \myBox
766:
767:
This theorem says that, in addition to having a square and full rank
$G$, it is necessary and sufficient to know that each control $i$
first affects state $i$ before it affects any other measurable state
in order to reconstruct the Dynamical Structure. This reduces the
number of unknowns to $p^2-p+p=p^2$, which can now be solved for.
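
As an illustration, consider a sketch of the case $p=m=2$. The symbol
$\Delta = G_{11}G_{22}-G_{12}G_{21}$ is introduced here only for this
example, and we assume $\Delta$, $G_{11}$ and $G_{22}$ are not
identically zero. Then
$$H = G^{-1} = \frac 1 \Delta \left[\begin{array}{cc} G_{22} & -G_{12} \\
    -G_{21} & G_{11}\end{array}\right]$$
and the formulas in theorem~\ref{the:main} give
$$Q_{12} = \frac{G_{12}}{G_{22}}, \ \ Q_{21} = \frac{G_{21}}{G_{11}}, \ \
P_{11} = \frac{\Delta}{G_{22}} \ \ \mbox{and} \ \ P_{22} =
\frac{\Delta}{G_{11}}$$
Substituting back, the first row of $(I-Q)G$ is $\left ( G_{11} -
  G_{12}G_{21}/G_{22}, \ G_{12}-G_{12} \right ) = \left ( P_{11}, \ 0
\right )$, as required for a diagonal $P$.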
773:
However, if there is some {\em a priori} information about the internal
Dynamical Structure (such as some of the $Q_{ij}=0$) then there is
more flexibility and fewer constraints are required on
$P_{ij}$. As long as there are a total of $p^2$ unknown (nonzero)
elements between $Q_{ij}$ and $P_{ij}$ then the Dynamical Structure
can be reconstructed by solving the linear system of
equations~(\ref{eq:main}).
781:
If $P$ is not diagonal and we have additional information on how the
inputs affect the measured states, there may be a change of basis in
the control vector that converts $P$ to a diagonal matrix, so that
theorem~\ref{the:main} can be applied. For example, if
$x_1$ is controlled by $u_1+u_2$ and $x_2$ by $u_1-u_2$ then one could
define two new inputs $v_1=u_1+u_2$ and $v_2=u_1-u_2$.
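
A sketch of this change of basis follows, writing $T$ for the mixing
matrix (a symbol introduced here for illustration and assumed known):
$$V = TU, \ \ T=\left[\begin{array}{cc}1&1\\1&-1\end{array}\right] \ \
\mbox{and} \ \ T^{-1} = \frac 1 2
\left[\begin{array}{cc}1&1\\1&-1\end{array}\right]$$
Since $Y=GU=GT^{-1}V$, multiplying $(I-Q)G=P$ on the right by $T^{-1}$
gives
$$(I-Q)\tilde G = \tilde P, \ \ \mbox{with} \ \ \tilde G = GT^{-1} \ \
\mbox{and} \ \ \tilde P = PT^{-1}$$
Under the stated assumption, the rows of $P$ are multiples of the rows
of $T$, so $\tilde P$ is diagonal and theorem~\ref{the:main} can be
applied to $\tilde G$. The internal Dynamical Structure $Q$ is
unchanged by this transformation.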
788:
789: If all the states are measured and $B=I$, we have the following
790: result.
791:
792: \begin{corollary}\label{cor:full}
  If $p=m=n$ and $B=C=I$ then for $i\not = j$, $H_{ij}= -a_{ij}$. Thus,
  $a_{ij}= 0$ ($i\not = j$) iff $H_{ij}=0$.
795: \end{corollary}
796:
\proof The proof follows since in this case $G(s)= \left ( sI -A
\right )^{-1}$, which means $H(s)=sI-A$. \myBox
799:
Note that if we knew we were measuring all the states, we would not need
to know $B$. However, the information that we measure all the
states is not captured by~(\ref{eq:main}), unless we impose that
there are only $n$ modes available to construct $Q$ and $P$.
804:
805: %\todo{Should we keep this corollary??}
806:
807: \subsection{$m>p$: More Inputs than Measured States}
808:
It may seem intuitive that if there are more inputs then there should
be more information. However, the extra inputs are in fact
redundant. The reason is that although $G$ is $p \times m$,
rank$(G)=p$, which means that the inputs really only have $p$
degrees of freedom. Thus, the problem reduces to having the same
number of inputs as measured states. The difference here is that we
may be able to choose, from the $m$ inputs, $p$ that are known to
directly control each measurable state.
817:
818:
819: \subsection{The Danger of Steady-State Measurements}
820:
Before ending this section, we want to clarify some misconceptions,
especially in communities less familiar with the mathematics, about
steady-state identification versus time-series data. For instance,
in~\cite{gardner03} the authors proposed a method to estimate networks
based on full state measurement and control. From
corollary~\ref{cor:full}, the Dynamical Structure can be obtained from
the zeros of the non-diagonal terms of $H$, which correspond directly
to the entries of $A$. However, in the realistic case there are fewer
measurements and controls available than states. If instead of
estimating $G$ from time-series data we were to use only steady-state
data, this could lead to mistakes, as the following example shows.
832:
833: Consider a third order system with measurements and control on the
834: first 2 states $x_1$ and $x_2$ and the following dynamics
835: $$A=\left[\begin{array}{ccc} -1 & 1 & -1 \\ 1 & -1 & -1 \\ 1 & 1 & -1
836: \end{array}\right]$$
This is a fully connected network, and so we expect the reduced
network consisting of $x_1$ and $x_2$ to be fully connected as
well. In this case,
840: $$H(s)=\left[\begin{array}{cc} \frac{s^2+2s+2}{s+1} & -\frac{s}{s+1} \\
841: -\frac{s}{s+1} & \frac{s^2+2s+2}{s+1}
842: \end{array}\right]$$
843: When $s \rightarrow 0$,
844: $$H(0)=\left[\begin{array}{cc} 2 & 0 \\
845: 0 & 2
846: \end{array}\right]$$
which could lead one to think the reduced order network is not
connected at all, i.e. $x_1$ does not affect $x_2$ and vice versa. In
general, for third order systems of this form, with $a_{ij}$ denoting
the entries of $A$ and $a_{33}\not = 0$, we have $H_{12}(0)=0$ if and
only if $a_{12}a_{33}-a_{13}a_{32}=0$ (hiding the connection from
$x_2$ to $x_1$) and $H_{21}(0)=0$ if and only if
$a_{21}a_{33}-a_{23}a_{31}=0$ (hiding the connection from $x_1$ to
$x_2$). Note that even when these quantities are not exactly zero but
near zero, the presence of noise may again lead to wrong decisions.
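
As a check on the example above, note that each input enters one
measured state directly, so $P$ is diagonal and
theorem~\ref{the:main} applies. Using the full $H(s)$ rather than
$H(0)$ gives
$$Q_{12}(s)=Q_{21}(s) = -\frac{H_{12}(s)}{H_{11}(s)} =
\frac{s}{s^2+2s+2} \ \ \mbox{and} \ \ P_{11}(s)=P_{22}(s) = \frac 1
{H_{11}(s)} = \frac{s+1}{s^2+2s+2}$$
Both $Q_{12}$ and $Q_{21}$ are nonzero and strictly proper, so the
reduced network is fully connected. Only their steady-state values
vanish, $Q_{12}(0)=Q_{21}(0)=0$, which is why steady-state data alone
miss the connections.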
854:
855:
856: %\todo{Still do not have necessary and sufficient results on when a steady-state approach is enough... maybe for the next paper, right?}
857:
858:
859:
860:
861: \section{Conclusions}\label{se:conc}
862:
This paper discussed the role of network structure for LTI systems. In particular, it was shown that transfer functions alone contain no information about the structure of an LTI system. We then introduced a new representation for such systems, a factorization of the system's transfer function that we call Dynamical Structure. Dynamical Structure Functions contain more information about the system than the transfer function because they also describe the network structure between inputs and outputs. Nevertheless, Dynamical Structure contains less information about the system than its state-space description because no attempt is made to realize the network structure relating the non-measured, hidden state variables to the rest of the system. In this way, Dynamical Structure is a convenient analysis tool somewhere in between a system's full state-space realization and its transfer function.
864:
865: We then used Dynamical Structure to explore the network reconstruction problem. In this problem, one would like to estimate network structure given only input-output data. This problem is extremely important for a variety of fields, such as biology or counter-terrorism, that attempt to draw structural conclusions from data. Necessary and sufficient conditions were presented that indicate that network reconstruction demands careful experiment design. Moreover, various examples were provided throughout the paper that demonstrate how failure to respect the necessary conditions may lead to incorrect conclusions about the network structure.
866:
867: %uture work will consider other limitations such as finite samples of
868: %y$ and noise.
869:
870: %\todo{Should we discuss more on noise? Adding MCMC and other techniques...}
871:
872:
\section{Acknowledgements} The authors would like to thank Glenn
Vinnicombe for his valuable input and comments, especially regarding
the notions of uniqueness and properness of $Q$ and $P$.
876:
877:
878: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
879: \bibliography{mybib}
880:
881:
882: \end{document}
883: %%
884: % In order to
885: % remove footnote and page number, insert
886: % \thispagestyle{empty}\pagestyle{empty}
887: % right after the \maketitle command!!
888: %%%
889: % You may want to adjust the position of the
890: % text on the page (for your specific
891: % printer) with the commands
892: % \addtolength{\oddsidemargin}{Xmm} % right Xmm
893: % \addtolength{\topmargin}{Ymm} % down Ymm
894: % "
895: %%% }
896:
897:
904:
905: %\addtolength{\textheight}{-1.4cm} % This command serves to balance the column lengths
906: % on the last page of the document manually. It shortens
907: % the textheight of the last page by a suitable amount.
908: % This command does not take effect until the next page
909: % so it should come on the page before the last. Make
910: % sure that you do not shorten the textheight too much.
911:
912:
913:
914: %\vspace{-0.2cm}
915:
916: \section{Extra Stuff}
917:
For simplicity, we will prove the result for $x_1$; the other
states follow in a similar way. Let $x_2,u_2 \in \reals^{m-1}$ denote
the remainder of the states and inputs of the system,
respectively. Write
922: $$G=\left[\begin{array}{cc}g_{11}&G_{12}\\
923: G_{21}&G_{22}\end{array}\right] \ \ \mbox{and} \ \
924: H=\left[\begin{array}{cc}h_{11}&H_{12}\\
925: H_{21}&H_{22}\end{array}\right]$$
926: From $x=Gu$ and $u=Hx$ we have
927: $$\left\{\begin{array}{ccl}
928: x_1 & = & g_{11} u_1 + G_{12}u_2 \\
929: u_2 & = & H_{21}x_1 + H_{22} x_2 \end{array}\right .$$
930: Replacing the second equation in the first gives
931: $$x_1 = g_{11} u_1 + G_{12} H_{21}x_1 + G_{12} H_{22} x_2$$
932: or
$$\left ( 1 - G_{12} H_{21} \right ) x_1 = g_{11} u_1 + G_{12} H_{22}x_2$$
934: Since $GH=I$, it follows that $g_{11}h_{11}+G_{12}H_{21} =1$ and
935: $g_{11}H_{12} + G_{12}H_{22} =0$. Thus,
936: $$g_{11} h_{11} x_1 = g_{11} u_1 - g_{11} H_{12}x_2$$
937: or
938: $$x_1 = \frac 1 {h_{11}} u_1 - \frac{H_{12}}{h_{11}}x_2$$
Thus, each
``direct'' connection from some $x_i$ ($i\not = 1$) to $x_1$ is zero
iff the corresponding entry $H_{1i}$ is zero.
942:
943:
944: \myBox
945:
946:
947:
948: \section{Full State Measurement}
949:
950: We begin our analysis by considering the system (\ref{eq:LTI}), except where $C = I^{n\times n}$, yielding $y(t)=x(t)\in R^n$. Nevertheless, we assume the available data is limited in that we only have access to $N$ samples of $y(t)$. In the sequel we show that the structure of the system can be recovered precisely when $N\geq n+m+1$ and the usual sufficiency of excitation requirements are met. This fact is not surprising since access to full state measurement restricts the basis of the state space and uniquely defines a minimal system consistent with the data, provided enough data is available to pose the estimation problem well.
951:
952: \subsection{Discrete Time Systems}
953:
954: Interpreting the system (\ref{eq:LTI}) as a discrete time system, we note that the reconstruction is trivial. Clearly, for each state variable $x_i$, we can set up the following systems of equations
955: \[
956: \left[\begin{array}{c}y_i(1)\\y_i(2)\\\vdots\\y_i(N)\end{array}\right] = \left[\begin{array}{cccc|ccc}y_1(0)&y_2(0)&\dots&y_n(0)&u_1(0)&\dots&u_m(0)\\ y_1(1)&y_2(1)&\dots&y_n(1)&u_1(1)&\dots&u_m(1)\\\vdots&&\ddots&&\vdots&&\vdots\\y_1(N-1)&y_2(N-1)&\dots&y_n(N-1)&u_1(N-1)&\dots&u_m(N-1)\end{array}\right]\left[\begin{array}{c}a_{i1}\\a_{i2}\\\vdots\\a_{in}\\ \hline b_{i1}\\\vdots\\b_{im}\end{array}\right].
957: \]
958: Clearly a necessary condition for each of these systems of equations to yield a unique set of parameters
959: is that $N\geq n+m+1$. This condition becomes sufficient when the columns of the $\left[y_1\;\dots \;y_n\;\vline\;u_1\;\dots\;u_m\right]$ matrix are linearly independent, characterizing a sufficiency of excitation.
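
As a sketch of how these systems might be solved, write $\Phi$ for the
$N\times(n+m)$ data matrix above, $Y_i$ for the left-hand side and
$\theta_i$ for the stacked parameters $(a_{i1},\dots,a_{in},b_{i1},\dots,b_{im})$.
These names are introduced here for illustration only. Assuming the
columns of $\Phi$ are linearly independent, one possible estimate is
the least-squares solution
\[
\hat\theta_i = \left(\Phi^T\Phi\right)^{-1}\Phi^T Y_i,
\]
which recovers $\theta_i$ exactly when the data are noise-free and
satisfy $Y_i=\Phi\theta_i$. The network structure would then be read
from the zero pattern of the estimated $a_{ij}$.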
960:
961: \subsection{Continuous Time Systems}
962:
963: More technicalities, but we should find an elegant way to show that the same results hold as for discrete time.
964:
965:
966: \section{Partial State Measurement}
967:
While network reconstruction is relatively straightforward in the full state measurement case, partial state measurements significantly change the situation. When only part of the state is measured, the behavior of the hidden variables (and their associated unknown initial conditions) can drastically affect the behavior of the measured states. For example, consider the system
969: \[
970: \left[\begin{array}{c}\dot{x}_1\\\dot{x}_2\end{array}\right]=\left[\begin{array}{cc}A_{11}&A_{12}\\A_{21}&A_{22}\end{array}\right]\left[\begin{array}{c}x_1\\x_2\end{array}\right]+\left[\begin{array}{c}B_1\\B_2\end{array}\right]u
971: \]
972: \[
973: y = \left[\begin{array}{cc}I&0\end{array}\right]\left[\begin{array}{c}x_1\\x_2\end{array}\right]
974: \]
975: where $x_1\in R^{n_1}$ are the measured states and $x_2\in R^{n_2}$ are the hidden states. Clearly this evolution of the measured states can be thought of as a full state system being affected by a disturbance, or process noise, $w$
976: \begin{equation}
977: \label{eq:distsystem}
978: \dot{x}_1 = A_{11}x_1+B_1u+A_{12}w,
979: \end{equation}
980: \[
981: y = x_1,
982: \]
983: where $w=x_2$.
984:
985:
986: \section{Systems with Measurement Noise}
987:
988: To the extent that a noise signal can be approximated by a high order linear system, we note from (\ref{eq:distsystem}) above that a system affected by process noise is equivalent to a partially observed system with a large number of hidden states. We now consider the case where measurements themselves are impacted by a disturbance:
989: \begin{equation}
990: \label{eq:noisesystem}
991: \left[\begin{array}{c}\dot{x}_1\\\dot{x}_2\end{array}\right]=\left[\begin{array}{cc}A_{11}&A_{12}\\A_{21}&A_{22}\end{array}\right]\left[\begin{array}{c}x_1\\x_2\end{array}\right]+\left[\begin{array}{c}B_1\\B_2\end{array}\right]u,
992: \end{equation}
993: \[
994: y = \left[\begin{array}{cc}I&0\end{array}\right]\left[\begin{array}{c}x_1\\x_2\end{array}\right]+\eta.
995: \]
996:
997: This situation is drastic for structure estimation since the measurements no longer reflect at least a partial basis of the state variables. In particular, if we model the disturbance, $\eta$, as a high order linear system given by
998: \[
999: \dot{x}_3 = A_{31}x_1+A_{32}x_2+A_{33}x_3+B_3u,
1000: \]
1001: \[
1002: \eta = C_3x_3,
1003: \]
1004: then we can augment the system (\ref{eq:noisesystem}) as follows
1005: \begin{equation}
1006: \left[\begin{array}{c}\dot{x}_1\\\dot{x}_2\\\dot{x}_3\end{array}\right]=\left[\begin{array}{ccc}A_{11}&A_{12}&0\\A_{21}&A_{22}&0\\A_{31}&A_{32}&A_{33}\end{array}\right]\left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right]+\left[\begin{array}{c}B_1\\B_2\\B_3\end{array}\right]u,
1007: \end{equation}
1008: \[
1009: y = \left[\begin{array}{ccc}I&0&C_3\end{array}\right]\left[\begin{array}{c}x_1\\x_2\\x_3\end{array}\right].
1010: \]
1011:
1012: \subsection{Idea for Proofs}
1013:
For Partial State Measurements, whether noisy or not, I think the kind
of proof we can develop is something that says: if the impact of the
hidden states is smaller than $\gamma$, then the network can be
reconstructed, otherwise it cannot. Although such a result may not be
terribly useful, in that I'm not sure if we can ever tell from our
measurements whether the condition holds, I nevertheless think such a
condition can be shown to be necessary and sufficient.
1021:
1022:
1023:
1024: %\todo{the part from here on was from yesterday and will probably be deleted.}
1025:
1026:
1027: \subsection{Full state feedback}
1028:
This is not an interesting problem since we know $B$ and $C$ and
therefore $A$ can be reconstructed. However, it is a good start to gain
some intuition.
1032:
1033: \subsubsection{3rd order}
1034:
1035: \begin{proposition}
  Consider the above system with $n=3$ and with full state feedback.
  The zeros of the off-diagonal entries of $H(s)$ define the network
  structure. In this case, this is equivalent to $a_{ij} =0$ if and
  only if $H_{ij}=0$, for all $i \not = j$.
1040: \end{proposition}
1041:
1042: \proof Write
1043: $$G=\left[\begin{array}{ccc}G_{11}&G_{12}&G_{13}\\
1044: G_{21}&G_{22}&G_{23}\\ G_{31}&G_{32}&G_{33}\end{array}\right]$$
Next is the proof for $i=1,j=2$, but the others are proven in a
similar way. Note that $H_{12} = (G_{13}G_{32}-G_{12}G_{33}) /p$,
where $p$ is the characteristic polynomial and
$G_{12}=a_{12}(s+a_{33}) + a_{13}a_{32}$, $G_{13}=a_{13}(s+a_{22}) +
a_{12}a_{23}$, $G_{32}=a_{32}(s+a_{11}) + a_{12}a_{31}$,
$G_{33}=(s+a_{11})(s+a_{22}) - a_{12}a_{21}$.
1051:
Thus, we want to prove that $a_{12} = 0$ iff $G_{13}G_{32} =
G_{12}G_{33}$. If $a_{12} = 0$ then the result follows. If
$G_{13}G_{32} = G_{12}G_{33}$ then the left hand side is a polynomial
of degree at most 2, while the right hand side has a term in $s^3$,
which is $a_{12} s^3$. Thus, $a_{12}=0$. \myBox
1056:
1057: There are two possible interpretations of the above result.
1058: \begin{itemize}
1059: \item
1060: \end{itemize}
1061:
1062:
1063:
1064:
1065:
1066: