1: \documentclass{elsart}
2: \usepackage{epsfig}
3: \usepackage{amsmath}
4: \usepackage{color}
5: \usepackage{amssymb}
6: \usepackage{graphicx}
7: \usepackage{subfigure}
8: %\usepackage{hyperref}
9: 
10: \newcommand \be{\begin{equation}}
11: \newcommand \ba{\begin{eqnarray}}
12: \newcommand \ee{\end{equation}}
13: \newcommand \ea{\end{eqnarray}}
14: \bibliographystyle{elsart-num}
15: %\bibliographystyle{elsart-harv}
16: 
17: \begin{document}
18: \runauthor{Zhou and Sornette} \markboth{A}{B}
19: \begin{frontmatter}
20: \title{Lead-lag cross-sectional structure and
21: detection of correlated-anticorrelated regime shifts: application to the
22: volatilities of inflation and economic growth rates}
23: \author[ecust,nice]{\small{Wei-Xing Zhou}},
24: \author[nice,ETH]{\small{Didier Sornette}\thanksref{EM}}
25: \address[ecust]{School of Business and Research Center of Systems
26: Engineering, East China University of Science and Technology,
27: Shanghai 200237, China}
28: \address[ETH]{Department of Management, Technology
29: and Economics, ETH Zurich\\ CH-8032 Zurich, Switzerland}
30: \address[nice]{Laboratoire de Physique de la Mati\`ere Condens\'ee,
31: CNRS UMR 6622 and Universit\'e de Nice-Sophia Antipolis, 06108 Nice
32: Cedex 2, France}
33: \thanks[EM]{Corresponding author. {\it E-mail address:}\/
34: sornette@ethz.ch (D. Sornette)\\
35: http://www.er.ethz.ch/}
36: 
37: \begin{abstract}
38: We have recently introduced the ``thermal optimal path'' (TOP)
39: method to investigate the real-time lead-lag structure between two
40: time series. The TOP method consists in searching for a robust
41: noise-averaged optimal path of the distance matrix along which the
42: two time series have the greatest similarity. Here, we generalize
43: the TOP method by introducing a more general definition of distance
44: which takes into account possible regime shifts between positive and
45: negative correlations. This generalization to track possible changes
46: of correlation signs is able to identify possible transitions from
47: one convention (or consensus) to another. Numerical simulations on
48: synthetic time series verify that the new TOP method performs as
49: expected even in the presence of substantial noise. We then apply it
50: to investigate changes of convention in the dependence structure
51: between the historical volatilities of the USA inflation rate and
52: economic growth rate. Several measures show that the new TOP method
53: significantly outperforms standard cross-correlation methods.
54: \end{abstract}
55: %{\it{JEL classification:}} C14; E31; E58; G10
56: 
57: \begin{keyword}
58: Thermal optimal path; time series; inflation; GDP growth; convention
59: \end{keyword}
60: 
61: \end{frontmatter}
62: 
63: \typeout{SET RUN AUTHOR to \@runauthor}
64: %
65: %\newpage            %
66: %\tableofcontents    %
67: %\newpage            %
68: 
69: 
70: \section{Introduction}
71: \label{s1:intro}
72: 
73: The study of the lead-lag structure between two time series $X(t)$
74: and $Y(t)$ has a long history, especially in economics, econometrics
75: and finance, as it is often asked which economic variable might
76: influence other economic phenomena. A simple measure is the lagged
77: cross-correlation function $C_{X,Y}(\tau)=\langle X(t) Y(t+\tau)
78: \rangle / \sqrt{{\rm Var}[X] {\rm Var}[Y]}$, where the brackets
79: $\langle x \rangle$ denote the statistical expectation of the
80: random variable $x$ and ${\rm Var}[x]$ is the variance of $x$. The
81: observation of a maximum of $C_{X,Y}(\tau)$ at some non-zero
82: positive time lag $\tau$ implies that the knowledge of $X$ at time
83: $t$ gives some information on the future realization of $Y$ at the
84: later time $t+\tau$. However, such correlations do not necessarily
85: imply causality in a strict sense, as a correlation may be
86: mediated by a common source influencing the two time series at
87: different times. The concept of Granger causality bypasses this
88: problem by taking a pragmatic approach based on predictability: if
89: the knowledge of $X(t)$ and of its past values improves the
90: prediction of $Y(t+\tau)$ for some $\tau>0$, then it is said that
91: $X$ Granger causes $Y$ (see, e.g.,
92: \cite{Granger-1980-JEDC,Ashley-Granger-Schmalensee-1980-Em,Engle-White-1999}).
93: Such a definition does not address the fundamental philosophical and
94: epistemological question of the real causality links between $X$ and
95: $Y$ but has been found useful in practice. Our approach is similar
96: in that it does not address the question of the existence of a
97: genuine causality but attempts to detect a dependence structure
98: between two time series at non-zero (possibly varying) lags. We thus
99: use the term ``causality'' in a loose sense embodying the notion of
100: a dependence between two time series with a non-zero lag time.
101: 
102: Many alternative methods have been developed in the physics
103: community. Quiroga et al. proposed a simple and fast method to
104: measure synchronicity and time delay patterns between two time
105: series based on event synchronization
106: \cite{Quiroga-Kreuz-Grassberger-2002-PRE}. Furthermore, as a
107: generalization of the concept of recurrence plot to analyze complex
108: chaotic time series \cite{Eckmann-Kamphorst-Ruelle-1987-EPL}, Marwan
109: et al. developed the cross-recurrence plot based on a distance matrix to
110: unravel the nonlinear mapping of times between two systems
111: \cite{Marwan-Kurths-2002-PLA,Marwan-Thiel-Nowaczyk-2002-NPG}. In
112: Ref.~\cite{Sornette-Zhou-2005-QF}, we have introduced a novel
113: non-parametric method to test for the dynamical time evolution of
114: the lag-lead structure between two arbitrary time series based on a
115: thermal averaging of optimal paths embedded in the distance matrix
116: previously introduced in cross-recurrence plots. This method ignores
117: the thresholds used previously in constructing cross-recurrence plots
118: \cite{Marwan-Kurths-2002-PLA,Marwan-Thiel-Nowaczyk-2002-NPG} and
119: focuses on the distance matrix. The idea consists in constructing a
120: distance matrix based on the matching of all sample data pairs
121: obtained from the two time series under study. The lag-lead
122: structure is searched for as the optimal path in the distance matrix
123: landscape that minimizes the total mismatch between the two time
124: series, and that obeys a one-to-one causal matching condition. To
125: make the solution robust with respect to the presence of noise that
126: may lead to spurious structures in the distance matrix landscape,
127: Sornette and Zhou generalized this search for a single absolute
128: optimal path by introducing a fuzzy search consisting in sampling
129: over all possible paths, each path being weighted according to a
130: multinomial logit or equivalently Boltzmann factor proportional to
131: the exponential of the global mismatch of this path
132: \cite{Sornette-Zhou-2005-QF}. The method is referred to in the
133: sequel as the thermal optimal path (TOP). Zhou and Sornette
134: further investigated the TOP method by considering different
135: topologies of feasible paths and found that the two-layer scheme
136: gives the best performance \cite{Zhou-Sornette-2006-JMe}.
137: 
138: Here, we generalize the TOP method by introducing a definition of
139: distance which takes into account possible regime shifts between
140: positive and negative correlations. This extension allows us to
141: detect possible changes in the sign of the correlation between the
142: two time series. This is in part motivated by the problem of
143: identifying changes of conventions in economic and financial time
144: series. Keynes \cite{Keynes-1936} and Orl\'ean
145: \cite{Orlean-1986-Ec,Orlean-1987-Ca,Orlean-1989-RE,Orlean-1992-JEE,Orlean-2004,Boyer-Orlean-2004,Orlean-2004-Ra}
146: developed the concept of convention, according to which a pattern
147: can emerge from the self-fulfilling belief of agents acting on the
148: belief itself. Conventions are subject to shifts: in a recent study,
149: Wyart and Bouchaud claimed that the correlation between bond markets
150: and stock markets was positive in the past (because low long term
151: interest rates should favor stocks), but has recently quite suddenly
152: become negative as a new ``Flight To Quality'' convention has set
153: in: selling risky stocks and buying safe bonds has recently been the
154: dominant pattern \cite{Wyart-Bouchaud-2006-JEBO}. Similarly, Liu and
155: Liu analyzed the nexus between the historical volatility of the
156: output and of the inflation rate, using Chinese data from 1992 to
157: 2004 \cite{Liu-Liu-2005-ERJ}. They found that there is a strong
158: correlation between the two volatilities and, what is more
159: interesting, that the rolling correlation coefficient changes sign.
160: Such a change of sign of the correlation may be attributed either to
161: a shift in convention and/or to changing macroeconomic variables,
162: the two being possibly entangled. Our method does not address the
163: source of the change of the sign of the correlation but provides
164: nevertheless a preliminary tool for detecting such changes of
165: correlations in a time-adaptive lead-lag framework.
166: 
167: The paper is organized as follows. In Section \ref{s1:Top}, we
168: present a brief description of our generalized TOP method.  We
169: recall that an advantage of the  TOP method is that it does not
170: require any {\it{a priori}} knowledge of the underlying dynamics.
171: The new TOP method is illustrated with the help of synthetic
172: numerical simulations in Section \ref{s1:NumSim}. Section
173: \ref{s1:Appl} presents the application of the method to the
174: investigation of a possible change of dependence between the
175: historical volatility of the USA inflation rate and the economic
176: growth rate. Section \ref{s1:concl} concludes.
177: 
178: 
179: \section{Thermal optimal path method \label{s1:Top}}
180: 
181: In Refs.~\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe}, we have
182: presented the TOP method and several tests and applications. In this
183: section, to be self-contained, we briefly recall its main
184: characteristics in the context of the new proposed distance.
185: 
186: Consider two standardized time series $\{X(t_1):t_1=0,...,N\}$ and
187: $\{Y(t_2):t_2=0,...,N\}$. The elements of the distance matrix
188: $E_{X,Y}$ between $X$ and $Y$ used in
189: Refs.~\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe} are
190: defined as
191: \begin{equation}
192: \epsilon_-(t_1,t_2) = [X(t_1)-Y(t_2)]^2~. \label{Eq:DM:minus}
193: \end{equation}
194: The value $[X(t_1)-Y(t_2)]^2$ defines the distance between the
195: realizations of the first time series at time $t_1$ and the second
196: time series at time $t_2$.
197: 
198: The distance matrix (\ref{Eq:DM:minus}) tracks the co-monotonic
199: relationship between $X$ and $Y$. However, two time series can be more
200: anti-monotonic than monotonic, i.e., they tend to take opposite
201: signs. Consider two limiting cases: (i) $Y(t)=X(t)$ and (ii)
202: $Y(t)=-X(t)$. Obviously, using the traditional distance
203: (\ref{Eq:DM:minus}) identifies case (i) as minimizing expression
204: (\ref{Eq:DM:minus}) for $t_1=t_2$ (actually the minimum is
205: identically zero in this special case). In contrast, in case (ii),
206: although $Y(t)$ is perfectly anti-correlated with $X(t)$, the
207: naive idea of minimizing the distance (\ref{Eq:DM:minus}) between
208: the two time series becomes meaningless. In order to diagnose the
209: occurrence of anti-correlation, one needs to consider the
210: ``anti-monotonic'' distance
211: \begin{equation}
212: \epsilon_{+}(t_1,t_2) = [X(t_1)+Y(t_2)]^2~. \label{Eq:DM:plus}
213: \end{equation}
214: The $+$ sign ensures a correct search of synchronization between two
215: anti-correlated time series. More generally, $X$ and $Y$ may exhibit
216: more complicated lead-lag correlation relationships, positive
217: correlation over some time intervals and negative correlation at
218: other times (as in the change of conventions mentioned in the
219: introduction). In order to address all possible situations, we
220: propose to use the mixed distance expressed as follows:
221: \begin{equation}
222: \epsilon_{\pm}(t_1,t_2) =
223: \min[\epsilon_{-}(t_1,t_2),\epsilon_{+}(t_1,t_2)]~. \label{Eq:DM:pm}
224: \end{equation}
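As a quick consistency check (a rewriting added here for illustration, which follows directly from Eqs.~(\ref{Eq:DM:minus}) and (\ref{Eq:DM:plus})), expanding the two squares shows that they differ only in the sign of the cross term $2X(t_1)Y(t_2)$. Taking the minimum therefore amounts to
\begin{equation*}
\epsilon_{\pm}(t_1,t_2) = X(t_1)^2+Y(t_2)^2-2\left|X(t_1)Y(t_2)\right|
= \left[\,|X(t_1)|-|Y(t_2)|\,\right]^2~.
\end{equation*}
In other words, the mixed distance compares the magnitudes of the two standardized series and is insensitive to their relative sign.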
225: 
226: Fig.~\ref{Fig:TOP:TMM} is a schematic representation of how lead-lag
227: paths are defined. The first (resp. second) time series is indexed
228: by the time $t_1$ (resp. $t_2$). The nodes of the plane carry the
229: values of the distance (\ref{Eq:DM:pm}) for each pair $(t_1,t_2)$.
230: The path along the diagonal corresponds to taking $t_1=t_2$, i.e.,
231: compares the two time series at the same time. Paths below (resp.
232: above) the diagonal correspond to the second time series lagging
233: behind (resp. leading) the first time series. The figure shows three
234: arrows which define the three causal steps (time flows from the past
235: to the future both for $t_1$ and $t_2$) allowed in our construction
236: of the lead-lag paths. A given path selects a contiguous set of
237: nodes from the lower left to the upper right. The relevance or
238: quality of a given path with respect to the detection of the
239: lead-lag relationship between the two time series is quantified by
240: the sum of the distances (\ref{Eq:DM:pm}) along its length.
241: 
242: As shown in the figure, it is convenient to use the rotated
243: coordinate system $(x,t)$ such that
244: \begin{equation}
245: \left\{
246:    \begin{array}{ccl}
247:     t_1 &=& 1+\left(t-x\right)/2~ \\
248:     t_2 &=& 1+\left(t+x\right)/2~
249:     \end{array}
250: \right., \label{Eq:AxesTransform2}
251: \end{equation}
252: where $t$ is in the main diagonal direction of the $(t_1,t_2)$
253: system and $x$ is perpendicular to $t$. The origin $(x=0,t=0)$
254: corresponds to $(t_1=1,t_2=1)$. Then, the standard reference path is
255: the diagonal of equation $x=0$, and paths which have $x(t) \neq 0$
256: define varying lead-lag patterns. The idea of the TOP method is to
257: identify the lead-lag relationship between two time series as the
258: best path in a certain sense. One could first infer that the best
259: path is the one which has the minimum sum of its distances
260: (\ref{Eq:DM:pm}) along its length (paths are constructed with equal
261: lengths so as to be comparable). The problem with this idea is that
262: the noises decorating the two time series introduce spurious
263: patterns which may control the determination of the path which
264: minimizes the sum of distances, leading to incorrectly inferred
265: lead-lag relationships. In
266: Refs.~\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe}, we have
267: shown that a robust lead-lag path is obtained by defining an average
268: over many paths, each weighted according to a Boltzmann-Gibbs
269: factor, hence the name ``thermal'' optimal path method.
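
For concreteness, the change of variables (\ref{Eq:AxesTransform2}) can be inverted explicitly (a short worked example added for illustration). Adding and subtracting the two lines of (\ref{Eq:AxesTransform2}) gives
\begin{equation*}
x = t_2 - t_1~, \qquad t = t_1 + t_2 - 2~.
\end{equation*}
For instance, the node $(t_1=3,t_2=5)$ maps to $(x=2,t=6)$, and inserting these values back into (\ref{Eq:AxesTransform2}) returns $t_1=1+(6-2)/2=3$ and $t_2=1+(6+2)/2=5$. Since $t_1$ and $t_2$ are integers on the lattice, $t+x=2(t_2-1)$ is even, so that $x$ and $t$ always have the same parity.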
270: 
271: \begin{figure}[htb]
272: \centering
273: \includegraphics[width=9cm]{FigTOP_TMM.eps}
274: \caption{(Color online) Representation of the two-layer approach in
275: the lattice $(t_1,t_2)$ and of the rotated frame $(t,x)$ as defined
276: in the text. The three arrows depict the three moves that are
277: allowed to reach any node in one step. } \label{Fig:TOP:TMM}
278: \end{figure}
279: 
280: Concretely, we first calculate the partition functions $G(x,t)$ and
281: their sum $G(t)=\sum_x G(x,t)$ so that $G(x,t)/G(t)$ can be interpreted as the
282: probability for a path to be at distance $x$ from the diagonal for a
283: distance $t$ along the diagonal. This probability $G(x,t)/G(t)$ is determined as
284: a compromise between minimizing the mismatch (similar to an ``energy'') and maximizing
285: the combinatorial weight of the number of paths with similar mismatches in
286: a neighborhood (similar to an ``entropy''). As illustrated in Figure
287: \ref{Fig:TOP:TMM}, in order to arrive at $(t_1+1, t_2+1)$, a path
288: can come from $(t_1+1, t_2)$ vertically, $(t_1, t_2+1)$
289: horizontally, or $(t_1, t_2)$ diagonally. The recursive equation on
290: $G(x,t)$ is therefore
291: \begin{equation}\label{Eq:RecurG:xt}
292:       G(x,t+1) = [G(x-1,t)+ G(x+1,t)+G(x,t-1)]e^{-\epsilon_{\pm}(x,t)/T}~,
293: \end{equation}
294: where $\epsilon_{\pm}(x,t)$ is defined by (\ref{Eq:DM:pm}). This
295: recursion relation uses the same principle as, and is derived
296: following, the work of Wang et al.
297: \cite{Wang-Havlin-Schwartz-2000-JPCB}. To compute $G(x,t)$ at the $t$-th
298: layer, we need to know and bookkeep the previous two layers,
299: $G(\cdot,t-2)$ and $G(\cdot,t-1)$. After $G(\cdot,t)$ is determined,
300: the $G$'s at the two layers are normalized by $G(t)$ so that
301: $G(x,t)$ does not diverge at large $t$. We stress that the boundary
302: condition of $G(x,t)$ plays a crucial role. For $t=0$ and $t=1$,
303: $G(x,t) = 1$. For $t>1$, the boundary condition is taken to be
304: $G(x=\pm t,t) = 0$, in order to prevent paths from remaining on the
305: boundaries.
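
The role of the temperature in the recursion (\ref{Eq:RecurG:xt}) can be made explicit with a schematic path-sum reading (a sketch added for illustration; it sets aside the layer-by-layer normalization and the boundary terms, and the notation $W$ and $E(W)$ is introduced here only for this purpose). Iterating (\ref{Eq:RecurG:xt}) accumulates one Boltzmann factor per visited node, so that, schematically,
\begin{equation*}
G(x,t) \propto \sum_{W \to (x,t)} e^{-E(W)/T}~, \qquad
E(W) = \sum_{{\rm nodes~of~}W} \epsilon_{\pm}~,
\end{equation*}
where the sum runs over the allowed paths $W$ ending at $(x,t)$. Two paths $W$ and $W'$ ending at the same node thus have relative weight $e^{-[E(W)-E(W')]/T}$. As a numerical example, a difference of total mismatch $E(W)-E(W')=0.5$ gives a relative weight $e^{-5}\approx 0.007$ at $T=0.1$ but $e^{-0.5}\approx 0.61$ at $T=1$: a small temperature concentrates the weight on the lowest-mismatch paths, while a large temperature treats the paths almost equally.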
306: 
307: Once the partition functions $G(x,t)$ have been calculated, we can
308: obtain any statistical average related to the positions of the paths
309: weighted by the set of $G(x,t)$. For instance, the local time lag
310: $\langle{x(t)}\rangle$ at time $t$ is given by
311: \begin{equation}
312:     \langle{x(t)}\rangle = \sum_x {xG(x,t)/G(t)}~.
313:     \label{Eq:Xave}
314: \end{equation}
315: Expression (\ref{Eq:Xave}) defines $\langle{x(t)}\rangle$ as the
316: thermal average of the local time lag at $t$ over all possible
317: lead-lag configurations suitably weighted according to the
318: exponential of minus the measure $\epsilon_{\pm}(x,t)$ of the
319: similarities of two time series. For a given $x_0$ and temperature
320: $T$, we determine the thermal optimal path $\langle{x(t)}\rangle$.
321: We can also associate an ``energy'' $e_T(x_0)$ with this path, defined as
322: the thermal average of the measure $\epsilon_{\pm}(x,t)$ of the
323: similarities of two time series:
324: \begin{equation}\label{Eq:e}
325:         e_T(x_0) = \frac{1}{2(N-|x_0|)-1}\sum_{t=|x_0|}^{2N-1-|x_0|}
326:         \sum_x {\epsilon_{\pm}(x,t)G(x,t)/G(t)}~.
327: \end{equation}
328: Obviously, the same set of calculations can be performed with
329: $\epsilon_-$ given by (\ref{Eq:DM:minus}) or with $\epsilon_{+}$
330: given by (\ref{Eq:DM:plus}). The former case has been investigated
331: in Refs.~\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe}.
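
As one further illustration of a statistical average that can be built from the weights $G(x,t)/G(t)$ (an example added here; this quantity is not used in the analysis below and the symbol $\sigma_x$ is introduced only for this purpose), the spread of the paths around the thermal optimal path at a given $t$ can be measured by
\begin{equation*}
\sigma_x^2(t) = \sum_x x^2\, G(x,t)/G(t) - \langle{x(t)}\rangle^2~,
\end{equation*}
which is the variance of $x$ under the same weights as in (\ref{Eq:Xave}). A narrow distribution would indicate that the weights single out a well-defined lag at that time.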
332: 
333: 
334: \section{Numerical experiments of the TOP approach on synthetic examples}
335: \label{s1:NumSim}
336: 
337: We now present synthetic tests of the efficiency of the thermal
338: optimal path method in detecting multiple changes of regime.
339: Consider the following model
340: \begin{equation}
341: Y(t)=\left\{
342: \begin{array}{lr}
343:      +X(t-10) + \eta,  & ~~~~1\le t \le 100\\
344:      -X(t-~5) + \eta,  & ~~101\le t \le 200\\
345:      +X(t+~5) + \eta,  & ~~201\le t \le 300\\
346: \end{array}
347: \right.~, \label{Eq:Jump}
348: \end{equation}
349: where $\eta$ is a Gaussian white noise with variance $\sigma_\eta^2$
350: and zero mean. By construction, the time series $Y$ is lagging
351: behind $X$ with $\tau = 10$ in the first $100$ time steps, $Y$ is
352: still lagging behind $X$ with a reduced lag $\tau = 5$ in the next
353: $100$ time steps, and finally $Y$ leads $X$ with a lead time
354: $\tau=-5$ in the last $100$ time steps. In addition, $Y$ becomes
355: negatively correlated with $X$ in the middle interval, while it is positively
356: correlated with $X$ in the first and third intervals.
357: The time series $X$ is
358: assumed to be the first-order auto-regressive process
359: \begin{equation}\label{Eq:TOP:AR}
360:     X(t) = 0.7X(t-1) + \xi~,
361: \end{equation}
362: where $\xi$ is an i.i.d. white noise with zero mean and variance
363: $\sigma_\xi^2$. Our results are essentially the same when $X$ is
364: itself a white noise process. The two time series are standardized
365: before the construction of the distance matrix. Therefore, there is
366: only one parameter $f\triangleq\sigma_\xi/\sigma_\eta$
367: characterizing the signal-to-noise ratio of the lead-lag
368: relationship between $X$ and $Y$. We use $f=1/5$ in the simulations
369: presented below, corresponding to a weak signal-to-noise ratio.
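
To give a feeling for how weak the signal is at $f=1/5$, the following back-of-the-envelope estimate may help (added for illustration; it assumes that $X$ is stationary and that $\eta$ is independent of $X$). For the AR(1) process (\ref{Eq:TOP:AR}), ${\rm Var}[X]=\sigma_\xi^2/(1-0.7^2)=\sigma_\xi^2/0.51$. Within each regime of (\ref{Eq:Jump}), the correlation coefficient $\rho$ between $Y(t)$ and the appropriately shifted $X$ then has magnitude
\begin{equation*}
|\rho| = \sqrt{\frac{{\rm Var}[X]}{{\rm Var}[X]+\sigma_\eta^2}}
= \frac{1}{\sqrt{1+0.51/f^2}} = \frac{1}{\sqrt{13.75}} \approx 0.27
\quad {\rm for}~f=1/5~,
\end{equation*}
so that, under these assumptions, the lagged signal accounts for only about $7\%$ of the variance of $Y$.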
370: 
371: Figure \ref{Fig:TOP:Jump:cmp} compares the reconstructed lead-lag
372: path $x(t)$ when using $\epsilon_-$ defined by (\ref{Eq:DM:minus}),
373: or $\epsilon_+$ defined by (\ref{Eq:DM:plus}), or $\epsilon_\pm$
374: defined by (\ref{Eq:DM:pm}). If the method worked perfectly, the
375: lead-lag path $x(t)$ would be equal to $x(t)=+10$ for $1\leqslant t
376: \leqslant 100$, $x(t)=+5$ for $101\leqslant t \leqslant 200$ and
377: $x(t)=-5$ for $201\leqslant t \leqslant 300$. One can observe that
378: the new proposed distance $\epsilon_\pm$ recovers the correct
379: solution up to moderate fluctuations. Unsurprisingly, the lead-lag
380: path reconstruction using $\epsilon_-$ gives the correct solution in
381: the first and third time intervals for which the correlation is
382: positive but is totally wrong with large fluctuations in the middle
383: time interval in which the correlation is negative. Symmetrically,
384: the lead-lag path reconstruction using $\epsilon_+$ gives the
385: correct solution in the middle interval where the correlation is
386: negative and is completely wrong with large fluctuations in the two
387: other intervals. Actually, we verify (not shown) that $\epsilon_\pm$
388: reduces mostly to $\epsilon_-$ in the first and third intervals and
389: to $\epsilon_+$ in the middle interval, as it should.
390: 
391: \begin{figure}[htb]
392: \centering
393: \includegraphics[width=9cm]{FigTOP_Jump_cmp.eps}
394: \caption{(Color online) Comparison of the three lead-lag thermal
395: optimal paths using the three distances $\epsilon_-$,
396: $\epsilon_+$, and $\epsilon_\pm$. The temperature is $T=0.1$.}
397: \label{Fig:TOP:Jump:cmp}
398: \end{figure}
399: 
400: 
401: Figure \ref{Fig:TOP:Jump:xt} tests the robustness of the
402: reconstructed lead-lag path using the distance $\epsilon_\pm$ with
403: respect to different choices of the temperature:  $T=1$, $0.2$,
404: $0.1$, and $0.01$. Recall that a vanishing temperature corresponds
405: to selecting the lead-lag path which has the minimum total sum of
406: distances along its length. At the opposite extreme, a very large
407: temperature corresponds to washing out the information contained in the
408: distance matrix and treating all paths on the same footing. In between,
409: a finite temperature allows us to average the contribution over
410: neighboring paths with similar energies, making the estimated
411: lead-lag path more robust to noise-like structures in the distance
412: matrix due to noises decorating the two time series. It is apparent
413: that too small a temperature $T=0.01$ leads to spurious large spiky
414: fluctuations around the correct solution. Too large a temperature
415: $T=1$ selects a thermally-averaged path which deviates from the
416: correct solution, here mostly at the beginning of the time series.
417: It seems that there is an optimal range of temperatures around
418: $T=0.1$--$0.2$ for which the correct solution is retrieved with minimal
419: fluctuations around it. The existence of an optimal range of
420: temperature is confirmed in the inset of Figure
421: \ref{Fig:TOP:Jump:xt}, which shows the root-mean-square (rms)
422: deviations between the reconstructed lead-lag path and the exact
423: solution ($x(t)=+10$ for $1\le t \le 100$, $x(t)=+5$ for $101\le t
424: \le 200$ and $x(t)=-5$ for $201\le t \le 300$) as a function of
425: temperature in the range $0.01 \leq T \leq 10$. The existence of a
426: well-defined optimal range of temperatures is most pronounced for smaller
427: signal-to-noise ratios $f\triangleq\sigma_\xi/\sigma_\eta$. For
428: large $f$ (weak noise), we observe that smaller temperatures are
429: better, as expected.
430: 
431: \begin{figure}[htb]
432: \centering
433: \includegraphics[width=9cm]{FigTOP_Jump_xt.eps}
434: \caption{(Color online) Thermally-averaged lead-lag paths of the
435: model (\ref{Eq:Jump}) for four different temperatures. Inset:
436: root-mean-square (rms) deviations between the reconstructed lead-lag
437: path and the exact solution ($x(t)=+10$ for $1\le t \le 100$,
438: $x(t)=+5$ for $101\le t \le 200$ and $x(t)=-5$ for $201\le t \le
439: 300$) as a function of temperature in the range $0.01 \leq T \leq
440: 10$.} \label{Fig:TOP:Jump:xt}
441: \end{figure}
442: 
443: The whole purpose of the new distance $\epsilon_\pm$ is to identify
444: not only the lead-lag structure more reliably but also the
445: existence of possible negative correlations as well as changes of
446: the sign of the correlation with time. We identify the sign
447: $s(t,x(t)) = s(t_1,t_2)$ of the cross-correlation of the two time
448: series at the times $t_1,t_2$ from the value of $\epsilon_\pm$: when
449: $\epsilon_\pm$ reduces to $\epsilon_-$ (resp. $\epsilon_+$), we
450: conclude that the correlation is positive (resp. negative). The
451: corresponding algorithm for the sign of the cross-correlations is
452: thus
453: \begin{equation}\label{Eq:Sign}
454:     s(t) = s(t_1,t_2) = \left\{
455:     \begin{array}{cc}
456:       +1 & ~~{\rm{if}}~~ \epsilon_\pm=\epsilon_- \\
457:       -1 & ~~{\rm{if}}~~ \epsilon_\pm=\epsilon_+
458:     \end{array}
459:     \right.~.
460: \end{equation}
461: Due to the noises on the two time series, $s(t)$ is also noisy. Thus,
462: to obtain meaningful information on the sign of the
463: cross-correlations, we apply a smoothing algorithm to $s(t)$. For
464: this, we use the Savitzky-Golay filter with a linear function and
465: include 21 points to the left of each time (to ensure causality).
466: The filtered signal $S(t)$ is shown in
467: Fig.~\ref{Fig:TOP:Jump:Signal}. The results are quite consistent
468: with the model in which the correlation is negative in the middle
469: period $100<t<200$ and positive otherwise.
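
The sign rule (\ref{Eq:Sign}) can also be read directly in terms of the data (a short derivation added for illustration). Subtracting (\ref{Eq:DM:minus}) from (\ref{Eq:DM:plus}) gives
\begin{equation*}
\epsilon_{+}(t_1,t_2)-\epsilon_{-}(t_1,t_2) = 4\,X(t_1)\,Y(t_2)~,
\end{equation*}
so that $\epsilon_\pm=\epsilon_-$, and hence $s=+1$, whenever $X(t_1)$ and $Y(t_2)$ have the same sign, while $\epsilon_\pm=\epsilon_+$, and hence $s=-1$, whenever they have opposite signs. When the product vanishes, the two distances coincide and (\ref{Eq:Sign}) does not single out a sign. This makes it clear why the raw signal $s(t)$ fluctuates from point to point and needs to be smoothed.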
470: 
471: \begin{figure}[htb]
472: \centering
473: \includegraphics[width=9cm]{FigTOP_Jump_Signal.eps}
474: \caption{Reconstruction of the sign of the cross-correlation of the
475: model (\ref{Eq:Jump},\ref{Eq:TOP:AR}) by the smoothed sign
476: recognition given by expression (\ref{Eq:Sign}).}
477: \label{Fig:TOP:Jump:Signal}
478: \end{figure}
479: 
480: 
481: \section{Historical volatilities of inflation rate and economic output rate}
482: \label{s1:Appl}
483: 
484: In this section, we apply our novel technique to the relationship
485: between inflation and real economic output quantified by GDP in the
486: hope of providing new insights. This problem has attracted
487: tremendous interest in past decades in the macroeconomic
488: literature. Different theories have suggested that the impact of
489: inflation on real economic activity could be either neutral,
490: negative, or positive. Based on the story of Mundell that higher
491: inflation would lower real output \cite{Mundell-1963-JPE}, Tobin
492: argued that higher inflation causes a shift from money to capital
493: investment and raises output per capita \cite{Tobin-1965-Em}, known
494: as the Mundell-Tobin effect. On the contrary, Fischer suggested a
495: negative effect, stating that higher inflation resulted in a shift
496: from money to other assets and reduced the efficiency of
497: transactions in the economy due to higher search costs and lower
498: productivity \cite{Fischer-1974-EI}. In the middle ground, Sidrauski
499: proposed a neutral effect where exogenous time preference fixed the
500: long-run real interest rate and capital intensity
501: \cite{Sidrauski-1967-AER}. These arguments are based on the rather
502: restrictive assumption that the Phillips curve (inverse relationship
503: between inflation and unemployment), taken in addition to be linear,
504: is valid. To evaluate which model characterizes better real economic
505: systems, numerous empirical efforts have been made and the
506: question is still open.
507: 
508: On the other hand, much focus is put on the nexus between inflation
509: and its uncertainty and economic activity. Okun made the hypothesis
510: of a positive correlation between inflation and inflation
511: uncertainty \cite{Okun-1971-BPEA}. Furthermore, Friedman argued that
512: an increase in the uncertainty of future inflation reduces the
513: economic efficiency and lowers the real output rate
514: \cite{Friedman-1977-JPE}, which is verified empirically (see, e.g.,
515: \cite{Davis-Kanago-1996-OEP,Davis-Kanago-1998-JMCB,AlMarhubi-1998-AE,Grier-Perry-2000-JAEm,Hayford-2000-JMe,Fountas-Karanasos-Kim-2006-OBES}).
516: Following the seminal work of Taylor \cite{Taylor-1979-Em}, the
517: output-inflation variability trade-off has been tested extensively
518: in the literature, such as in
519: \cite{Defina-Stark-Taylor-1996-JMe,Fuhrer-1997-JMCB,Cobham-Macmillan-Mcmillan-2004-AEL,Lee-2002-SEJ,Lee-2004-CEP},
520: which are based on model specification. Liu and Liu analyzed the
521: relation between the historical volatility of the output and of the
522: inflation rate, using Chinese data from 1992 to 2004
523: \cite{Liu-Liu-2005-ERJ}. They found that there is a strong
524: correlation between the two volatilities and, what is more
525: interesting, that the rolling correlation coefficient changes its
526: sign. In the following, we investigate the nexus between the
527: historical volatilities of inflation and output in a model-free
528: manner to test for possible changes of the signs of their
529: cross-correlation structure.
530: 
531: The data sets, which were retrieved from the FRED II database,
532: include monthly consumer price index (CPI) for all urban consumers
533: and seasonally adjusted quarterly gross domestic product (GDP)
534: covering the time period from 1947 to 2005. The annualized
535: inflation rate $r_{\rm{CPI}}$ and economic growth rate
536: $r_{\rm{GDP}}$ were calculated on a quarterly basis from the CPI and
537: GDP, respectively. The historical volatility is calculated in a
538: rolling window as
539: \begin{equation}\label{Eq:TOP:VIVG}
540:     \nu(t) = \left[\frac{1}{\Delta{t}}\sum_{s=t-\Delta{t}+1/4}^{t} \left[r(s)-\mu(t)\right]^2
541:     \right]^{1/2}~,
542: \end{equation}
543: where $r=r_{\rm{CPI}}$ for inflation rate and $r=r_{\rm{GDP}}$ for
544: growth rate, and $\mu(t)$ is their corresponding mean in the rolling
545: window $[t-\Delta{t}+1/4,t]$. The unit of $t$ and $\Delta{t}$ is one
546: year. The resulting historical volatility series $\nu_{\rm{CPI}}(t)$
547: and $\nu_{\rm{GDP}}(t)$ are shown in the upper panel of
548: Fig.~\ref{Fig:TOP:InfGDP:VIVG} for the time period $[1950,1960]$,
549: with $\Delta{t}=3$ years. Since the volatility $\nu(t)$ is
550: non-stationary (as shown by a standard unit-root test), we use the
551: first-difference of volatility $\Delta{\nu}(t)$, shown in the lower
552: panel of Fig.~\ref{Fig:TOP:InfGDP:VIVG}. We focus on the 10-year
553: time period $[1950,1960]$ only for a clearer visualization since the
554: analysis and results are the same qualitatively in other time
555: periods.
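
To make the rolling window in (\ref{Eq:TOP:VIVG}) concrete (a worked count added for illustration, assuming quarterly observations with time measured in years), the summation index takes the values
\begin{equation*}
s = t-\Delta{t}+\tfrac{1}{4},~ t-\Delta{t}+\tfrac{2}{4},~ \ldots,~ t~,
\end{equation*}
that is, $4\Delta{t}$ quarterly observations. For $\Delta{t}=3$ years, the window runs from $s=t-11/4$ to $s=t$ and contains $12$ quarters. Two consecutive windows, ending at $t$ and at $t+1/4$, therefore share $11$ of their $12$ observations, so that successive values of $\nu(t)$ are computed from strongly overlapping samples.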
556: 
557: \begin{figure}[htb]
558: \centering
559: \includegraphics[width=9cm]{FigTOP_InfGDP_y1950_VIVG.eps}
560: \caption{Upper panel: quarterly historical volatilities of the
561: annualized inflation rate and economic growth rate of the United
562: States of America; lower panel: their quarterly changes.}
563: \label{Fig:TOP:InfGDP:VIVG}
564: \end{figure}
565: 
566: Visual inspection of the lower panel of
567: Fig.~\ref{Fig:TOP:InfGDP:VIVG} suggests that the variations of the
568: volatilities $\nu_{\rm{CPI}}(t)$ and $\nu_{\rm{GDP}}(t)$ are
569: approximately synchronous from 1951 to 1954 and then become
570: approximately anti-phased from 1955 to 1958. Can this be confirmed
571: or falsified by the technique proposed here? To address this
572: question, we determine the smoothed sign function $S(t)$ determined
573: as explained at the end of the previous section. Our tests show that
574: the lead-lag path is close to the diagonal and that there is no
575: significant gain obtained by allowing for a time-varying lag between
576: the variations of the volatilities $\nu_{\rm{CPI}}(t)$ and
$\nu_{\rm{GDP}}(t)$. We thus calculate $S(t)$ by smoothing the
578: signal $s(t)$ defined by (\ref{Eq:Sign}) with the distance matrix
579: constructed using definition (\ref{Eq:DM:pm}) along the diagonal of
580: the plane $(t_1,t_2)$ (in other words, for $x(t)=0$). We again use
581: the causal Savitzky-Golay filter with a quadratic polynomial and
582: $N_L$ data points to the left of each time step $t$ plus the point
583: at $t$ itself. As shown in Fig.~\ref{Fig:TOP:InfGDP:Convention}, we
584: find that the sign signal function $S(t)$ is quite robust with
585: respect to variations of the smoothing parameter $N_L$ in the range
$N_L=5$--$15$. For comparison, we also plot in
587: Fig.~\ref{Fig:TOP:InfGDP:Convention} the cross-correlation function
588: $C(t)$ in rolling windows of three years.
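
To make the two estimators compared here explicit, the following is
a sketch under stated assumptions rather than the exact
implementation. Writing $\delta=1/4$ year for the sampling step (a
notation introduced only for this illustration), a causal
Savitzky-Golay filter with a quadratic polynomial fits, at each time
$t$, the $N_L+1$ most recent values of $s$ by least squares and
keeps the fitted value at $t$:
\[
    (\hat{a}_0,\hat{a}_1,\hat{a}_2) = \arg\min_{a_0,a_1,a_2}
    \sum_{k=0}^{N_L}\left[s(t-k\delta)-a_0+a_1 k-a_2 k^2\right]^2~,
    \qquad S(t)=\hat{a}_0~.
\]
Likewise, assuming that $C(t)$ denotes the zero-lag Pearson
correlation of the volatility increments in the trailing window
$W_t=[t-3+1/4,t]$ (the trailing alignment of the window is an
assumption), one has
\[
    C(t) = \frac{\sum_{s\in W_t}
    \left[\Delta\nu_{\rm{CPI}}(s)-\overline{\Delta\nu}_{\rm{CPI}}\right]
    \left[\Delta\nu_{\rm{GDP}}(s)-\overline{\Delta\nu}_{\rm{GDP}}\right]}
    {\sqrt{\sum_{s\in W_t}\left[\Delta\nu_{\rm{CPI}}(s)-\overline{\Delta\nu}_{\rm{CPI}}\right]^2
    \sum_{s\in W_t}\left[\Delta\nu_{\rm{GDP}}(s)-\overline{\Delta\nu}_{\rm{GDP}}\right]^2}}~,
\]
where the overlines denote means over $W_t$. Since all points of the
window enter $C(t)$ with equal weight, a change of sign typically
has to persist over a sizeable fraction of the window before $C(t)$
itself changes sign.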
589: 
590: \begin{figure}[htb]
591: \centering
592: \includegraphics[width=9cm]{FigTOP_InfGDP_y1950_Convention.eps}
593: \caption{Determination of the sign of the correlation between the
594: variations of the volatilities $\nu_{\rm{CPI}}(t)$ and
$\nu_{\rm{GDP}}(t)$ as a function of time in a running window of
596: three years. Our new method $S(t)$ (triangles with three values of
597: the smoothing parameter $N_L$) is compared with the
598: cross-correlation $C(t)$ in a running window of size equal to three
599: years (squares).} \label{Fig:TOP:InfGDP:Convention}
600: \end{figure}
601: 
The reconstructed sign of the correlations between variations of the
volatilities $\nu_{\rm{CPI}}(t)$ and $\nu_{\rm{GDP}}(t)$ is in good
agreement with, and actually makes more precise, the visual impression
mentioned above. In particular, one can observe that the transition
from synchronous to anti-phased variations was gradual, with possible
ups and downs before the anti-correlation set in during 1956. In
contrast, the cross-correlation method suffers from a serious lack of
reactivity, detecting the change of correlation sign two years or so
after it actually happened. We can thus conclude that our new
measure significantly outperforms the traditional cross-correlation
measure for real-time identification of switching of correlation
structures.
614: 
615: 
616: 
617: 
618: 
619: \section{Concluding remarks}
620: \label{s1:concl}
621: 
We have extended the thermal optimal path method
\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe} in order not
only to identify the time-varying lead-lag structure between two time
series but also to measure the sign of their cross-correlation. In
so doing, the identification of the lead-lag structure is improved
when the sign of the correlation can shift. To this end, the main
modification of the method previously introduced in
Refs.~\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe} consists in
generalizing the distance matrix in such a way that both correlated
and anti-correlated time series can be matched optimally.
633: 
634: A synthetic numerical example has been presented to verify the
635: validity of the new method. Extensive numerical simulations have
636: determined the existence of an optimal range $T\sim(0.1,1)$ of
637: temperatures to use for the robust thermal averaging. We have also
638: proposed a new measure, the sign signal function $S(t)$, that allows
639: us to identify the sign of the correlation structure between two
640: time series.
641: 
We have applied our new method to the investigation of possible
shifts between synchronous and anti-phased variations of the
historical volatility of the USA inflation rate and economic growth
rate. The two variables are found to be positively correlated and in a
synchronous state in the 1950s, except over the time period from the
last quarter of 1954 until around 1958, when they were in an
asynchronous phase (approximately anti-phased). While the
traditional cross-correlation function fails to capture this
behavior, our new TOP method provides a precise quantification of
these regime shifts.
652: 
653: The emphasis of this paper has been methodological. Extensions will
654: investigate the economic meaning of the change of correlation
655: structures as shown here. One possible candidate is the concept of
656: shifts of convention, as discussed in the introduction. More work on
657: many more examples is needed to ascertain the generality of these
658: effects. Overall, the development of better and more precise
659: quantitative tools is progressively unraveling a picture according
to which variability and changes of correlation structures are the
rule rather than the exception in macroeconomics and in financial economics,
662: in the spirit of Aoki and Yoshikawa \cite{Aoki-Yoshikawa-2006}.
663: 
664: 
665: \bigskip
666: {\textbf{Acknowledgments:}}
667: 
668: We are grateful to M. Wyart for helpful discussions. This work was
669: partially supported by the National Natural Science Foundation of
670: China (Grant No. 70501011), the Fok Ying Tong Education Foundation
671: (Grant No. 101086), and the Alfred Kastler Foundation.
672: 
673: 
674: %\bibliography{Bibliography}
675: \bibliography{E:/papers/Bibliography}
676: 
677: 
678: \end{document}
679: