1: \documentclass{elsart}

2: \usepackage{epsfig}

3: \usepackage{amsmath}

4: \usepackage{color}

5: \usepackage{amssymb}

6: \usepackage{graphicx}

7: \usepackage{subfigure}

8: %\usepackage{hyperref}

9:

10: \newcommand \be{\begin{equation}}

11: \newcommand \ba{\begin{eqnarray}}

12: \newcommand \ee{\end{equation}}

13: \newcommand \ea{\end{eqnarray}}

14: \bibliographystyle{elsart-num}

15: %\bibliographystyle{elsart-harv}

16:

17: \begin{document}

18: \runauthor{Zhou and Sornette} \markboth{A}{B}

19: \begin{frontmatter}

20: \title{Lead-lag cross-sectional structure and

21: detection of correlated-anticorrelated regime shifts: application to the

22: volatilities of inflation and economic growth rates}

23: \author[ecust,nice]{\small{Wei-Xing Zhou}},

24: \author[nice,ETH]{\small{Didier Sornette}\thanksref{EM}}

25: \address[ecust]{School of Business and Research Center of Systems

26: Engineering, East China University of Science and Technology,

27: Shanghai 200237, China}

28: \address[ETH]{Department of Management, Technology

29: and Economics, ETH Zurich\\ CH-8032 Zurich, Switzerland}

30: \address[nice]{Laboratoire de Physique de la Mati\`ere Condens\'ee,

31: CNRS UMR 6622 and Universit\'e de Nice-Sophia Antipolis, 06108 Nice

32: Cedex 2, France}

33: \thanks[EM]{Corresponding author. {\it E-mail address:}\/

34: sornette@ethz.ch (D. Sornette)\\

35: http://www.er.ethz.ch/}

36:

37: \begin{abstract}

38: We have recently introduced the ``thermal optimal path'' (TOP)

39: method to investigate the real-time lead-lag structure between two

40: time series. The TOP method consists in searching for a robust

41: noise-averaged optimal path of the distance matrix along which the

42: two time series have the greatest similarity. Here, we generalize

43: the TOP method by introducing a more general definition of distance

44: which takes into account possible regime shifts between positive and

45: negative correlations. This generalization to track possible changes

46: of correlation signs is able to identify possible transitions from

47: one convention (or consensus) to another. Numerical simulations on

48: synthetic time series verify that the new TOP method performs as

49: expected even in the presence of substantial noise. We then apply it

50: to investigate changes of convention in the dependence structure

51: between the historical volatilities of the USA inflation rate and

52: economic growth rate. Several measures show that the new TOP method

53: significantly outperforms standard cross-correlation methods.

54: \end{abstract}

55: %{\it{JEL classification:}} C14; E31; E58; G10

56:

57: \begin{keyword}

58: Thermal optimal path; time series; inflation; GDP growth; convention

59: \end{keyword}

60:

61: \end{frontmatter}

62:

63: \typeout{SET RUN AUTHOR to \@runauthor}

64: %

65: %\newpage            %

66: %\tableofcontents    %

67: %\newpage            %

68:

69:

70: \section{Introduction}

71: \label{s1:intro}

72:

73: The study of the lead-lag structure between two time series $X(t)$

74: and $Y(t)$ has a long history, especially in economics, econometrics

75: and finance, as it is often asked which economic variable might

76: influence other economic phenomena. A simple measure is the lagged

77: cross-correlation function $C_{X,Y}(\tau)=\langle X(t) Y(t+\tau)

78: \rangle / \sqrt{{\rm Var}[X] {\rm Var}[Y]}$, where the brackets

79: $\langle x \rangle$ denote the statistical expectation of the

80: random variable $x$ and ${\rm Var}[x]$ is the variance of $x$. The

81: observation of a maximum of $C_{X,Y}(\tau)$ at some non-zero

82: positive time lag $\tau$ implies that the knowledge of $X$ at time

83: $t$ gives some information on the future realization of $Y$ at the

84: later time $t+\tau$. However, such correlations do not necessarily

85: imply causality in a strict sense, as a correlation may be

86: mediated by a common source influencing the two time series at

87: different times. The concept of Granger causality bypasses this

88: problem by taking a pragmatic approach based on predictability: if

89: the knowledge of $X(t)$ and of its past values improves the

90: prediction of $Y(t+\tau)$ for some $\tau>0$, then it is said that

91: $X$ Granger causes $Y$ (see, e.g.,

92: \cite{Granger-1980-JEDC,Ashley-Granger-Schmalensee-1980-Em,Engle-White-1999}).

93: Such a definition does not address the fundamental philosophical and

94: epistemological question of the real causality links between $X$ and

95: $Y$ but has been found useful in practice. Our approach is similar

96: in that it does not address the question of the existence of a

97: genuine causality but attempts to detect a dependence structure

98: between two time series at non-zero (possibly varying) lags. We thus

99: use the term ``causality'' in a loose sense embodying the notion of

100: a dependence between two time series with a non-zero lag time.

101:

102: Many alternative methods have been developed in the physical

103: community. Quiroga et al. proposed a simple and fast method to

104: measure synchronicity and time delay patterns between two time

105: series based on event synchronization

106: \cite{Quiroga-Kreuz-Grassberger-2002-PRE}. Furthermore, as a

107: generalization of the concept of recurrence plot to analyze complex

108: chaotic time series \cite{Eckmann-Kamphorst-Ruelle-1987-EPL}, Marwan

109: et al. developed the cross-recurrence plot, based on a distance matrix, to

110: unravel nonlinear mapping of times between two systems

111: \cite{Marwan-Kurths-2002-PLA,Marwan-Thiel-Nowaczyk-2002-NPG}. In

112: Ref.~\cite{Sornette-Zhou-2005-QF}, we have introduced a novel

113: non-parametric method to test for the dynamical time evolution of

114: the lag-lead structure between two arbitrary time series based on a

115: thermal averaging of optimal paths embedded in the distance matrix

116: previously introduced in cross-recurrence plots. This method ignores

117: the thresholds used previously in constructing cross-recurrence plots

118: \cite{Marwan-Kurths-2002-PLA,Marwan-Thiel-Nowaczyk-2002-NPG} and

119: focuses on the distance matrix. The idea consists in constructing a

120: distance matrix based on the matching of all sample data pairs

121: obtained from the two time series under study. The lag-lead

122: structure is searched for as the optimal path in the distance matrix

123: landscape that minimizes the total mismatch between the two time

124: series, and that obeys a one-to-one causal matching condition. To

125: make the solution robust with respect to the presence of noise that

126: may lead to spurious structures in the distance matrix landscape,

127: Sornette and Zhou generalized this search for a single absolute

128: optimal path by introducing a fuzzy search consisting in sampling

129: over all possible paths, each path being weighted according to a

130: multinomial logit or equivalently Boltzmann factor proportional to

131: the exponential of minus the global mismatch of this path

132: \cite{Sornette-Zhou-2005-QF}. The method is referred to in the

133: sequel as the thermal optimal path (TOP). Zhou and Sornette

134: investigated further the TOP method by considering different

135: topologies of feasible paths and found that the two-layer scheme

136: gives the best performance \cite{Zhou-Sornette-2006-JMe}.

137:

138: Here, we generalize the TOP method by introducing a definition of

139: distance which takes into account possible regime shifts between

140: positive and negative correlations. This extension allows us to

141: detect possible changes in the sign of the correlation between the

142: two time series. This is in part motivated by the problem of

143: identifying changes of conventions in economic and financial time

144: series. Keynes \cite{Keynes-1936} and Orl\'ean

145: \cite{Orlean-1986-Ec,Orlean-1987-Ca,Orlean-1989-RE,Orlean-1992-JEE,Orlean-2004,Boyer-Orlean-2004,Orlean-2004-Ra}

146: developed the concept of convention, according to which a pattern

147: can emerge from the self-fulfilling belief of agents acting on the

148: belief itself. Conventions are subject to shifts: in a recent study,

149: Wyart and Bouchaud claimed that the correlation between bond markets

150: and stock markets was positive in the past (because low long term

151: interest rates should favor stocks), but has recently quite suddenly

152: become negative as a new ``Flight To Quality'' convention has set

153: in: selling risky stocks and buying safe bonds has recently been the

154: dominant pattern \cite{Wyart-Bouchaud-2006-JEBO}. Similarly, Liu and

155: Liu analyzed the nexus between the historical volatility of the

156: output and of the inflation rate, using Chinese data from 1992 to

157: 2004 \cite{Liu-Liu-2005-ERJ}. They found that there is a strong

158: correlation between the two volatilities and, what is more

159: interesting, that the rolling correlation coefficient changes sign.

160: Such a change of sign of the correlation may be attributed either to

161: a shift in convention and/or to changing macroeconomic variables,

162: the two being possibly entangled. Our method does not address the

163: source of the change of the sign of the correlation but provides

164: nevertheless a preliminary tool for detecting such changes of

165: correlations in a time-adaptive lead-lag framework.

166:

167: The paper is organized as follows. In Section \ref{s1:Top}, we

168: present a brief description of our generalized TOP method.  We

169: recall that an advantage of the  TOP method is that it does not

170: require any {\it{a priori}} knowledge of the underlying dynamics.

171: The new TOP method is illustrated with the help of synthetic

172: numerical simulations in Section \ref{s1:NumSim}. Section

173: \ref{s1:Appl} presents the application of the method to the

174: investigation of a possible change of dependence between the

175: historical volatility of the USA inflation rate and the economic

176: growth rate. Section \ref{s1:concl} concludes.

177:

178:

179: \section{Thermal optimal path method \label{s1:Top}}

180:

181: In Refs.~\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe}, we have

182: presented the TOP method and several tests and applications. In this

183: section, to be self-contained, we briefly recall its main

184: characteristics in the context of the new proposed distance.

185:

186: Consider two standardized time series $\{X(t_1):t_1=0,...,N\}$ and

187: $\{Y(t_2):t_2=0,...,N\}$. The elements of the distance matrix

188: $E_{X,Y}$ between $X$ and $Y$ used in

189: Refs.~\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe} are

190: defined as

191: \begin{equation}

192: \epsilon_-(t_1,t_2) = [X(t_1)-Y(t_2)]^2~. \label{Eq:DM:minus}

193: \end{equation}

194: The value $[X(t_1)-Y(t_2)]^2$ defines the distance between the

195: realizations of the first time series at time $t_1$ and the second

196: time series at time $t_2$.

197:

198: The distance matrix (\ref{Eq:DM:minus}) tracks the co-monotonic

199: relationship between $X$ and $Y$. But, two time series can be more

200: anti-monotonic than monotonic, i.e., they tend to take opposite

201: signs. Consider two limiting cases: (i) $Y(t)=X(t)$ and (ii)

202: $Y(t)=-X(t)$. Obviously, using the traditional distance

203: (\ref{Eq:DM:minus}) identifies case (i) as minimizing expression

204: (\ref{Eq:DM:minus}) for $t_1=t_2$ (actually the minimum is

205: identically zero in this special case). In contrast, in case (ii), notwithstanding

206: the fact that $Y(t)$ is perfectly (anti-)correlated with $X(t)$, the

207: naive idea of minimizing the distance (\ref{Eq:DM:minus}) between

208: the two time series becomes meaningless. In order to diagnose the

209: occurrence of anti-correlation, one needs to consider the

210: ``anti-monotonic'' distance

211: \begin{equation}

212: \epsilon_{+}(t_1,t_2) = [X(t_1)+Y(t_2)]^2~. \label{Eq:DM:plus}

213: \end{equation}

214: The $+$ sign ensures a correct search of synchronization between two

215: anti-correlated time series. More generally, $X$ and $Y$ may exhibit

216: more complicated lead-lag correlation relationships, positive

217: correlation over some time intervals and negative correlation at

218: other times (as in the change of conventions mentioned in the

219: introduction). In order to address all possible situations, we

220: propose to use the mixed distance expressed as follows:

221: \begin{equation}

222: \epsilon_{\pm}(t_1,t_2) =

223: \min[\epsilon_{-}(t_1,t_2),\epsilon_{+}(t_1,t_2)]~. \label{Eq:DM:pm}

224: \end{equation}
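As a quick check of this definition (a remark added here for illustration,
assuming only that $X$ and $Y$ are real-valued), expanding the two squares in
(\ref{Eq:DM:minus}) and (\ref{Eq:DM:plus}) shows that the mixed distance
compares the magnitudes of the two series:
\[
\epsilon_{\pm}(t_1,t_2) = X(t_1)^2+Y(t_2)^2-2\left|X(t_1)Y(t_2)\right|
= \left[\,|X(t_1)|-|Y(t_2)|\,\right]^2~.
\]
In particular, $\epsilon_{\pm}(t,t)=0$ in both limiting cases (i) $Y(t)=X(t)$
and (ii) $Y(t)=-X(t)$ considered above.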

225:

226: Fig.~\ref{Fig:TOP:TMM} is a schematic representation of how lead-lag

227: paths are defined. The first (resp. second) time series is indexed

228: by the time $t_1$ (resp. $t_2$). The nodes of the plane carry the

229: values of the distance (\ref{Eq:DM:pm}) for each pair $(t_1,t_2)$.

230: The path along the diagonal corresponds to taking $t_1=t_2$, i.e.,

231: compares the two time series at the same time. Paths below (resp.

232: above) the diagonal correspond to the second time series lagging

233: behind (resp. leading) the first time series. The figure shows three

234: arrows which define the three causal steps (time flows from the past

235: to the future both for $t_1$ and $t_2$) allowed in our construction

236: of the lead-lag paths. A given path selects a contiguous set of

237: nodes from the lower left to the upper right. The relevance or

238: quality of a given path with respect to the detection of the

239: lead-lag relationship between the two time series is quantified by

240: the sum of the distances (\ref{Eq:DM:pm}) along its length.

241:

242: As shown in the figure, it is convenient to use the rotated

243: coordinate system $(x,t)$ such that

244: \begin{equation}

245: \left\{

246:    \begin{array}{ccl}

247:     t_1 &=& 1+\left(t-x\right)/2~ \\

248:     t_2 &=& 1+\left(t+x\right)/2~

249:     \end{array}

250: \right., \label{Eq:AxesTransform2}

251: \end{equation}

252: where $t$ is in the main diagonal direction of the $(t_1,t_2)$

253: system and $x$ is perpendicular to $t$. The origin $(x=0,t=0)$

254: corresponds to $(t_1=1,t_2=1)$. Then, the standard reference path is

255: the diagonal of equation $x=0$, and paths which have $x(t) \neq 0$

256: define varying lead-lag patterns. The idea of the TOP method is to

257: identify the lead-lag relationship between two time series as the

258: best path in a certain sense. One could first infer that the best

259: path is the one which has the minimum sum of its distances

260: (\ref{Eq:DM:pm}) along its length (paths are constructed with equal

261: lengths so as to be comparable). The problem with this idea is that

262: the noises decorating the two time series introduce spurious

263: patterns which may control the determination of the path which

264: minimizes the sum of distances, leading to incorrect inferred

265: lead-lag relationships. In

266: Refs.~\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe}, we have

267: shown that a robust lead-lag path is obtained by defining an average

268: over many paths, each weighted according to a Boltzmann-Gibbs

269: factor, hence the name ``thermal'' optimal path method.
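
For reference, the transformation (\ref{Eq:AxesTransform2}) can be inverted
directly (a one-line check, not an additional assumption of the method):
\[
x = t_2 - t_1~, \qquad t = t_1 + t_2 - 2~,
\]
so that $x$ is simply the time offset between the matched points of the two
series, and $(t_1=1,t_2=1)$ indeed maps to $(x=0,t=0)$.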

270:

271: \begin{figure}[htb]

272: \centering

273: \includegraphics[width=9cm]{FigTOP_TMM.eps}

274: \caption{(Color online) Representation of the two-layer approach in

275: the lattice $(t_1,t_2)$ and of the rotated frame $(t,x)$ as defined

276: in the text. The three arrows depict the three moves that are

277: allowed to reach any node in one step. } \label{Fig:TOP:TMM}

278: \end{figure}

279:

280: Concretely, we first calculate the partition functions $G(x,t)$ and

281: their sum $G(t)=\sum_x G(x,t)$ so that $G(x,t)/G(t)$ can be interpreted as the

282: probability for a path to be at distance $x$ from the diagonal for a

283: distance $t$ along the diagonal. This probability $G(x,t)/G(t)$ is determined as

284: a compromise between minimizing the mismatch (similar to an ``energy'') and maximizing

285: the combinatorial weight of the number of paths with similar mismatches in

286: a neighborhood (similar to an ``entropy''). As illustrated in Figure

287: \ref{Fig:TOP:TMM}, in order to arrive at $(t_1+1, t_2+1)$, a path

288: can come from $(t_1+1, t_2)$ vertically, $(t_1, t_2+1)$

289: horizontally, or $(t_1, t_2)$ diagonally. The recursive equation on

290: $G(x,t)$ is therefore

291: \begin{equation}\label{Eq:RecurG:xt}

292:       G(x,t+1) = [G(x-1,t)+ G(x+1,t)+G(x,t-1)]e^{-\epsilon_{\pm}(x,t)/T}~,

293: \end{equation}

294: where $\epsilon_{\pm}(x,t)$ is defined by (\ref{Eq:DM:pm}). This

295: recursion relation uses the same principle and is derived

296: following the work of Wang et al.

297: \cite{Wang-Havlin-Schwartz-2000-JPCB}. To compute $G(x,t)$ at the $t$-th

298: layer, we need to know and keep track of the previous two layers,

299: $G(\cdot,t-2)$ and $G(\cdot,t-1)$. After $G(\cdot,t)$ is determined,

300: the $G$'s at the two layers are normalized by $G(t)$ so that

301: $G(x,t)$ does not diverge at large $t$. We stress that the boundary

302: condition of $G(x,t)$ plays a crucial role. For $t=0$ and $t=1$,

303: $G(x,t) = 1$. For $t>1$, the boundary condition is taken to be

304: $G(x=\pm t,t) = 0$, in order to prevent paths from remaining on the

305: boundaries.

306:

307: Once the partition functions $G(x,t)$ have been calculated, we can

308: obtain any statistical average related to the positions of the paths

309: weighted by the set of $G(x,t)$. For instance, the local time lag

310: $\langle{x(t)}\rangle$ at time $t$ is given by

311: \begin{equation}

312:     \langle{x(t)}\rangle = \sum_x {xG(x,t)/G(t)}~.

313:     \label{Eq:Xave}

314: \end{equation}

315: Expression (\ref{Eq:Xave}) defines $\langle{x(t)}\rangle$ as the

316: thermal average of the local time lag at $t$ over all possible

317: lead-lag configurations suitably weighted according to the

318: exponential of minus the measure $\epsilon_{\pm}(x,t)$ of the

319: similarities of two time series. For a given $x_0$ and temperature

320: $T$, we determine the thermal optimal path $\langle{x(t)}\rangle$.

321: We can also associate an ``energy'' $e_T(x_0)$ with this path, defined as

322: the thermal average of the measure $\epsilon_{\pm}(x,t)$ of the

323: similarities of two time series:

324: \begin{equation}\label{Eq:e}

325:         e_T(x_0) = \frac{1}{2(N-|x_0|)-1}\sum_{t=|x_0|}^{2N-1-|x_0|}

326:         \sum_x {\epsilon_{\pm}(x,t)G(x,t)/G(t)}~.

327: \end{equation}

328: Obviously, the same set of calculations can be performed with

329: $\epsilon_-$ given by (\ref{Eq:DM:minus}) or with $\epsilon_{+}$

330: given by (\ref{Eq:DM:plus}). The former case has been investigated

331: in Refs.~\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe}.

332:

333:

334: \section{Numerical experiments of the TOP approach on synthetic examples}

335: \label{s1:NumSim}

336:

337: We now present synthetic tests of the efficiency of the thermal

338: optimal path method to detect multiple changes of regime.

339: Consider the following model

340: \begin{equation}

341: Y(t)=\left\{

342: \begin{array}{lr}

343:      +X(t-10) + \eta,  & ~~~~1\le t \le 100\\

344:      -X(t-~5) + \eta,  & ~~101\le t \le 200\\

345:      +X(t+~5) + \eta,  & ~~201\le t \le 300\\

346: \end{array}

347: \right.~, \label{Eq:Jump}

348: \end{equation}

349: where $\eta$ is a Gaussian white noise with variance $\sigma_\eta^2$

350: and zero mean. By construction, the time series $Y$ is lagging

351: behind $X$ with $\tau = 10$ in the first $100$ time steps, $Y$ is

352: still lagging behind $X$ with a reduced lag $\tau = 5$ in the next

353: $100$ time steps, and finally $Y$ leads $X$ with a lead time

354: $\tau=-5$ in the last $100$ time steps. In addition, $Y$ becomes

355: negatively correlated with $X$ in the middle interval, while it is positively

356: correlated with $X$ in the first and third intervals.

357: The time series $X$ is

358: assumed to be the first-order auto-regressive process

359: \begin{equation}\label{Eq:TOP:AR}

360:     X(t) = 0.7X(t-1) + \xi~

361: \end{equation}

362: where $\xi$ is an i.i.d. white noise with zero mean and variance

363: $\sigma_\xi^2$. Our results are essentially the same when $X$ is

364: itself a white noise process. The two time series are standardized

365: before the construction of the distance matrix. Therefore, there is

366: only one parameter $f\triangleq\sigma_\xi/\sigma_\eta$

367: characterizing the signal-over-noise ratio of the lead-lag

368: relationship between $X$ and $Y$. We use $f=1/5$ in the simulations

369: presented below, corresponding to a weak signal-to-noise ratio.

370:

371: Figure \ref{Fig:TOP:Jump:cmp} compares the reconstructed lead-lag

372: path $x(t)$ when using $\epsilon_-$ defined by (\ref{Eq:DM:minus}),

373: or $\epsilon_+$ defined by (\ref{Eq:DM:plus}), or $\epsilon_\pm$

374: defined by (\ref{Eq:DM:pm}). If the method worked perfectly, the

375: lead-lag path $x(t)$ would be equal to $x(t)=+10$ for $1\leqslant t

376: \leqslant 100$, $x(t)=+5$ for $101\leqslant t \leqslant 200$ and

377: $x(t)=-5$ for $201\leqslant t \leqslant 300$. One can observe that

378: the new proposed distance $\epsilon_\pm$ recovers the correct

379: solution up to moderate fluctuations. Unsurprisingly, the lead-lag

380: path reconstruction using $\epsilon_-$ gives the correct solution in

381: the first and third time intervals for which the correlation is

382: positive but is totally wrong with large fluctuations in the middle

383: time interval in which the correlation is negative. Symmetrically,

384: the lead-lag path reconstruction using $\epsilon_+$ gives the

385: correct solution in the middle interval where the correlation is

386: negative and is completely wrong with large fluctuations in the two

387: other intervals. Actually, we verify (not shown) that $\epsilon_\pm$

388: reduces mostly to $\epsilon_-$ in the first and third intervals and

389: to $\epsilon_+$ in the middle interval, as it should.

390:

391: \begin{figure}[htb]

392: \centering

393: \includegraphics[width=9cm]{FigTOP_Jump_cmp.eps}

394: \caption{(Color online) Comparison of the three lead-lag thermal

395: optimal paths using the three distances $\epsilon_-$,

396: $\epsilon_+$, and $\epsilon_\pm$. The temperature is $T=0.1$.}

397: \label{Fig:TOP:Jump:cmp}

398: \end{figure}

399:

400:

401: Figure \ref{Fig:TOP:Jump:xt} tests the robustness of the

402: reconstructed lead-lag path using the distance $\epsilon_\pm$ with

403: respect to different choices of the temperature:  $T=1$, $0.2$,

404: $0.1$, and $0.01$. Recall that a vanishing temperature corresponds

405: to selecting the lead-lag path which has the minimum total sum of

406: distances along its length. At the opposite extreme, a very large

407: temperature washes out the information contained in the

408: distance matrix and treats all paths on the same footing. In between,

409: a finite temperature allows us to average the contribution over

410: neighboring paths with similar energies, making the estimated

411: lead-lag path more robust to noise-like structures in the distance

412: matrix due to noises decorating the two time series. It is apparent

413: that too small a temperature $T=0.01$ leads to spurious large spiky

414: fluctuations around the correct solution. Too large a temperature

415: $T=1$ selects a thermally-averaged path which deviates from the

416: correct solution, here mostly at the beginning of the time series.

417: It seems that there is an optimal range of temperatures around

418: $T=0.1-0.2$ for which the correct solution is retrieved with minimal

419: fluctuations around it. The existence of an optimal range of

420: temperature is confirmed in the inset of Figure

421: \ref{Fig:TOP:Jump:xt}, which shows the root-mean-square (rms)

422: deviations between the reconstructed lead-lag path and the exact

423: solution ($x(t)=+10$ for $1\le t \le 100$, $x(t)=+5$ for $101\le t

424: \le 200$ and $x(t)=-5$ for $201\le t \le 300$) as a function of

425: temperature in the range $0.01 \leq T \leq 10$. The existence of a

426: well-defined optimal range of temperatures is strongest for smaller

427: signal-to-noise ratios $f\triangleq\sigma_\xi/\sigma_\eta$. For

428: large $f$ (weak noise), we observe that smaller temperatures are

429: better, as expected.

430:

431: \begin{figure}[htb]

432: \centering

433: \includegraphics[width=9cm]{FigTOP_Jump_xt.eps}

434: \caption{(Color online) Thermally-averaged lead-lag paths of the

435: model (\ref{Eq:Jump}) for four different temperatures. Inset:

436: root-mean-square (rms) deviations between the reconstructed lead-lag

437: path and the exact solution ($x(t)=+10$ for $1\le t \le 100$,

438: $x(t)=+5$ for $101\le t \le 200$ and $x(t)=-5$ for $201\le t \le

439: 300$) as a function of temperature in the range $0.01 \leq T \leq

440: 10$.} \label{Fig:TOP:Jump:xt}

441: \end{figure}

442:

443: The whole purpose of the new distance $\epsilon_\pm$ is to be able

444: to identify not only the lead-lag structure better, but also the

445: existence of possible negative correlations as well as changes of

446: the sign of the correlation with time. We identify the sign

447: $s(t,x(t)) = s(t_1,t_2)$ of the cross-correlation of the two time

448: series at the times $t_1,t_2$ from the value of $\epsilon_\pm$: when

449: $\epsilon_\pm$ reduces to $\epsilon_-$ (resp. $\epsilon_+$), we

450: conclude that the correlation is positive (resp. negative). The

451: corresponding algorithm for the sign of the cross-correlations is

452: thus

453: \begin{equation}\label{Eq:Sign}

454:     s(t) = s(t_1,t_2) = \left\{

455:     \begin{array}{cc}

456:       +1 & ~~{\rm{if}}~~ \epsilon_\pm=\epsilon_- \\

457:       -1 & ~~{\rm{if}}~~ \epsilon_\pm=\epsilon_+

458:     \end{array}

459:     \right.

460: \end{equation}

461: Due to the noises on the two time series, $s(t)$ is also noisy. Thus,

462: to obtain meaningful information on the sign of the

463: cross-correlations, we apply a smoothing algorithm to $s(t)$. For

464: this, we use the Savitzky-Golay filter with a linear function and

465: include 21 points to the left of each time (to ensure causality).

466: The filtered signal $S(t)$ is shown in

467: Fig.~\ref{Fig:TOP:Jump:Signal}. The results are quite consistent

468: with the model in which the correlation is negative in the middle

469: period $100<t<200$ and positive otherwise.
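
The rule (\ref{Eq:Sign}) can be made explicit by a short calculation (given
here as an illustrative remark, assuming only the definitions
(\ref{Eq:DM:minus}), (\ref{Eq:DM:plus}) and (\ref{Eq:DM:pm})). Since
$\epsilon_+ - \epsilon_- = 4X(t_1)Y(t_2)$, the minimum in (\ref{Eq:DM:pm}) is
$\epsilon_-$ when the product $X(t_1)Y(t_2)$ is positive and $\epsilon_+$ when
it is negative, so that, whenever this product is non-zero,
\[
s(t_1,t_2) = {\rm sign}\left[X(t_1)Y(t_2)\right]~.
\]
In other words, the raw signal $s(t)$ records whether the two standardized
series lie on the same side of their (zero) mean at the matched times, which
is why smoothing is needed before it can be read as a correlation sign.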

470:

471: \begin{figure}[htb]

472: \centering

473: \includegraphics[width=9cm]{FigTOP_Jump_Signal.eps}

474: \caption{Reconstruction of the sign of the cross-correlation of the

475: model (\ref{Eq:Jump},\ref{Eq:TOP:AR}) by the smoothed sign

476: recognition given by expression (\ref{Eq:Sign}).}

477: \label{Fig:TOP:Jump:Signal}

478: \end{figure}

479:

480:

481: \section{Historical volatilities of inflation rate and economic output rate}

482: \label{s1:Appl}

483:

484: In this section, we apply our novel technique to the relationship

485: between inflation and real economic output quantified by GDP in the

486: hope of providing new insights. This problem has attracted

487: tremendous interest in past decades in the macroeconomic

488: literature. Different theories have suggested that the impact of

489: inflation on real economic activity could be either neutral,

490: negative, or positive. Based on the story of Mundell that higher

491: inflation would lower real output \cite{Mundell-1963-JPE}, Tobin

492: argued that higher inflation causes a shift from money to capital

493: investment and raises output per capita \cite{Tobin-1965-Em}, known

494: as the Mundell-Tobin effect. On the contrary, Fischer suggested a

495: negative effect, stating that higher inflation resulted in a shift

496: from money to other assets and reduced the efficiency of

497: transactions in the economy due to higher search costs and lower

498: productivity \cite{Fischer-1974-EI}. In the middle ground, Sidrauski

499: proposed a neutral effect where exogenous time preference fixed the

500: long-run real interest rate and capital intensity

501: \cite{Sidrauski-1967-AER}. These arguments are based on the rather

502: restrictive assumption that the Phillips curve (inverse relationship

503: between inflation and unemployment), taken in addition to be linear,

504: is valid. To evaluate which model characterizes better real economic

505: systems, numerous empirical efforts have been performed and the

506: question is still open.

507:

508: On the other hand, much focus is put on the nexus between inflation

509: and its uncertainty and economic activity. Okun made the hypothesis

510: of a positive correlation between inflation and inflation

511: uncertainty \cite{Okun-1971-BPEA}. Furthermore, Friedman argued that

512: an increase in the uncertainty of future inflation reduces the

513: economic efficiency and lowers the real output rate

514: \cite{Friedman-1977-JPE}, which is verified empirically (see, e.g.,

515: \cite{Davis-Kanago-1996-OEP,Davis-Kanago-1998-JMCB,AlMarhubi-1998-AE,Grier-Perry-2000-JAEm,Hayford-2000-JMe,Fountas-Karanasos-Kim-2006-OBES}).

516: Following the seminal work of Taylor \cite{Taylor-1979-Em}, the

517: output-inflation variability trade-off has been tested extensively

518: in the literature, such as in

519: \cite{Defina-Stark-Taylor-1996-JMe,Fuhrer-1997-JMCB,Cobham-Macmillan-Mcmillan-2004-AEL,Lee-2002-SEJ,Lee-2004-CEP},

520: which are based on model specification. Liu and Liu analyzed the

521: relation between the historical volatility of the output and of the

522: inflation rate, using Chinese data from 1992 to 2004

523: \cite{Liu-Liu-2005-ERJ}. They found that there is a strong

524: correlation between the two volatilities and, what is more

525: interesting, that the rolling correlation coefficient changes its

526: sign. In the following, we investigate the nexus between the

527: historical volatilities of inflation and output in a model-free

528: manner to test for possible changes of the signs of their

529: cross-correlation structure.

530:

531: The data sets, which were retrieved from the FRED II database,

532: include monthly consumer price index (CPI) for all urban consumers

533: and seasonally adjusted quarterly gross domestic product (GDP)

534: covering the time period from 1947 to 2005. The annualized

535: inflation rate $r_{\rm{CPI}}$ and economic growth rate

536: $r_{\rm{GDP}}$ were calculated on a quarterly basis from the CPI and

537: GDP respectively. The historical volatility is calculated in a

538: rolling window as

539: \begin{equation}\label{Eq:TOP:VIVG}

540:     \nu(t) = \left[\frac{1}{\Delta{t}}\sum_{s=t-\Delta{t}+1/4}^{t} \left[r(s)-\mu(t)\right]^2

541:     \right]^{1/2}~,

542: \end{equation}

543: where $r=r_{\rm{CPI}}$ for inflation rate and $r=r_{\rm{GDP}}$ for

544: growth rate, and $\mu(t)$ is their corresponding mean in the rolling

545: window $[t-\Delta{t}+1/4,t]$. The unit of $t$ and $\Delta{t}$ is one

546: year. The resulting historical volatility series $\nu_{\rm{CPI}}(t)$

547: and $\nu_{\rm{GDP}}(t)$ are shown in the upper panel of

548: Fig.~\ref{Fig:TOP:InfGDP:VIVG} for the time period $[1950,1960]$,

549: with $\Delta{t}=3$ years. Since the volatility $\nu(t)$ is

550: non-stationary (as shown by a standard unit-root test), we use the

551: first-difference of volatility $\Delta{\nu}(t)$, shown in the lower

552: panel of Fig.~\ref{Fig:TOP:InfGDP:VIVG}. We focus on the 10-year

553: time period $[1950,1960]$ only for a clearer visualization since the

554: analysis and results are the same qualitatively in other time

555: periods.
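
To fix ideas about the rolling window in (\ref{Eq:TOP:VIVG}), a simple count
may help (assuming the quarterly sampling stated above; the symbol $n$ is
notation introduced only for this remark). The window
$[t-\Delta{t}+1/4,t]$ contains the points
$t-\Delta{t}+1/4,\,t-\Delta{t}+1/2,\,\ldots,\,t$, that is
\[
n = \frac{t-\left(t-\Delta{t}+1/4\right)}{1/4}+1 = 4\Delta{t} = 12
\quad {\rm for}~\Delta{t}=3~{\rm years}~,
\]
so each volatility estimate is based on twelve quarterly observations.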

556:

557: \begin{figure}[htb]

558: \centering

559: \includegraphics[width=9cm]{FigTOP_InfGDP_y1950_VIVG.eps}

560: \caption{Upper panel: quarterly historical volatilities of the

561: annualized inflation rate and economic growth rate of the United

562: States of America; lower panel: their quarterly changes.}

563: \label{Fig:TOP:InfGDP:VIVG}

564: \end{figure}

565:

Visual inspection of the lower panel of
Fig.~\ref{Fig:TOP:InfGDP:VIVG} suggests that the variations of the
volatilities $\nu_{\rm{CPI}}(t)$ and $\nu_{\rm{GDP}}(t)$ are
approximately synchronous from 1951 to 1954 and then become
approximately anti-phased from 1955 to 1958. Can this be confirmed
or falsified by the technique proposed here? To address this
question, we compute the smoothed sign function $S(t)$
as explained at the end of the previous section. Our tests show that
the lead-lag path is close to the diagonal and that there is no
significant gain from allowing for a time-varying lag between
the variations of the volatilities $\nu_{\rm{CPI}}(t)$ and
$\nu_{\rm{GDP}}(t)$. We thus calculate $S(t)$ by smoothing the
signal $s(t)$ defined by (\ref{Eq:Sign}), with the distance matrix
constructed using definition (\ref{Eq:DM:pm}), along the diagonal of
the plane $(t_1,t_2)$ (in other words, for $x(t)=0$). We again use
the causal Savitzky-Golay filter with a quadratic polynomial and
$N_L$ data points to the left of each time step $t$ plus the point
at $t$ itself. As shown in Fig.~\ref{Fig:TOP:InfGDP:Convention}, we
find that the sign signal function $S(t)$ is quite robust with
respect to variations of the smoothing parameter $N_L$ in the range
$5 \leq N_L \leq 15$. For comparison, we also plot in
Fig.~\ref{Fig:TOP:InfGDP:Convention} the cross-correlation function
$C(t)$ in rolling windows of three years.
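
For reference, the rolling cross-correlation used as a benchmark can
be written out explicitly. The expression below is a sketch under the
assumption that $C(t)$ is the standard zero-lag Pearson coefficient of
the two first-differenced volatility series, evaluated over the window
$W_t=[t-\Delta{t}+1/4,t]$ with $\Delta{t}=3$ years. The window notation
$W_t$ and the overbars denoting means over $W_t$ are introduced here
for illustration only:
\begin{equation*}
    C(t) = \frac{\sum_{s\in W_t}
    \left[\Delta\nu_{\rm{CPI}}(s)-\overline{\Delta\nu}_{\rm{CPI}}(t)\right]
    \left[\Delta\nu_{\rm{GDP}}(s)-\overline{\Delta\nu}_{\rm{GDP}}(t)\right]}
    {\sqrt{\sum_{s\in W_t}
    \left[\Delta\nu_{\rm{CPI}}(s)-\overline{\Delta\nu}_{\rm{CPI}}(t)\right]^2}\,
    \sqrt{\sum_{s\in W_t}
    \left[\Delta\nu_{\rm{GDP}}(s)-\overline{\Delta\nu}_{\rm{GDP}}(t)\right]^2}}~.
\end{equation*}
Because every observation in $W_t$ carries the same weight in this
expression, a recent change of the correlation sign is diluted by the
older observations that remain in the window.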

589:

\begin{figure}[htb]
\centering
\includegraphics[width=9cm]{FigTOP_InfGDP_y1950_Convention.eps}
\caption{Determination of the sign of the correlation between the
variations of the volatilities $\nu_{\rm{CPI}}(t)$ and
$\nu_{\rm{GDP}}(t)$ as a function of time in a running window of
three years. Our new method $S(t)$ (triangles, for three values of
the smoothing parameter $N_L$) is compared with the
cross-correlation $C(t)$ in a running window of size equal to three
years (squares).}
\label{Fig:TOP:InfGDP:Convention}
\end{figure}

601:

The reconstructed sign of the correlations between variations of the
volatilities $\nu_{\rm{CPI}}(t)$ and $\nu_{\rm{GDP}}(t)$ is in good
agreement with, and actually makes more precise, the visual impression
mentioned above. In particular, one can observe that the transition
from synchronicity to anti-phase was gradual, with possible ups
and downs before the anti-correlation took hold in 1956. In contrast,
the cross-correlation method suffers from a serious lack of
reactivity, detecting a change of correlation sign only two years or so
after it actually happened. We can thus conclude that our new
measure significantly outperforms the traditional cross-correlation
measure for real-time identification of switches in correlation
structure.

614:

615:

616:

617:

618:

619: \section{Concluding remarks}

620: \label{s1:concl}

621:

We have extended the thermal optimal path method
\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe} in order not
only to identify the time-varying lead-lag structure between two time
series but also to measure the sign of their cross-correlation. In
so doing, the identification of the lead-lag structure is improved
when the sign of their correlation can
shift. To this end, the main modification of the method previously
introduced in
Refs.~\cite{Sornette-Zhou-2005-QF,Zhou-Sornette-2006-JMe} consists in
generalizing the distance matrix in such a way that both correlated
and anti-correlated time series can be matched optimally.

633:

634: A synthetic numerical example has been presented to verify the

635: validity of the new method. Extensive numerical simulations have

636: determined the existence of an optimal range $T\sim(0.1,1)$ of

637: temperatures to use for the robust thermal averaging. We have also

638: proposed a new measure, the sign signal function $S(t)$, that allows

639: us to identify the sign of the correlation structure between two

640: time series.

641:

We have applied our new method to the investigation of possible
shifts between synchronous and anti-phased variations of the
historical volatilities of the USA inflation rate and economic growth
rate. The two variables are found to be positively correlated and in a
synchronous state in the 1950s, except over the time period from the
last quarter of 1954 until around 1958, when they were in an
asynchronous phase (approximately anti-phased). While the
traditional cross-correlation function fails to capture this
behavior in a timely manner, our new TOP method provides a precise
quantification of these regime shifts.

652:

The emphasis of this paper has been methodological. Extensions will
investigate the economic meaning of the changes of correlation
structure shown here. One possible candidate is the concept of
shifts of convention, as discussed in the introduction. More work on
many more examples is needed to ascertain the generality of these
effects. Overall, the development of better and more precise
quantitative tools is progressively revealing a picture according
to which variability and changes of correlation structures are the
rule rather than the exception in macroeconomics and in financial economics,
in the spirit of Aoki and Yoshikawa \cite{Aoki-Yoshikawa-2006}.

663:

664:

665: \bigskip

666: {\textbf{Acknowledgments:}}

667:

668: We are grateful to M. Wyart for helpful discussions. This work was

669: partially supported by the National Natural Science Foundation of

670: China (Grant No. 70501011), the Fok Ying Tong Education Foundation

671: (Grant No. 101086), and the Alfred Kastler Foundation.

672:

673:

674: %\bibliography{Bibliography}

675: \bibliography{E:/papers/Bibliography}

676:

677:

678: \end{document}

679: