
1: % ****** Start of file apssamp.tex ******

2: %   This file is part of the APS files in the REVTeX 4 distribution.

3: %   Version 4.0 of REVTeX, August 2001

4: %   Copyright (c) 2001 The American Physical Society.

5: %   See the REVTeX 4 README file for restrictions and more information.

6: % TeX'ing this file requires that you have AMS-LaTeX 2.0 installed

7: % as well as the rest of the prerequisites for REVTeX 4.0

8: % See the REVTeX 4 README file

9: % It also requires running BibTeX. The commands are as follows:

10: %  1)  latex apssamp.tex

11: %  2)  bibtex apssamp

12: %  3)  latex apssamp.tex

13: %  4)  latex apssamp.tex

14: %\documentclass[preprint,showpacs,preprintnumbers,amsmath,amssymb]{revtex4}

15: % Some other (several out of many) possibilities

16: %\documentclass[preprint,aps]{revtex4}

17: %\documentclass[preprint,aps,draft]{revtex4}

18: %\documentclass[prb]{revtex4}% Physical Review B

19: % Include figure files

20: % Align table columns on decimal point

21: % bold math

22: %\nofiles

23:

24:

25: \documentclass[twocolumn,showpacs,preprintnumbers,a4paper,fleqn,pre]{revtex4}

26: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

27: \usepackage{amssymb}

28: \usepackage{amsfonts}

29: \usepackage{amsmath}

30: \usepackage{graphicx}

31: \usepackage{dcolumn}

32: \usepackage{bm}

33:

34: \setcounter{MaxMatrixCols}{10}

35: %TCIDATA{OutputFilter=LATEX.DLL}

36: %TCIDATA{Version=4.00.0.2312}

37: %TCIDATA{LastRevised=Thursday, January 06, 2005 15:44:58}

38: %TCIDATA{<META NAME="GraphicsSave" CONTENT="32">}

39: %TCIDATA{Language=American English}

40:

41: \input{tcilatex}

42:

43: \begin{document}

44:

45: \preprint{APS/123-QED}

46: \title{Surrogate Test to Distinguish between Chaotic and Pseudoperiodic Time

47: Series}

48: \author{Xiaodong Luo }

49: \email{enxdluo@eie.polyu.edu.hk}

50: \author{Tomomichi Nakamura}

51: \author{Michael Small}

52: \affiliation{Department of Electronic and Information Engineering, Hong Kong Polytechnic

53: University, Hung Hom, Hong Kong.}

54: \date{\today }

55:

56: \begin{abstract}

57: In this communication a new algorithm is proposed to produce surrogates for

58: pseudoperiodic time series. By imposing a few constraints on the noise

59: components of pseudoperiodic data sets, we devise an effective method to

60: generate surrogates. Unlike other algorithms, this method properly copes

61: with pseudoperiodic orbits contaminated with linear colored observational

62: noise. We will demonstrate the ability of this algorithm to distinguish

63: chaotic orbits from pseudoperiodic orbits through simulation data sets from

64: the

65: %TCIMACRO{\TeXButton{Rossler}{R\"{o}ssler} }%

66: %BeginExpansion

67: R\"{o}ssler

68: %EndExpansion

69: system. As an example of application of this algorithm, we will also employ

70: it to investigate a human electrocardiogram (ECG) record.

71: \end{abstract}

72:

73: \pacs{05.45.-a}

74: \maketitle

75:

76: \section{Introduction}

77:

78: Surrogate tests \cite{Theiler testing} are examples of Monte Carlo

79: hypothesis tests \cite{Galka topics}. Taking the surrogate test of

80: nonlinearity in a time series \cite{Theiler testing} as an example, we first

81: need to adopt a null hypothesis, which usually supposes the time series is

82: generated by a linear stochastic process and potentially filtered by a

83: nonlinear filter \cite{note nonlinearity}. Based on this null hypothesis, a

84: large number of data sets (surrogates) are to be produced from the original

85: time series; these keep the linear properties of the original time series but

86: destroy all other structure. We then calculate some nonlinear statistics

87: (discriminating statistics), for example, correlation dimension, of both the

88: original time series and the surrogates. If the discriminating statistic of

89: the original time series deviates from those of the surrogates, we can

90: reject the null hypothesis we proposed and claim that the original time

91: series is deterministic with a certain confidence level (depending on how many

92: surrogates we have generated, as shown later). In general, to apply the

93: surrogate technique to test whether a time series possesses a property $P$ of

94: interest, we first select a null hypothesis, which assumes the time

95: series instead has a property $Q$ opposite to $P$. We then devise a

96: corresponding algorithm to produce surrogates from the observed data set. In

97: principle, these surrogates shall preserve the potential property $Q$ while

98: destroying all others. The next step is to choose a suitable discriminating

99: statistic, which shall be an invariant measure for both the surrogates and

100: the original time series if the null hypothesis is true. Hence if the

101: discriminating statistic of the original time series distinctly deviates

102: from the distribution of the discriminating statistic of the surrogates, the

103: null hypothesis is unlikely to be true, or in other words, the time series

104: is much more likely to possess the property $P$ than $Q$. In this way, we

105: can assess the statistical significance of our calculations through

106: the surrogate test technique even when we have only a very limited number of

107: observations. Such assessments are important because in many practical

108: situations statistical fluctuations are inevitable due to the presence of

109: noise, hence the surrogate test is a proper tool to evaluate the reliability

110: of our results in a statistical sense.

111:

112: In this communication, we focus on an algorithm to

113: generate surrogates for pseudoperiodic time series. By pseudoperiodic time

114: series we mean a representative of a periodic orbit perturbed by dynamical

115: noise, or contaminated by observational noise, or affected by a combination of

116: both kinds of noise, whose states within one cycle are largely independent of

117: those within previous cycles given a cycle length. Note that, in our

118: discussions we will always assume we have detected that the time series are

119: produced from nonlinear deterministic systems, but they are also possibly

120: contaminated by some noises. As we know, if an irregular time series comes

121: from a nonlinear deterministic system, it shall be either chaotic or

122: pseudoperiodic in most cases. In some situations, it might be important for

123: us to discriminate between pseudoperiodicity and chaos. However, chaotic and

124: pseudoperiodic time series often look similar, so we might not be able to

125: distinguish them from each other through visual inspection alone;

126: quantitative techniques are needed instead. One choice is to

127: apply the direct test techniques. For instance, we can calculate some

128: characteristic statistics of the time series, such as the Lyapunov exponent

129: and the correlation dimension. However, a direct test usually does not provide

130: a confidence level. If we find the Lyapunov exponent of a time series

131: is, for example, $0.01$, it may be difficult for us to tell whether the time

132: series is chaotic or is instead pseudoperiodic, with the presence of

133: noise causing the Lyapunov exponent to be slightly larger than zero. As an

134: alternative, we suggest using the surrogate test rather than

135: the direct test, since it can provide a confidence level by calculating a

136: large number of surrogates. Through the surrogate tests, if we can exclude

137: the possibility that the time series is pseudoperiodic, then the time series

138: is more likely to be chaotic. This is the essential idea behind applying our

139: algorithm to distinguish chaos from pseudoperiodicity, as shown in

140: Sec. III.

141:

142: First let us briefly review some of the algorithms to generate surrogates

143: for pseudoperiodic time series. Initially, to generate surrogates for

144: pseudoperiodic time series, Theiler \cite{Theiler on} proposed the cycle

145: shuffling algorithm. The idea is to divide the whole data set into some

146: segments and let each segment contain exactly an integer number of cycles.

147: The surrogates are obtained by randomly shuffling these segments, which will

148: preserve the intracycle dynamics but destroy the intercycle ones by

149: randomizing the temporal sequence of the individual cycles. The difficulty

150: in applying this algorithm is that it requires prior knowledge of the precise

151: periodicity; otherwise, shuffling the individual cycles might lead to

152: spurious results \cite{Small surrogate}.

153:

154: Recently, with the development of the cyclic theory of chaos \cite{Ayerbach

155: exploring}, many authors have shown interest in searching for unstable periodic

156: orbits (UPOs) in noisy data sets from chaotic dynamical systems. The

157: algorithms proposed in \cite{Pierson detecting} essentially deal with the

158: unstable fixed points of the UPOs. But as observed, the presence of noise

159: will reduce the statistical significance of these algorithms. One remedy is

160: to introduce the surrogate test for reliability assessments, e.g., Dolan

161: \textit{et al.} \cite{Pierson detecting} claimed that the random shuffling

162: surrogate algorithm \cite{Theiler testing} together with the simple

163: recurrence method \cite{Pierson detecting} correctly tests the appropriate

164: null hypothesis. Essentially, this approach is very similar to the cycle

165: shuffling algorithm described previously. The simple recurrence algorithm is

166: equivalent to applying a Poincar\'{e} map on the continuous dynamical

167: systems and then studying only the data points falling on the cross-section

168: plane, hence one does not need to consider the intracycle dynamics and no

169: knowledge of the periodicity is required, while randomly shuffling these

170: data points exactly aims to randomize the temporal sequence of the cycles.

171: However, one potential problem of this algorithm is that it might generate

172: spuriously high statistical significance due to the correlation between the

173: cycles \cite{Petracchi the}.

174:

175: Later, Small \textit{et al.} \cite{Small surrogate} proposed the

176: pseudoperiodic surrogate (PPS) algorithm from another viewpoint. They first

177: apply the time delay embedding reconstruction \cite{Takens detecting} to the

178: original data set, then utilize a method based on local linear modelling

179: techniques to produce surrogate data which approximate the behavior of the

180: underlying dynamical system. As the authors pointed out, this algorithm

181: works well even with very large dynamical noise, but it may incorrectly

182: reject the null hypothesis if the cycles of the pseudoperiodic orbit

183: have a linear stochastic dependence on one another induced by colored additive

184: observational noise \cite{note noise}.

185:

186: In this communication we propose a new surrogate algorithm for continuous

187: dynamical systems, which properly copes with linear stochastic dependence

188: between the cycles of the pseudoperiodic orbits. The null hypothesis to be

189: tested is that the stationary data set is pseudoperiodic with noise

190: components which are (approximately) identically distributed and

191: uncorrelated for sufficiently large temporal translations. Note that the

192: constraints on the noise components in our null hypothesis are stronger than

193: those of Theiler's algorithm, which requires only that the noise distribution

194: depend periodically on the phase of the signal. However, under our

195: hypothesis, we can produce the surrogates in a simple way through the

196: algorithm to be described below. In addition, a wide range of noise

197: processes often encountered in practical situations, including (but not

198: limited to) linear colored additive observational noise described by the $%

199: ARMA(p,q)$ model \cite{Box time}, match the above constraints.

200:

201: The remainder of this communication is organized as follows. In Sec. II we

202: will introduce the new algorithm to generate pseudoperiodic surrogates,

203: while in Sec. III we will apply this algorithm to simulation data sets from

204: the R\"{o}ssler system, which demonstrates the ability of the surrogate test

205: based on this algorithm to distinguish chaotic orbits from pseudoperiodic

206: ones. As an application, we will use this surrogate technique to

207: investigate whether a human electrocardiogram (ECG) record is possibly

208: representative of a chaotic dynamical system. Finally, in Sec. IV, we

209: summarize the whole communication.

210:

211: \section{A New Algorithm to Generate Pseudoperiodic Surrogates}

212:

213: %TCIMACRO{%

214: %\TeXButton{rosslerP5perObvDim4}{\begin{figure}

215: %\centering

216: %\includegraphics[width=3.5in]{rosslerP5perObv.eps}

217: %

218: %\parbox{3.5in}{

219: %\caption{\label{rosslerP5perObvDim4} (a) Pseudoperiodic time series contaminated

220: %by observational noise;

221: %(b) State space $x_{i+n}$ vs. $x_{i}$ of the pseudoperiodic time series from the

222: %R\"{o}ssler system with $n=16$;

223: %(c) Surrogate test for the pseudoperiodic time series based on our algorithm.

224: %The abscissa is the indices

225: %of 100 surrogates and the ordinate is the corresponding correlation dimensions.

226: %The middle line is the mean correlation dimension of the original time series

227: %calculated $100$ times using the GKA, the upper and lower lines denote the

228: %correlation dimensions twice the standard deviation away from the mean value

229: %and the asterisks indicate the correlation dimensions of $100$ surrogates. }

230: %}

231: %

232: %\end{figure}}}%

233: %BeginExpansion

234: \begin{figure}

235: \centering

236: \includegraphics[width=3.5in]{rosslerP5perObv.eps}

237:

238: \parbox{3.5in}{

239: \caption{\label{rosslerP5perObvDim4} (a) Pseudoperiodic time series contaminated

240: by observational noise;

241: (b) State space $x_{i+n}$ vs. $x_{i}$ of the pseudoperiodic time series from the

242: R\"{o}ssler system with $n=16$;

243: (c) Surrogate test for the pseudoperiodic time series based on our algorithm.

244: The abscissa is the indices

245: of 100 surrogates and the ordinate is the corresponding correlation dimensions.

246: The middle line is the mean correlation dimension of the original time series

247: calculated $100$ times using the GKA, the upper and lower lines denote the

248: correlation dimensions twice the standard deviation away from the mean value

249: and the asterisks indicate the correlation dimensions of $100$ surrogates. }

250: }

251:

252: \end{figure}%

253: %EndExpansion

254:

255: Let $\left\{ x_{i}\right\} _{i=1}^{N}$ be a data set with $N$ observations

256: (the form $\left\{ x_{i}\right\} $ is adopted instead for convenience when

257: causing no confusion), where $x_{i}$ means the observation measured at time $%

258: t_{i}=i\cdot \Delta t_{s}$ with $\Delta t_{s}$ denoting the sampling time.

259: We assume $\left\{ x_{i}\right\} _{i=1}^{N}$ is stationary and can be

260: decomposed into the deterministic components and the noise components, which

261: are approximately independent of each other. Similar to the surrogate test

262: idea of time shifting to desynchronize two data sets \cite{quiroga

263: performance}, we assume the noise components (approximately) follow an

264: identical distribution and are uncorrelated for sufficiently large temporal

265: translations (or time shifts). According to the null hypothesis we proposed

266: in the previous section, if the deterministic components are periodic, then

267: we can write a data point $x_{i}$ as $x_{i}=p_{i}+n_{i}$, where $p_{i}$ and $%

268: n_{i}$ denote the periodic component and the noise component respectively.

269: In many cases, we can set $E(p_{i})=E(n_{i})=0$ where $E$ is the expectation

270: operator. Since $\left\{ p_{i}\right\} $ are roughly independent of $\left\{

271: n_{i}\right\} $, we have the variance $%

272: var(x_{i})=var(p_{i})+var(n_{i}) $. Let

273: \begin{equation}

274: y_{i}^{\tau }=\alpha x_{i}+\beta x_{i+\tau }=(\alpha p_{i}+\beta p_{i+\tau

275: })+(\alpha n_{i}+\beta n_{i+\tau })  \label{linear combination}

276: \end{equation}%

277: with $i=1,2,~...,N-\tau $, where coefficients $\alpha $ and $\beta $ satisfy

278: $\alpha ^{2}+\beta ^{2}=1$ and parameter $\tau $ is the temporal translation

279: between subsets $\left\{ x_{i}\right\} _{i=1}^{N-\tau }$ and $\left\{

280: x_{i+\tau }\right\} _{i=1}^{N-\tau }$, then the variance $%

281: var(y_{i}^{\tau })=var(\alpha p_{i}+\beta p_{i+\tau })+var(\alpha

282: n_{i}+\beta n_{i+\tau })$. Now let us consider the noise components. If $%

283: \tau $ is sufficiently large, under our hypothesis, $n_{i}$ and $n_{i+\tau }$

284: are uncorrelated. Since $\left\{ n_{i}\right\} $ and $\left\{

285: n_{i+\tau }\right\} $ are also drawn from (approximately) the same distribution

286: and $\alpha ^{2}+\beta ^{2}=1$, we have $var(\alpha n_{i}+\beta n_{i+\tau })=var(n_{i})$. For the

287: deterministic component, if we require the translation $\tau $ to satisfy $%

288: cov(p_{i},p_{i+\tau })=0$, then $var(\alpha p_{i}+\beta p_{i+\tau

289: })=var(p_{i})$. Hence by choosing a suitable temporal translation, the noise

290: levels of $\left\{ y_{i}^{\tau }\right\} $, defined by $\left( var(\alpha

291: n_{i}+\beta n_{i+\tau })/var(y_{i}^{\tau })\right) ^{1/2}$, will be the same

292: as that of $\left\{ x_{i}\right\} _{i=1}^{N}$, i.e., $\left(

293: var(n_{i})/var(x_{i})\right) ^{1/2}$. The reason to preserve the noise level

294: is that the presence of noise affects the calculation of the

295: correlation dimension; hence we would like the surrogates and the

296: original time series to have (roughly) the same noise level in order to make

297: the results more credible.

298:
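
As a brief worked check of the two variance identities used above (a sketch only, under the assumptions already stated: stationarity, so that $var(n_{i+\tau })=var(n_{i})$ and $var(p_{i+\tau })=var(p_{i})$, and $\alpha ^{2}+\beta ^{2}=1$), expanding the variance of the noise part gives
\begin{equation*}
\begin{aligned}
var(\alpha n_{i}+\beta n_{i+\tau }) &=\alpha ^{2}var(n_{i})+\beta ^{2}var(n_{i+\tau }) \\
&\quad +2\alpha \beta \,cov(n_{i},n_{i+\tau }) \\
&=(\alpha ^{2}+\beta ^{2})\,var(n_{i})+2\alpha \beta \,cov(n_{i},n_{i+\tau }),
\end{aligned}
\end{equation*}
which reduces to $var(n_{i})$ once $\tau $ is large enough that $cov(n_{i},n_{i+\tau })\approx 0$. The same expansion with $p$ in place of $n$ reduces to $var(p_{i})$ when $cov(p_{i},p_{i+\tau })=0$, so that the ratio defining the noise level is unchanged.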

299: %TCIMACRO{%

300: %\TeXButton{rosslerP5perObv015DDim4}{\begin{figure}

301: %\centering

302: %\includegraphics[width=3.5in]{rosslerP5perObv015D.eps}

303: %

304: %\parbox{3.5in}{

305: %\caption{\label{rosslerP5perObv015DDim4} (a) Pseudoperiodic time series

306: %with both observational noise and dynamical noise;

307: %(b) State space $x_{i+n}$ vs. $x_{i}$ of the pseudoperiodic time series from

308: %the R\"{o}ssler system with $n=16$;

309: %(c) Surrogate test for the pseudoperiodic time series based on our algorithm.

310: %The meaning of the lines

311: %and the asterisks is the same as that in panel $(c)$ of Fig. \ref{rosslerP5perObvDim4}. }

312: %}

313: %

314: %\end{figure}}}%

315: %BeginExpansion

316: \begin{figure}

317: \centering

318: \includegraphics[width=3.5in]{rosslerP5perObv015D.eps}

319:

320: \parbox{3.5in}{

321: \caption{\label{rosslerP5perObv015DDim4} (a) Pseudoperiodic time series

322: with both observational noise and dynamical noise;

323: (b) State space $x_{i+n}$ vs. $x_{i}$ of the pseudoperiodic time series from

324: the R\"{o}ssler system with $n=16$;

325: (c) Surrogate test for the pseudoperiodic time series based on our algorithm.

326: The meaning of the lines

327: and the asterisks is the same as that in panel $(c)$ of Fig. \ref{rosslerP5perObvDim4}. }

328: }

329:

330: \end{figure}%

331: %EndExpansion

332:

333: The above deduction leads to the central idea of our surrogate algorithm.

334: From Eq. (\ref{linear combination}), we note that if $\left\{ p_{i}\right\} $

335: is periodic, the nonconstant deterministic components $\left\{ \alpha

336: p_{i}+\beta p_{i+\tau }\right\} $ shall also be periodic. In addition, $%

337: \left\{ x_{i}\right\} _{i=1}^{N}$ and $\left\{ y_{i}^{\tau }\right\} $ shall

338: have the same noise level if a suitable translation $\tau $ is selected.

339: Therefore by randomizing the coefficient $\alpha $ or $\beta $, we can

340: generate many data sets $\left\{ y_{i}^{\tau }\right\} $ as the surrogates

341: of $\left\{ x_{i}\right\} _{i=1}^{N}$. Note that $\left\{ p_{i}\right\} $

342: and $\left\{ \alpha p_{i}+\beta p_{i+\tau }\right\} $ have the same

343: number of degrees of freedom; if both of them are periodic, their correlation

344: dimensions \cite{Grassberger characterization} will theoretically be the

345: same. Now let us consider the noise components. Although the noise

346: components $\left\{ \alpha n_{i}+\beta n_{i+\tau }\right\} $ may have a

347: different distribution from that of $\left\{ n_{i}\right\} $, the noise

348: level is preserved after the transform in Eq. (\ref{linear combination}). As

349: Diks \cite{Diks estimating} has reported, the Gaussian kernel algorithm

350: (GKA) can reasonably estimate the correlation dimensions of noisy data sets

351: with different noise distributions. This implies that, under the same noise

352: level, the correlation dimensions of $\left\{ x_{i}\right\} _{i=1}^{N}$ and $%

353: \left\{ y_{i}^{\tau }\right\} $, calculated by the GKA, shall statistically

354: be the same if $\left\{ x_{i}\right\} _{i=1}^{N}$ and $\left\{ y_{i}^{\tau

355: }\right\} $ are both pseudoperiodic (and satisfy the constraints we

356: imposed). In contrast, if $\left\{ p_{i}\right\} $ is chaotic, its linear

357: combination, $\left\{ \alpha p_{i}+\beta p_{i+\tau }\right\} $, may have a

358: new dynamical structure with a different correlation dimension from that of $%

359: \left\{ p_{i}\right\} $, hence by adopting the correlation dimension as the

360: discriminating statistic we might detect this difference.

361:

362: We should also note that, for an unstable periodic orbit, even small

363: dynamical noise might drive the resultant orbit far away from the original

364: position after a sufficiently long time, and the pseudoperiodicity might be

365: broken. In such situations, our algorithm might fail to work. Nevertheless,

366: we suggest applying our algorithm as the first step of a pseudoperiodicity

367: test, since it is computationally fast and in principle deals well with a wide

368: range of observational noise (by comparison, the PPS algorithm will

369: sometimes fail for colored observational noise). If we can reject the null

370: hypothesis proposed previously, the time series under test is possibly chaotic,

371: or pseudoperiodic but perturbed by dynamical noise. Then we can adopt the PPS

372: algorithm for further tests, which works well even under a large amount of

373: dynamical noise. If the corresponding null hypothesis, i.e., that the time

374: series is pseudoperiodic and perturbed by dynamical noise, can also be rejected,

375: then we may claim the time series is very likely to be chaotic.

376:

377: We now consider several computational issues in our algorithm. As described

378: in Eq. (\ref{linear combination}), to generate the surrogates $\left\{

379: y_{i}^{\tau }\right\} $, we select two subsets of $\left\{ x_{i}\right\}

380: _{i=1}^{N}$, $\left\{ x_{i}\right\} _{i=1}^{N-\tau }$ and $\left\{ x_{i+\tau

381: }\right\} _{i=1}^{N-\tau }$, multiply them by the coefficients $\alpha $ and

382: $\beta $ respectively and then add them together. We emphasize that

383: choosing the temporal translation $\tau $ is a crucial issue for our

384: algorithm. On the one hand, we require the translation $\tau $ to satisfy

385: the condition $cov(p_{i},p_{i+\tau })=0$. The reason is that we want to keep

386: the same noise level for the original time series and the surrogates. In

387: addition, we want the deterministic components $\left\{ \alpha p_{i}\right\}

388: $ to be orthogonal to $\left\{ \beta p_{i+\tau }\right\} $ for arbitrary

389: coefficients $\alpha $ and $\beta $, otherwise the projection of $\left\{

390: \alpha p_{i}\right\} $ onto $\left\{ \beta p_{i+\tau }\right\} $ might

391: counteract $\left\{ \beta p_{i+\tau }\right\} $ under some situations, for

392: example, if $p_{i}\approx -p_{i+\tau }$ and $\alpha =\beta $, the

393: deterministic components $\left\{ \alpha p_{i}+\beta p_{i+\tau }\right\} $

394: will almost vanish while the noise components $\left\{ \alpha n_{i}+\beta

395: n_{i+\tau }\right\} $ remain. Hence the correlation dimensions calculated

396: are actually those of the noise components instead of the deterministic

397: components, which will certainly cause the false rejection of the null

398: hypothesis. On the other hand, we require $\tau $ to be sufficiently large

399: to guarantee the decorrelation between the noise components. However, we

400: want $\left\{ x_{i}\right\} _{i=1}^{N-\tau }$ and $\left\{ x_{i+\tau

401: }\right\} _{i=1}^{N-\tau }$ to overlap at least partly, so as to make use of

402: the information in the whole data set $\left\{ x_{i}\right\} _{i=1}^{N}$,

403: which means $\tau $ should not exceed $N/2$. In addition, it is recommended that

404: a data set not be too short if its correlation dimension is to be

405: calculated appropriately \cite{Jedynak failure}, which also

406: implies that $\tau $ should not be too large.

407:
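
To make the cancellation described above concrete (an illustrative sketch; the normalized autocorrelation $\rho _{p}(\tau )=cov(p_{i},p_{i+\tau })/var(p_{i})$ is notation introduced here only for this remark), stationarity together with $\alpha ^{2}+\beta ^{2}=1$ gives
\begin{equation*}
var(\alpha p_{i}+\beta p_{i+\tau })=var(p_{i})\left[ 1+2\alpha \beta \,\rho _{p}(\tau )\right] .
\end{equation*}
For $\alpha =\beta =1/\sqrt{2}$ and $\rho _{p}(\tau )=-1$ (that is, $p_{i+\tau }=-p_{i}$) the right-hand side is zero, so only the noise part survives, whereas $\rho _{p}(\tau )=0$ leaves the variance of the deterministic part unchanged for every admissible pair $(\alpha ,\beta )$.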

408: %TCIMACRO{%

409: %\TeXButton{rossleCh5perObvDim5_pps}{\begin{figure}

410: %\centering

411: %\includegraphics[width=3.5in]{rossleCh5perObvDim5_pps.eps}

412: %

413: %\parbox{3.5in}{

414: %\caption{\label{rossleCh5perObvDim5_pps} (a) Chaotic time series

415: %contaminated by observational noise ;

416: %(b) State space $x_{i+n}$ vs. $x_{i}$ of the chaotic time series from

417: %the R\"{o}ssler system with $n=16$;

418: %(c) Surrogate test for the chaotic time series based on our new algorithm.

419: %The meaning of the lines

420: %and the curve is the same as that in panel $(c)$ of Fig. \ref{rosslerP5perObvDim4};

421: %(d) Surrogate test for the chaotic time series based on the PPS algorithm. The meaning

422: %of the lines and the asterisks is the same as that in panel $(c)$ of Fig.

423: %\ref{rosslerP5perObvDim4}.

424: %}

425: %}

426: %

427: %\end{figure}}}%

428: %BeginExpansion

429: \begin{figure}

430: \centering

431: \includegraphics[width=3.5in]{rossleCh5perObvDim5_pps.eps}

432:

433: \parbox{3.5in}{

434: \caption{\label{rossleCh5perObvDim5_pps} (a) Chaotic time series

435: contaminated by observational noise ;

436: (b) State space $x_{i+n}$ vs. $x_{i}$ of the chaotic time series from

437: the R\"{o}ssler system with $n=16$;

438: (c) Surrogate test for the chaotic time series based on our new algorithm.

439: The meaning of the lines

440: and the curve is the same as that in panel $(c)$ of Fig. \ref{rosslerP5perObvDim4};

441: (d) Surrogate test for the chaotic time series based on the PPS algorithm. The meaning

442: of the lines and the asterisks is the same as that in panel $(c)$ of Fig.

443: \ref{rosslerP5perObvDim4}.

444: }

445: }

446:

447: \end{figure}%

448: %EndExpansion

449:

450: From Eq. (\ref{linear combination}) we see that the surrogates are generated

451: from two segments $\{x_{i}\}_{i=1}^{N-\tau }$ and $\{x_{i+\tau

452: }\}_{i=1}^{N-\tau }$ of the original time series $\{x_{i}\}_{i=1}^{N}$. We

453: want segments $\{x_{i}\}_{i=1}^{N-\tau }$ and $\{x_{i+\tau }\}_{i=1}^{N-\tau

454: }$ to contribute equally to the generation of the surrogates; therefore we

455: would like to have $\max (\left\vert \alpha /\beta \right\vert )=\max

456: (\left\vert \beta /\alpha \right\vert )$, $\min (\left\vert \alpha /\beta

457: \right\vert )=\min (\left\vert \beta /\alpha \right\vert )$ and $\Pr

458: (\left\vert \alpha /\beta \right\vert \geqslant 1)\simeq \Pr (\left\vert

459: \beta /\alpha \right\vert \geqslant 1)$, where $\max (\cdot )$, $\min (\cdot

460: )$ and $\Pr (\cdot )$ denote the maximum, the minimum and

461: the probability, respectively. But note that the coefficient ratio $%

462: \alpha /\beta $ (or $\beta /\alpha $) shall not be too large or too small,

463: otherwise $\left\{ y_{i}^{\tau }\right\} $ will be very close to $\left\{

464: x_{i}\right\} _{i=1}^{N-\tau }$ or $\left\{ x_{i+\tau }\right\}

465: _{i=1}^{N-\tau }$, which will lead to approximately the same correlation

466: dimensions of $\left\{ x_{i}\right\} _{i=1}^{N}$ and $\left\{ y_{i}^{\tau

467: }\right\} $ regardless of the dynamical behavior of $\left\{ x_{i}\right\}

468: _{i=1}^{N}$, and thus decrease the discriminating power of the correlation

469: dimension. In our calculations we let $\alpha $ be uniformly drawn from the

470: interval $\left[ -0.8,-0.6\right] \cup \left[ 0.6,0.8\right] $ and $\beta =%

471: \sqrt{1-\alpha ^{2}}$, which satisfies our requirements and provides

472: moderate values for the ratio $\alpha /\beta $.

473:
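
As a quick check of this choice (a sketch that assumes $\left\vert \alpha \right\vert $ is uniform on $\left[ 0.6,0.8\right] $, as stated above), $\beta =\sqrt{1-\alpha ^{2}}$ also lies in $\left[ 0.6,0.8\right] $, since $\sqrt{1-0.6^{2}}=0.8$ and $\sqrt{1-0.8^{2}}=0.6$. Hence
\begin{equation*}
\frac{3}{4}\leqslant \left\vert \frac{\alpha }{\beta }\right\vert \leqslant \frac{4}{3},\qquad \frac{3}{4}\leqslant \left\vert \frac{\beta }{\alpha }\right\vert \leqslant \frac{4}{3},
\end{equation*}
so the two ratios share the same maximum and minimum. Moreover, $\left\vert \alpha /\beta \right\vert \geqslant 1$ holds exactly when $\left\vert \alpha \right\vert \geqslant 1/\sqrt{2}$, which gives
\begin{equation*}
\begin{aligned}
\Pr (\left\vert \alpha /\beta \right\vert \geqslant 1) &=\frac{0.8-1/\sqrt{2}}{0.2}\approx 0.46, \\
\Pr (\left\vert \beta /\alpha \right\vert \geqslant 1) &\approx 0.54,
\end{aligned}
\end{equation*}
so the two probabilities are roughly, though not exactly, equal.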

474: \section{Surrogate Test to Distinguish between Chaotic and Pseudoperiodic

475: Time Series}

476:

477: In this section, through four examples from the R\"{o}ssler system, we

478: demonstrate the ability of the surrogate test based on our algorithm to

479: discriminate chaotic orbits from pseudoperiodic ones. As an application, we

480: will also employ the surrogate technique to investigate whether a recorded

481: human electrocardiogram (ECG) data set is possibly chaotic.

482:

483: \subsection{EXAMPLES}

484:

485: The equations of the R\"{o}ssler system are given by

486: \begin{equation}

487: \left\{

488: \begin{array}{l}

489: \dot{x}=-y-z, \\

490: \dot{y}=x+ay, \\

491: \dot{z}=b+z(x-c).%

492: \end{array}%

493: \right.  \label{rossler}

494: \end{equation}%

495: with the initial conditions $x(0)=y(0)=z(0)=0.1$. We choose parameters $b=2$%

496: , $c=4$ and the sampling time $\Delta t_{s}$ $=0.1$ time units. For each

497: example, the system is integrated for $10{,}000$ sampling steps and the first $1{,}000$

498: data points are discarded to avoid including transient states.

499:

500: In the first example, we set parameter $a=0.39095$. The R\"{o}ssler system

501: exhibits limit cycle behavior of period 6. To obtain pseudoperiodic time

502: series, we introduce $5\%$ observational noise into the periodic time

503: series. Although Gaussian white observational noise is the most common

504: choice in this situation, in order to demonstrate the ability of our

505: surrogate algorithm to deal with colored noise, we will instead adopt the

506: noise generated from the $AR(1)$ process \cite{Box time} $\xi _{i+1}=0.8\xi

507: _{i}+\epsilon _{i}$ with the variable $\epsilon $ following the standard

508: normal distribution $N(0,1)$, which is the more difficult case due to the

509: correlation between noise components. However, one should note that Gaussian

510: white noise and other colored noises satisfying the constraints in our null

511: hypothesis, for example, those modelled by the $ARMA(p,q)$ processes, in

512: principle can be dealt with in the same way. For convenience of observation

513: and comparison, we plot the time series and the corresponding attractor in

514: two dimensional state space (or embedding space) in panels $\left( a\right) $

515: and $\left( b\right) $ of Fig. \ref{rosslerP5perObvDim4} respectively.

516:
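
For reference, the correlation decay of this noise can be written in closed form (a standard property of a stationary $AR(1)$ process, given here as a sketch; the symbols $\phi $ and $\rho _{\xi }$ are introduced only for this remark). With $\xi _{i+1}=\phi \xi _{i}+\epsilon _{i}$ and $\left\vert \phi \right\vert <1$, the lag-$k$ autocorrelation is
\begin{equation*}
\rho _{\xi }(k)=\frac{cov(\xi _{i},\xi _{i+k})}{var(\xi _{i})}=\phi ^{k},
\end{equation*}
so that for $\phi =0.8$ one finds, for instance, $\rho _{\xi }(10)\approx 0.11$ and $\rho _{\xi }(50)\approx 1.4\times 10^{-5}$.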

517: To produce surrogate data, first we shall choose a suitable temporal

518: translation. Since it is impractical to separate noise from signal

519: completely, in general it is difficult to estimate the correlation decay

520: time between noise components. Fortunately, to decorrelate noise components,

521: all temporal translations are equivalent as long as they are large enough.

522: In addition, in many real situations, it is often possible to observe the

523: background noise and thus estimate the decay time. In our example, we regard

524: the $AR(1)$ noise as effectively uncorrelated when the temporal translation is larger

525: than $50$ (in units of the sampling time $\Delta t_{s}$). As another

526: requirement, a temporal translation satisfying $cov(p_{i},p_{i+\tau })=0$ is

527: desired. In practice, of course, this requirement generally cannot be met exactly

528: due to the digitization and quantization in the sampling process. Recalling the

529: discussion in the previous section, by letting $E(p_{i})=0$ and $\alpha

530: ^{2}+\beta ^{2}=1$, we have $var(\alpha p_{i}+\beta p_{i+\tau

531: })=var(p_{i})+2\alpha \beta \cdot cov(p_{i},p_{i+\tau })$. A nonzero $%

532: cov(p_{i},p_{i+\tau })$ means that the noise level is not exactly preserved.

533: However, under the null hypothesis of pseudoperiodicity, there shall always

534: be some temporal translations to make $cov(p_{i},p_{i+\tau })\sim 0$, hence

535: the noise level will not deviate from the original one too much. Besides,

536: according to Eq. (\ref{linear combination}), we generate the surrogates by

537: uniformly drawing coefficient $\alpha $ from interval $\left[ -0.8,-0.6%

538: \right] \cup \left[ 0.6,0.8\right] $ ($\beta =\sqrt{1-\alpha ^{2}}$ is

539: always kept positive), the noise level of the surrogates will fluctuate

540: around that of the original one due to the alternative signs of product $%

541: \alpha \beta $. Therefore, $cov(p_{i},p_{i+\tau })\neq 0$ will only cause

542: some fluctuations when to calculate the correlation dimension because of the

543: fluctuations of noise level, however, generally such fluctuations will not

544: affect our conclusion if we can select a temporal translation $\tau $ to let

545: $cov(p_{i},p_{i+\tau })\sim 0$. Since we have assumed the noise components

546: are roughly independent of the deterministic components, then $%

547: cov(x_{i},x_{i+\tau })=cov(p_{i},p_{i+\tau })$ for a large enough temporal

548: translation (to decorrelate noise components), therefore in all of the

549: examples, in order to let $cov(p_{i},p_{i+\tau })\sim 0$, we can

550: equivalently require $cov(x_{i},x_{i+\tau })\sim 0$. In the first example,

551: there are many temporal translations satisfying the two constraints

552: discussed above, i.e., $\tau >50$ and $cov(x_{i},x_{i+\tau })\sim 0$. To

553: pick a value from all these candidates, we first select an interval $\left[

554: 100,150\right] $, then search the temporal translation which makes the

555: absolute value $\left\vert cov(x_{i},x_{i+\tau })\right\vert $ be the

556: minimum (most close to zero) among all translations $100\leqslant \tau

557: \leqslant 150$. One shall note that the choice of the interval $\left[

558: 100,150\right] $ is arbitrary, except that we have to make sure that the

559: lower bound of the interval is larger than $50$, and there exists temporal

560: translations to let $cov(x_{i},x_{i+\tau })\sim 0$ within the interval.

561: After selecting the temporal translation, by randomizing the coefficient $%

562: \alpha $ we will generate $100$ surrogates according to Eq. (\ref{linear

563: combination}).
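
The variance identity quoted in this paragraph can be written out
explicitly. The following is a short sketch, assuming that $p_{i}$ is
stationary with zero mean so that $var(p_{i+\tau })=var(p_{i})$; the
normalized covariance $\rho _{p}(\tau )$ is notation introduced here for
illustration only.
\begin{align*}
var(\alpha p_{i}+\beta p_{i+\tau })
&= \alpha ^{2}\,var(p_{i})+\beta ^{2}\,var(p_{i+\tau })
+2\alpha \beta \,cov(p_{i},p_{i+\tau }) \\
&= (\alpha ^{2}+\beta ^{2})\,var(p_{i})
+2\alpha \beta \,cov(p_{i},p_{i+\tau }) \\
&= var(p_{i})\left[ 1+2\alpha \beta \,\rho _{p}(\tau )\right] ,
\qquad \rho _{p}(\tau )=\frac{cov(p_{i},p_{i+\tau })}{var(p_{i})}.
\end{align*}
For $\left\vert \alpha \right\vert \in \left[ 0.6,0.8\right] $ and
$\beta =\sqrt{1-\alpha ^{2}}$, the product satisfies
$0.96\leqslant 2\left\vert \alpha \right\vert \beta \leqslant 1$: the end
points give $2\times 0.6\times 0.8=0.96$, and
$\left\vert \alpha \right\vert =1/\sqrt{2}$ gives $1$. The relative change of
the variance is therefore at most $\left\vert \rho _{p}(\tau )\right\vert $,
with a sign set by the sign of $\alpha $. As a purely hypothetical example,
$\rho _{p}(\tau )=0.01$ would change the variance by about $\pm 1\%$.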

564:

In order to calculate the correlation dimension, we adopt time delay
embedding reconstruction \cite{Takens detecting} to recover the underlying
system from the scalar time series. Two parameters, the embedding dimension
and the time delay, must be properly chosen to apply this technique.
Throughout this communication, we use the false nearest neighbour
criterion \cite{kernel determining} to determine the global optimal
embedding dimension. Using the program in the TISEAN package
\cite{hegger practical}, the embedding dimension $m$ of the original time
series is selected as $4$, which is the minimal value for which the
fraction of false nearest neighbours is zero. To choose a suitable time
delay, we use the redundancy and irrelevance tradeoff exponent (RITE)
algorithm proposed in \cite{Luo geometric}. This algorithm selects the time
delay by searching for the optimal tradeoff between redundancy (due to too
small a time delay) and irrelevance (due to too large a time delay). As
demonstrated there, the RITE algorithm selects time delays that are as
suitable as those given by the average mutual information (AMI) criterion
\cite{fraser and swinney independent}, while its implementation is much
simpler and its computational cost is fairly low. Therefore, for large data
sets, adopting the RITE algorithm facilitates our calculations. In the
first example we generate $100$ surrogates, and for each surrogate we keep
the embedding dimension $m=4$ and use the RITE algorithm to choose a
suitable time delay for the reconstruction.
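
For concreteness, the delay vectors used in the reconstruction can be
written as follows. This is the standard form of time delay embedding
rather than a formula taken from the text above; the symbols
$\mathbf{v}_{i}$, $d$ (the time delay in units of the sampling time) and $N$
(the number of data points) are introduced here for illustration only.
\begin{equation*}
\mathbf{v}_{i}=\left( x_{i},x_{i+d},\ldots ,x_{i+(m-1)d}\right) ,
\qquad i=1,\ldots ,N-(m-1)d.
\end{equation*}
With $m=4$, each vector reads
$\left( x_{i},x_{i+d},x_{i+2d},x_{i+3d}\right) $, and a series of $N$ points
yields $N-3d$ vectors. The time delay $d$ should not be confused with the
temporal translation $\tau $ used to generate the surrogates.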

587:

588: %TCIMACRO{%

589: %\TeXButton{rossleCh5perObv015dDim4_pps}{\begin{figure}

590: %\centering

591: %\includegraphics[width=3.5in]{rossleCh5perObv015dDim4_pps.eps}

592: %

593: %\parbox{3.5in}{

594: %\caption{\label{rossleCh5perObv015dDim4_pps} (a) Chaotic time series

595: %with both dynamical and observational noises;

596: %(b) State space $x_{i+n}$ vs. $x_{i}$ of the chaotic time series from

597: %the R\"{o}ssler system with $n=16$;

598: %(c) Surrogate test for the chaotic time series based on our new algorithm.

599: %The meaning of the lines

600: %and the asterisks is the same as that in panel $(c)$ of Fig. \ref{rosslerP5perObvDim4};

601: %(d) Surrogate test for the chaotic time series based on the PPS algorithm. The meaning

602: %of the lines and the asterisks is the same as that in panel $(c)$ of Fig.

603: %\ref{rosslerP5perObvDim4}.

604: %}

605: %}

606: %

607: %\end{figure}}}%

608: %BeginExpansion

609: \begin{figure}

610: \centering

611: \includegraphics[width=3.5in]{rossleCh5perObv015dDim4_pps.eps}

612:

613: \parbox{3.5in}{

\caption{\label{rossleCh5perObv015dDim4_pps} (a) Chaotic time series
with both dynamical and observational noise;
(b) State space $x_{i+n}$ vs. $x_{i}$ of the chaotic time series from
the R\"{o}ssler system with $n=16$;
(c) Surrogate test for the chaotic time series based on our new algorithm.
The meaning of the lines
and the asterisks is the same as in panel $(c)$ of Fig. \ref{rosslerP5perObvDim4};
(d) Surrogate test for the chaotic time series based on the PPS algorithm. The meaning
of the lines and the asterisks is the same as in panel $(c)$ of Fig.
\ref{rosslerP5perObvDim4}.
}

625: }

626:

627: \end{figure}%

628: %EndExpansion

629:

We follow Diks's method \cite{Diks estimating} to calculate the
correlation dimension. This method gains robustness against noise by
replacing the hard kernel function (the Heaviside function)
\cite{Grassberger characterization} in the correlation integral with a
general kernel function. In his discussion, Diks adopted the Gaussian
kernel function, hence the method is called the Gaussian kernel algorithm
(GKA). Here we use the GKA implementation of \cite{Yu efficient}, which
further enhances the computational speed. Note that, to speed up the
calculation, only 2000 data points are used as the reference points for the
GKA. Because the estimate of the correlation dimension fluctuates
statistically even for the same data set, we repeat the calculation $100$
times for the original time series to estimate the mean correlation
dimension and its standard deviation. Panel $\left( c\right) $ of
Fig. \ref{rosslerP5perObvDim4} shows three lines parallel to the abscissa.
The middle line denotes the estimated mean correlation dimension of the
original time series, while the upper and lower lines lie two standard
deviations away from the mean value. For the surrogates, however, we
calculate the correlation dimension of each only once to save time. The
results are shown as the asterisks in panel $\left( c\right) $ of
Fig. \ref{rosslerP5perObvDim4}.
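
To make the difference between the two kernels explicit, the corresponding
correlation sums can be sketched as follows. These are the textbook forms,
written only up to the normalization conventions of the cited references
and without the reference-point subsampling mentioned above. The symbols
$\mathbf{v}_{i}$ (a reconstructed state vector), $N_{v}$ (the number of such
vectors), $r$ and $h$ (the scale parameters) and $\Theta $ (the Heaviside
function) are notation assumed here for illustration.
\begin{align*}
C(r) &= \frac{2}{N_{v}(N_{v}-1)}\sum_{i<j}
\Theta \left( r-\left\Vert \mathbf{v}_{i}-\mathbf{v}_{j}\right\Vert \right) , \\
T(h) &= \frac{2}{N_{v}(N_{v}-1)}\sum_{i<j}
\exp \left( -\frac{\left\Vert \mathbf{v}_{i}-\mathbf{v}_{j}\right\Vert ^{2}}{4h^{2}}\right) .
\end{align*}
The hard kernel counts a pair only when its distance is below $r$, whereas
the Gaussian kernel weights every pair smoothly according to its distance.
In the same spirit, the three horizontal lines described above lie at
$\bar{D}$ and $\bar{D}\pm 2s$, where $\bar{D}$ and $s$ denote the sample mean
and standard deviation of the $100$ repeated estimates for the original time
series.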

650:

After calculating the correlation dimensions, we need to inspect
whether the result is consistent with our null hypothesis. Here we use the
ranking criterion \cite{Theiler using} to decide whether the null
hypothesis should be rejected. The idea of this criterion is as follows.
Suppose the discriminating statistic of the original data set is $Q_{0}$,
and those of the $N_{S}$ surrogates are
$\left\{ Q_{1},Q_{2},\ldots ,Q_{N_{S}}\right\} $. Rank the statistics
$\left\{ Q_{0},Q_{1},\ldots ,Q_{N_{S}}\right\} $ in increasing order and
denote the rank of $Q_{0}$ by $r_{0}$. If the data set is consistent with
the hypothesis (i.e., there is no evidence to reject it), $r_{0}$ is equally
likely to be any integer between $1$ and $N_{S}+1$. However, if the
hypothesis is false, $Q_{0}$ might deviate from the surrogate distribution
$\left\{ Q_{1},Q_{2},\ldots ,Q_{N_{S}}\right\} $, i.e., $Q_{0}$ will be the
smallest or the largest among
$\left\{ Q_{0},Q_{1},\ldots ,Q_{N_{S}}\right\} $. Hence we reject the null
hypothesis if $r_{0}=1$ or $r_{0}=N_{S}+1$; the probability of a false
rejection is $1/\left( N_{S}+1\right) $ for one-sided tests and
$2/\left( N_{S}+1\right) $ for two-sided tests.
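
As a worked example of these probabilities, take $N_{S}=100$, the number of
surrogates used in the examples of this section:
\begin{equation*}
\frac{1}{N_{S}+1}=\frac{1}{101}\approx 0.0099,\qquad
\frac{2}{N_{S}+1}=\frac{2}{101}\approx 0.0198.
\end{equation*}
A two-sided rejection based on $100$ surrogates therefore corresponds to a
false rejection probability of about $2\%$, i.e., a confidence level of
about $98\%$.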

668:

For the first example, panel $\left( c\right) $ of
Fig. \ref{rosslerP5perObvDim4} shows that the mean correlation dimension of
the original time series falls within the dimension distribution of the
surrogates. Therefore, as expected, we cannot reject the null hypothesis,
which means the original time series is possibly pseudoperiodic
\cite{note interpretation}.

675:

Now let us examine the other examples. In the second example, we still set
parameter $a=0.39095$ in Eq. (\ref{rossler}). However, to obtain the
pseudoperiodic time series, we first generate a data set by adding Gaussian
white noise with a standard deviation of $0.15\%$ to the $x$ component at
each integration step, which simulates a system perturbed by additive
dynamical noise, and then introduce $5\%$ observational $AR(1)$ noise into
the data set thus obtained. The global optimal embedding dimension is
chosen as $m=4$. Note that in all four examples we generate $100$
surrogates with the same parameter choices for surrogate generation, i.e.,
the temporal translation is selected from $\left[ 100,150\right] $ and
coefficient $\alpha $ is drawn uniformly from
$\left[ -0.8,-0.6\right] \cup \left[ 0.6,0.8\right] $
($\beta =\sqrt{1-\alpha ^{2}}$). For the second example, the correlation
dimensions of the original time series and the surrogates are shown in
panel $\left( c\right) $ of Fig. \ref{rosslerP5perObv015DDim4}. Under the
ranking criterion, once again we cannot reject our null hypothesis.
Therefore the time series is possibly pseudoperiodic, which is consistent
with our knowledge.

693:

In the third example, we change parameter $a$ of Eq. (\ref{rossler}) to
$0.395$, for which the R\"{o}ssler system exhibits chaotic behavior. We
integrate Eq. (\ref{rossler}) to obtain a time series and then introduce
$5\%$ observational $AR(1)$ noise. The optimal embedding dimension is
selected as $m=5$. From panel $\left( c\right) $ of
Fig. \ref{rossleCh5perObvDim5_pps}, we find that the mean correlation
dimension of the original time series deviates from the distribution of the
surrogate dimensions. Using the ranking criterion, we can reject our null
hypothesis. In order to exclude the possibility that the time series is
generated from an unstable periodic orbit perturbed by dynamical noise, we
also apply the PPS algorithm as a further test. With the PPS algorithm we
generate $100$ surrogates and then use the GKA to calculate their
correlation dimensions. The results are shown in panel $\left( d\right) $
of Fig. \ref{rossleCh5perObvDim5_pps}. The mean correlation dimension of
the original time series again falls outside the distribution of the
surrogate dimensions, so we can reject the null hypothesis again. After the
two surrogate tests for pseudoperiodicity, we can claim that the time
series is chaotic with a confidence level of up to $96\%$
($98\%\times 98\%$) for the two-sided test.
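
The quoted figure can be checked with a line of arithmetic. The sketch
below assumes, as the product $98\%\times 98\%$ implies, that the two tests
are treated as independent and that each is a two-sided test with
$N_{S}=100$ surrogates:
\begin{equation*}
\left( 1-\frac{2}{101}\right) ^{2}=\left( \frac{99}{101}\right) ^{2}
=\frac{9801}{10201}\approx 0.961.
\end{equation*}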

712:

713: %TCIMACRO{%

714: %\TeXButton{ecgtestEm5}{\begin{figure}

715: %\centering

716: %\includegraphics[width=3.5in]{ecgtestEm5.eps}

717: %

718: %\parbox{3.5in}{

719: %\caption{\label{ecgtestEm5} (a) Time series of a human electrocardiogram

720: %(ECG) record;

721: %(b) State space $x_{i+n}$ vs. $x_{i}$ of the ECG record with $n=23$;

722: %(c) A surrogate data generated from our algorithm with coefficient $\alpha=\beta=1/\sqrt{2}$ (cf. Eq. (\ref{linear combination}));

723: %(d) Surrogate test for the ECG record based on our algorithm.

724: %The meaning of the lines

725: %and the asterisks is the same as that in panel $(c)$ of Fig. \ref{rosslerP5perObvDim4}. }

726: %}

727: %

728: %\end{figure}}}%

729: %BeginExpansion

730: \begin{figure}

731: \centering

732: \includegraphics[width=3.5in]{ecgtestEm5.eps}

733:

734: \parbox{3.5in}{

\caption{\label{ecgtestEm5} (a) Time series of a human electrocardiogram
(ECG) record;
(b) State space $x_{i+n}$ vs. $x_{i}$ of the ECG record with $n=23$;
(c) A surrogate generated by our algorithm with coefficients $\alpha=\beta=1/\sqrt{2}$ (cf. Eq. (\ref{linear combination}));
(d) Surrogate test for the ECG record based on our algorithm.
The meaning of the lines
and the asterisks is the same as in panel $(c)$ of Fig. \ref{rosslerP5perObvDim4}. }

742: }

743:

744: \end{figure}%

745: %EndExpansion

746:

The final example is a chaotic time series, also from the
R\"{o}ssler system. To generate the time series, we keep parameter $a=0.395$.
As in the second example, we add Gaussian white noise with a standard
deviation of $0.15\%$ to the $x$ component at each integration step as the
dynamical noise, and then introduce $5\%$ observational $AR(1)$ noise into
the data set thus obtained. The global optimal embedding dimension is found
to be $m=4$. The results of the surrogate tests based on the new algorithm
and the PPS algorithm are shown in panels $(c)$ and $(d)$ of
Fig. \ref{rossleCh5perObv015dDim4_pps}, respectively, from which we can see
that the surrogate tests based on both algorithms detect the chaos in the
time series. Again we can claim that the time series is chaotic with a
confidence level of up to $96\%$ for the two-sided test.

759:

We have also investigated examples under different observational noise
levels (keeping the same dynamical noise where it is present). For example,
if we reduce the $AR(1)$ observational noise level to $3\%$ in the above
four examples, we obtain the same results as reported. If we increase the
observational noise level to $10\%$, for the pseudoperiodic time series we
still obtain the expected results, i.e., we cannot reject our null
hypothesis. For the chaotic time series, however, we would falsely accept
(i.e., fail to reject) our null hypothesis, because the correlation
dimension of the original time series falls marginally within the dimension
distribution of the surrogates. The reason for this false acceptance might
be that, under large noise levels, the correlation dimension is not
sensitive enough to detect the structural changes of the chaotic time
series. For such cases, we have to look for more powerful discriminating
statistics \cite{schreiber dircrimination}.

773:

774: \subsection{AN APPLICATION}

775:

As an example of application, we employ the surrogate test based on our
algorithm to investigate whether a human electrocardiogram (ECG) record
(with $8975$ data points) is likely to be chaotic. The ECG record was
measured from a resting healthy subject ($22$ years old) in a shielded room
at a sampling rate of $1$ kHz. The ECG record shown in panel $(a)$ of
Fig. \ref{ecgtestEm5} looks very regular and possibly even periodic, but we
need a quantitative approach to verify the periodicity. Here we choose the
surrogate test technique. Using the false nearest neighbour criterion, the
global optimal embedding dimension is chosen as $m=5$. The background noise
comes mainly from the measurement instruments and is usually a blend of
white and correlated noise. By inspecting the linear second-order
correlation function of the ECG data, we let the temporal translation lie
within the interval $\left[ 100,150\right] $ (large enough to decorrelate
the noise components), where there exists an integer temporal translation
that makes the correlation function almost zero. Then, by uniformly drawing
values from $\left[ -0.8,-0.6\right] \cup \left[ 0.6,0.8\right] $ for
coefficient $\alpha $ in Eq. (\ref{linear combination})
($\beta =\sqrt{1-\alpha ^{2}}$), we generate $100$ surrogates. For
demonstration, we plot in panel $\left( c\right) $ of
Fig. \ref{ecgtestEm5} one surrogate generated from
Eq. (\ref{linear combination}) with coefficients
$\alpha =\beta =1/\sqrt{2}$. The surrogate differs from the original ECG
data in that more spikes appear in the surrogate. However, the surrogate
shows a regularity similar to that of the original data, which implies that
the surrogate preserves the potential periodicity of the original data as
we expect (although in a different pattern). For the surrogate test, our
calculation of the correlation dimensions is again based on the GKA. The
results are shown in panel $\left( d\right) $ of Fig. \ref{ecgtestEm5},
from which we can see that the mean correlation dimension of the ECG data
falls within the distribution of the correlation dimensions of the
surrogates; therefore we cannot reject our null hypothesis. Hence the ECG
record is possibly periodic, and there is no evidence of chaos.
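
For orientation, the sample-based quantities in this example can be
converted to physical time. The conversion assumes only the stated sampling
rate of $1$ kHz, i.e., a sampling time of $\Delta t_{s}=1$ ms:
\begin{equation*}
8975\,\Delta t_{s}=8.975\ \mathrm{s},\qquad
100\,\Delta t_{s}=0.10\ \mathrm{s},\qquad
150\,\Delta t_{s}=0.15\ \mathrm{s}.
\end{equation*}
The record thus covers roughly nine seconds, and the candidate temporal
translations lie between $0.10$ and $0.15$ seconds.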

808:

809: \section{Conclusion}

810:

To summarize, by imposing a few constraints on the noise process, we devise
a simple but effective way to produce surrogates for pseudoperiodic orbits.
The main idea of this algorithm is that a linear combination of any two
segments of the same periodic orbit generates another periodic orbit. By
properly choosing the temporal translation between the two segments, we
obtain, under the same noise level, statistically the same correlation
dimension for the pseudoperiodic orbit and its surrogates. Choosing the
temporal translation is a crucial issue for our algorithm: in principle the
translation should guarantee the decorrelation between the noise components
and preserve the noise level. Another important issue is to select a proper
discriminating statistic, which helps determine whether to reject the null
hypothesis. The correlation dimension is a suitable discriminating
statistic in this case. Other suitable discriminating statistics may exist;
we leave the problem of finding them for future study.

826:

The surrogate test technique based on our algorithm can be used to
distinguish between chaotic and pseudoperiodic time series. The PPS
algorithm was originally proposed to generate surrogates for a
pseudoperiodic orbit driven by dynamical noise, but surrogate tests based
on it sometimes falsely reject the null hypothesis if the time series is
also contaminated by colored observational noise. As a complement to the
PPS algorithm, our algorithm deals well with observational noise, but it
might fail for large dynamical noise. Nevertheless, because of its
computational convenience, we suggest adopting the surrogate test based on
our algorithm as the first step in pseudoperiodicity detection. If the null
hypothesis of our algorithm is rejected, the PPS algorithm should then be
used for a further test. If the null hypotheses of both algorithms are
rejected, the time series under investigation is very likely to be chaotic.
In this communication, the concrete procedures of the surrogate test for
pseudoperiodicity are demonstrated through four simulation examples. As a
practical application, we also employ the surrogate technique based on our
algorithm to investigate whether a human ECG record is likely to be
chaotic.

845:

846: This research is supported by a Hong Kong University Grants Council

847: Competitive Earmarked Research Grant (CERG) number PolyU 5235/03E.

848:

849: \begin{thebibliography}{10}

850: \bibitem[1]{Theiler testing} J. Theiler, S. Eubank, A. Longtin, B.

851: Galdrikian, and J.D. Farmer, Physica D 58, 77 (1992).

852:

853: \bibitem[2]{Theiler on} J. Theiler, Phys. Lett. A 196, 335 (1995).

854:

855: \bibitem[3]{Theiler using} J. Theiler and D. Prichard, Fields Inst. Commun.

856: 11, 99 (1997).

857:

858: \bibitem[4]{Galka topics} A. Galka, \textit{Topics in Nonlinear Time Series

859: Analysis with Implications for EEG Analysis} (World Scientific, 2000).

860:

\bibitem[5]{note nonlinearity} Here we would like to point out that the
irregularity of a time series is usually caused by stochasticity or
nonlinearity (often chaos). Therefore, in nonlinearity detection, if we can
exclude (most of) the possibility that the time series is generated by a
stochastic process, then it is very likely that the time series is
generated by a nonlinear deterministic system.

867:

868: \bibitem[6]{Ayerbach exploring} D. Auerbach, P. Cvitanovi\'{c}, J.P.

869: Eckmann, and G. Gunaratne, Phys. Rev. Lett. 58, 2387 (1987); P. Cvitanovi%

870: \'{c}, Phys. Rev. Lett. 61, 2729 (1988).

871:

\bibitem[7]{Pierson detecting} D. Pierson and F. Moss, Phys. Rev. Lett. 75,
2124 (1995); X. Pei and F. Moss, Nature (London) 379, 618 (1996); P. So, E.
Ott, S.J. Schiff, D.T. Kaplan, T. Sauer, and C. Grebogi, Phys. Rev. Lett.
76, 4705 (1996); P. Schmelcher and F.K. Diakonos, Phys. Rev. Lett. 78, 4733
(1997); K. Dolan, A. Witt, M.L. Spano, A. Neiman, and F. Moss, Phys. Rev. E
59, 5235 (1999).

878:

\bibitem[8]{Petracchi the} D. Petracchi, Chaos, Solitons \& Fractals 8, 327
(1997).

881:

882: \bibitem[9]{Small surrogate} M. Small, D. Yu, and R.G. Harrison, Phys. Rev.

883: Lett. 87, 188101 (2001).

884:

\bibitem[10]{note noise} For the definitions of terms such as additive
observational noise, please refer to \cite{Argyris the}.

888:

889: \bibitem[11]{Argyris the} J. Argyris, I. Andreadis, G. Pavlos, and M.

890: Athanasiou, Chaos, Solitons \& Fractals 9, 343 (1998).

891:

892: \bibitem[12]{quiroga performance} R. Quian Quiroga, A. Kraskov, T. Kreuz,

893: and P. Grassberger, Phys. Rev. E 65, 041903 (2002).

894:

895: \bibitem[13]{Grassberger characterization} P. Grassberger and I. Procaccia,

896: Phys. Rev. Lett. 50, 346 (1983).

897:

898: \bibitem[14]{Jedynak failure} A. Jedynak, M. Bach, and J. Timmer, Phys. Rev.

899: E 50, 1770 (1994).

900:

901: \bibitem[15]{Box time} G.E.P. Box, G.M. Jenkins, and G.C. Reinsel, \textit{%

902: Time Series Analysis: Forecasting and Control, 3rd Ed. }(Prentice-Hall, 1994)%

903: \textit{.}

904:

905: \bibitem[16]{Takens detecting} F. Takens, in D. Rand and L.S. Young,

906: editors, \textit{Dynamical Systems and Turbulence} (Springer, 1981).

907:

908: \bibitem[17]{kernel determining} M. B. Kennel, R. Brown, and H. D. I.

909: Abarbanel, Phys. Rev. A 45, 3403 (1992).

910:

911: \bibitem[18]{hegger practical} R. Hegger, H. Kantz, and T. Schreiber, CHAOS

912: 9, 413 (1999).

913:

\bibitem[19]{fraser and swinney independent} A. M. Fraser and H. L. Swinney,
Phys. Rev. A 33, 1134 (1986).

916:

917: \bibitem[20]{Diks estimating} C. Diks, Phys. Rev. E 53, 4263 (1996).

918:

919: \bibitem[21]{Yu efficient} D.J. Yu, M. Small, R.G. Harrison, and C. Diks,

920: Phys. Rev. E 61, 3750 (2000).

921:

922: \bibitem[22]{Luo geometric} X. Luo and M. Small (submitted, available from

923: the URL: http://arxiv.org/abs/nlin.CD/0312023/).

924:

\bibitem[23]{note interpretation} We cannot claim definitely that the time
series is pseudoperiodic, because if we cannot reject a null hypothesis,
there could be many interpretations other than that the data under test are
consistent with our null hypothesis. For more details, see, for example,
Ref. \cite{Galka topics}.

930:

931: \bibitem[24]{schreiber dircrimination} T. Schreiber and A. Schmitz, Phys.

932: Rev. E 55, 5443 (1997).

933: \end{thebibliography}

934:

935: \end{document}

936: