0604:math0604043/c14.tex

1: %First revision of annals paper on change-point transformation models

2: \documentclass[11pt]{article}

3:

4: \RequirePackage[OT1]{fontenc}

5:

6: \RequirePackage[aos,amsthm,amsmath,natbib]{imsarttech}

7: \usepackage{amsfonts,graphicx}

8:

9: \begin{document}

10:

11: \begin{frontmatter}

12:

13: \title{Further details on inference under right censoring

14: for transformation models

15: with a change-point based on a covariate threshold}

16: \runtitle{Change-point transformation models}

17: \author{Michael R. Kosorok\thanksref{t1}}

18: \and

19: \author{Rui Song\thanksref{t1}}

20: \affiliation{University of Wisconsin-Madison}

21: \address{Michael R. Kosorok\\ Departments of Statistics\\

22: and Biostatistics \& Medical Informatics\\

23: 1300 University Avenue\\

24: Madison, WI  53706\\USA\\Email: kosorok@biostat.wisc.edu}

25: \address{Rui Song\\ Department of Statistics\\

26: 1300 University Avenue\\

27: Madison, WI  53706\\USA\\Email: rsong@stat.wisc.edu}

28: \runauthor{M. R. Kosorok and R. Song}

29: \thankstext{t1}{Supported in part by Grant CA075142 from the

30: National Cancer Institute.}

31:

32: \begin{abstract}

33: We consider linear transformation models applied to right censored survival

34: data with a change-point in the regression coefficient based on a covariate

35: threshold.  We establish consistency and weak convergence of the

36: nonparametric maximum likelihood estimators.  The change-point parameter

37: is shown to be $\,n$-consistent, while the remaining parameters are shown to

38: have the expected root-$n$ consistency. We show that the

39: procedure is adaptive in the sense that the non-threshold parameters are

40: estimable with the same precision as if the true threshold value were known.

41: We also develop Monte-Carlo methods of inference for model parameters

42: and score tests for the existence of a change-point.

43: A key difficulty here is that some of the model

44: parameters are not identifiable under the null hypothesis of no change-point.

45: Simulation studies establish the validity of the proposed

46: score tests for finite sample sizes.

47: \end{abstract}

48:

49: \begin{keyword}[class=AMS]

50: \kwd[Primary ]{62N01}

51: \kwd{62F05}

52: \kwd[; secondary ]{62G20}

53: \kwd{62G10.}

54: \end{keyword}

55:

56: \begin{keyword}

57: \kwd{Change-point models}

58: \kwd{Empirical processes}

59: \kwd{Nonparametric maximum likelihood}

60: \kwd{Proportional hazards model}

61: \kwd{Proportional odds model}

62: \kwd{Right censoring}

63: \kwd{Semiparametric efficiency}

64: \kwd{Transformation models.}

65: \end{keyword}

66:

67: \end{frontmatter}

68:

69: \newtheorem{theorem}{\indent \sc Theorem}

70: \newtheorem{corollary}{\indent \sc Corollary}

71: \newtheorem{lemma}{\indent \sc Lemma}

72: \newtheorem{proposition}{\indent \sc Proposition}

73: \newtheorem{remark}{\indent \sc Remark}

74: \newcommand{\phif}{\textsc{igf}}

75: \newcommand{\sign}{\mbox{sign}}

76: \newcommand{\phgf}{\textsc{gf}}

77: \newcommand{\fix}{$\textsc{gf}_0$}

78: \newcommand{\mb}[1]{\mbox{\bf #1}}

79: \newcommand{\Exp}[1]{\mbox{E}\left[#1\right]}

80: \newcommand{\pr}[1]{\mbox{P}\left[#1\right]}

81: \newcommand{\pp}[0]{\mathbb{P}}

82: \newcommand{\ee}[0]{\mbox{E}}

83: \newcommand{\re}[0]{\mathbb{R}}

84: \newcommand{\argmax}[0]{\mbox{argmax}}

85: \newcommand{\argmin}[0]{\mbox{argmin}}

86: \newcommand{\ind}[0]{\mbox{\Large\bf 1}}

87: \newcommand{\narrow}{\stackrel{n\rightarrow\infty}{\longrightarrow}}

88: \newcommand{\weakpn}{\stackrel{P_n}{\leadsto}}

89: \newcommand{\weakpnboot}{\mbox{\raisebox{-1.5ex}{$\stackrel

90: {\mbox{\scriptsize $P_n$}}{\stackrel{\mbox{\normalsize $\leadsto$}}

91: {\mbox{\normalsize $\circ$}}}$}}\,}

92: \newcommand{\ol}[1]{\overline{#1}}

93: \newcommand{\avgse}[1] { \bar{\hat{\sigma}}_{#1} }

94: \newcommand{\mcse}[1]  { \sigma^{*}_{#1} }

95: \newcommand{\po}{\textsc{po}}

96: \newcommand{\ph}{\textsc{ph}}

97:

98: \section{Introduction} The linear transformation model states that

99: a continuous outcome $U$, given a $d$-dimensional covariate vector $Z$,

100: has the form

101: \begin{eqnarray}

102:  H(U)= - \beta ^\prime Z + \varepsilon, \label{s1.e1}

103: \end{eqnarray}

104: where $H$ is an increasing, unknown transformation function, $\beta\in\re^d$

105: are the unknown regression parameters of interest, and $\varepsilon$ has a

106: known distribution $F$. This model is readily applied to a failure time $T$

107: by letting $U=\log T$ and $H(u)=\log A(e^u)$, where $A$ is an

108: unspecified integrated baseline hazard. Setting

109: $F(s)=1-\exp(-e^s)$ results in the Cox model, while setting

110: $F(s)=e^s/(1+e^s)$ results in the proportional odds model. More

111: generally, the transformation model for a survival time $T$ conditionally

112: on a time-dependent covariate $\tilde{Z}(t)=\{Z(s),0\leq s\leq t\}$,

113: takes the form

114: \begin{eqnarray}

115: \pr{T>t|\tilde{Z}(t)}=S_Z(t)&\equiv&\Lambda\left(\int_0^te^{\beta'Z(s)}

116: dA(s)\right),\label{new.e1}

117: \end{eqnarray}

118: where $\Lambda$ is a known decreasing function

119: with $\Lambda(0)=1$. The model~(\ref{new.e1})

120: becomes model~(\ref{s1.e1}) when the covariates are time-independent

121: and $F(s)=1-\Lambda(e^s)$.
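
To see this reduction explicitly (a short verification using only the
definitions above, and assuming that $A$ is strictly increasing and that
$\varepsilon$ is independent of $Z$), take $Z$ time-independent
in~(\ref{new.e1}), so that $S_Z(t)=\Lambda(e^{\beta'Z}A(t))$. With
$U=\log T$ and $H(u)=\log A(e^u)$ in~(\ref{s1.e1}), we have
\begin{eqnarray*}
\pr{T>t|Z}&=&\pr{\varepsilon>\log A(t)+\beta'Z|Z}\\
&=&1-F\left(\log A(t)+\beta'Z\right)\;=\;\Lambda\left(e^{\beta'Z}A(t)\right),
\end{eqnarray*}
where the last equality uses $F(s)=1-\Lambda(e^s)$. For instance,
$\Lambda(u)=e^{-u}$ gives $F(s)=1-\exp(-e^s)$, and $\Lambda(u)=(1+u)^{-1}$
gives $F(s)=e^s/(1+e^s)$, matching the two examples above.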

122:

123: In data analysis, the assumption of linearity of the regression

124: effect in~(\ref{new.e1}) is not always satisfied over the whole

125: range of the covariate, and the fit may be improved with a

126: two-phase transformation model having a change-point at an

127: unknown threshold of a one-dimensional covariate $Y$. Let

128: $Z=(Z_1,Z_2)$, where $Z_1$ and $Z_2$ are possibly time-dependent

129: covariates in $\re^p$ and $\re^q$, respectively, where $p+q=d$

130: and $q\geq 1$. The new model is obtained by replacing $\beta'Z(s)$

131: in~(\ref{new.e1}) with

132: \begin{eqnarray}

133: r_{\xi}(s;Z,Y)\equiv\beta'Z(s) +  [\alpha + \eta'Z_2(s)]\ind\{Y >

134: \zeta\},\label{new.e2}

135: \end{eqnarray}

136: where $\alpha$ is a scalar, $\eta\in\re^q$, $\ind\{B\}$ is the

137: indicator of $B$, and $\xi$

138: denotes the collected parameters $(\alpha,\beta,\eta,\zeta)$.

139: We also require $Y$ to be time-independent but allow it to possibly

140: be one of the covariates in $Z(t)$. The overall goal of this paper

141: is to develop methods of inference for this model applied to

142: right censored data.

143:

144: We note that for the special case when $\alpha=0$ and

145: $\Lambda(t)=e^{-t}$, the model~(\ref{new.e2})

146: becomes the Cox model considered by \cite{p03} under a slightly

147: different parameterization. Permitting a nonzero $\alpha$

148: allows the possibility of a ``bent-line''

149: covariate effect. Suppose, for example, that

150: $Z_2$ is one-dimensional and time-independent,

151: while $Z_1\in\re^{d-1}$ may be time-dependent.

152: If we set $Y=Z_2$ and $\beta=(\beta_1',\beta_2')'$, where

153: $\beta_1\in\re^{d-1}$ and $\beta_2\in\re$, the model~(\ref{new.e2})

154: becomes $r_{\xi}(s;Z,Y)=\beta_1'Z_1(s)+\beta_2Z_2

155: +(\alpha+\eta Z_2)\ind\{Z_2>\zeta\}$.

156: When $\alpha=-\eta\zeta$, the covariate effect for $Z_2$

157: consists of two connected linear segments.  In many biological settings,

158: such a bent-line effect is realistic and can be much easier

159: to interpret than a quadratic or more complex nonlinear effect \cite{c89}.

160: Hence including the intercept term $\alpha$ is useful

161: for applications.
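
As a brief check of the bent-line claim, substitute $\alpha=-\eta\zeta$
into the expression for $r_{\xi}$ given above. The part of the effect
involving $Z_2$ becomes
\[\beta_2Z_2+(\alpha+\eta Z_2)\ind\{Z_2>\zeta\}
=\beta_2Z_2+\eta(Z_2-\zeta)\ind\{Z_2>\zeta\},\]
which is continuous in $Z_2$, with slope $\beta_2$ for $Z_2\leq\zeta$ and
slope $\beta_2+\eta$ for $Z_2>\zeta$. For any other value of $\alpha$, the
effect has a jump of size $\alpha+\eta\zeta$ at $Z_2=\zeta$.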

162:

163: Linear transformation models of the form~(\ref{s1.e1}) have been

164: widely used and studied (see, for example, \cite{bc64,bd81,bc82,p82,dd88,

165: cwy95,cwy97,fyw98,bn04}). Efficient methods of estimation in the

166: uncensored setting were rigorously studied by \cite{br97},

167: among others. The model~(\ref{new.e1}) for right-censored data has

168: also been studied rigorously for a variety of specific choices of

169: $\Lambda$ \cite{p84,mrv97,stg98,s98}; for

170: general but known $\Lambda$ \cite{sv04}; and for certain

171: parameterized families of $\Lambda$ \cite{klf04}.

172:

173: Change-point models have also been studied extensively and have

174: proven to be popular in clinical research. Several researchers have

175: considered a nonregular Cox model involving a two-phase regression on

176: time-dependent covariates, with a change-point at an unknown time

177: \cite{lsl90,lb97,ltc97}. As mentioned above, \cite{p03}

178: considered the Cox model with a

179: change-point at an unknown threshold of a covariate.

180: These authors studied the maximum partial likelihood estimators of

181: the parameters and the estimator of the baseline hazard function.

182: They showed that the estimator of the threshold parameter

183: is $n$-consistent, while the regression parameters are

184: $\sqrt{n}$-consistent. This happens because

185: the likelihood function is not differentiable with respect to the

186: threshold parameter, and hence the usual Taylor expansion

187: is not available. In this paper, we focus on

188: the covariate threshold setting. While time threshold models are

189: also interesting, we will not pursue them further in this paper

190: because the underlying techniques for estimation and

191: inference are quite distinct from the covariate threshold setting.

192:

193: The contribution of our paper builds on \cite{p03} in three

194: important ways.  Firstly, we extend to general transformation models.

195: This results in a significant increase in complexity over the Cox

196: model since estimation of the baseline hazard can no longer be

197: avoided through the use of the partial-profile likelihood.  Secondly,

198: we study nonparametric maximum likelihood inference for all model parameters.

199: As part of this, we show that the estimation procedure is adaptive in the

200: sense that the non-threshold parameters---including the infinite-dimensional

201: parameter~$A$---are estimable with the same precision

202: as if the true threshold parameter were known. Thirdly, we develop hypothesis

203: tests for the existence of a change-point. This is quite challenging

204: since some of the model parameters are no longer identifiable under

205: the null hypothesis of no change-point.  \cite{a01} considers

206: similar nonstandard testing problems when the model is fully

207: parametric and establishes asymptotic null and local alternative

208: distributions of a number of likelihood-based test procedures.

209: Unfortunately, Andrews' results are not directly

210: applicable to our setting because of the presence of an infinite

211: dimensional nuisance parameter, the baseline integrated hazard $A$,

212: and new methods are required.

213:

214: The next section, section~2, presents the data and model

215: assumptions. The nonparametric maximum log-likelihood estimation

216: (NPMLE) procedure is

217: presented in section~3. In section~4, we establish the consistency

218: of the estimators. Score and information operators of the regular

219: parameters are given in section~5. Results on the convergence rates

220: of the estimators are established in section~6. Section~7 presents

221: weak convergence results for the estimators, including the

222: asymptotic distribution of the change-point estimator and

223: the asymptotic normality of the other parameters. This section

224: also establishes the adaptive semiparametric efficiency

225: mentioned above. Monte Carlo inference for the parameters is discussed

226: in section~8. Methods for testing the existence of a change-point are

227: then presented in section~9. A brief discussion on implementation

228: and a small simulation study evaluating the moderate

229: sample size performance of the proposed change-point tests are

230: given in section~10. Proofs are given in section~11.

231:

232: \section{The data set-up and model assumptions}

233: The data $X_i=(V_i,\delta_i,$ $Z_i,Y_i)$, $i=1,\ldots,n$, consist of $n$

234: i.i.d. realizations of $X = (V,\delta,Z,Y)$, where $V = T \land C$, $\delta

235: = \ind\{T \le C\}$, and $C$ is a right censoring time. The analysis is

236: restricted to the interval $[0, \tau]$, where $\tau < \infty$.

237: The covariate $Y\in\re$ and $Z \equiv \{Z(t), t \in [0, \tau] \}$ is assumed

238: to be a caglad (left-continuous with right-hand limits) process with

239: $Z(t)=(Z_1'(t),Z_2'(t))'\in\re^p\times\re^q$, for all $t\in [0, \tau]$,

240: where $q\geq 1$ but $p=0$ is allowed.

241:

242: We assume that conditionally on $Z$ and $Y$, the survival function

243: at time $t$ has the form:

244: \begin{eqnarray}

245: S_{Z,Y}(t)\equiv\Lambda \left(\int_0^t e^{r_{\xi}(u;Z,Y)}dA(u)\right),

246: \label{s2.e1}

247: \end{eqnarray}

248: where $\Lambda$ is a known, thrice differentiable

249: decreasing function with $\Lambda(0)=1$,

250: $r_{\xi}(s;Z,Y)$ is as defined in~(\ref{new.e2}), and $A$ is an

251: unknown increasing function restricted to $[0,\tau]$.

252:

253: Let $G\equiv-\log\Lambda$,

254: and define the derivatives

255: $\dot{\Lambda}\equiv\partial\Lambda(t)/(\partial t)$,

256: $\ddot{\Lambda}\equiv\partial\dot{\Lambda}(t)/(\partial t)$,

257: $\dot G \equiv \partial G(t) /(\partial t)$,

258: $\ddot G \equiv \partial \dot G (t) /(\partial t)$, and

259: $\dddot G \equiv \partial\ddot G /(\partial t)$.

260: We also define the collected parameters

261: $\gamma\equiv(\alpha,\eta,\beta)$, $\psi\equiv(\gamma, A)$, and

262: $\theta \equiv(\psi, \zeta)$. We use $P$ to denote the

263: true probability measure, while the true parameter values are

264: indicated with a subscript 0.

265:

266: We now make the following additional assumptions:

267: \begin{itemize}

268: \item[A1]: $P[C=0]=0$, $P[C \ge \tau | Z,Y] = P[C = \tau | Z,Y] > 0$

269: almost surely, and censoring is independent of $T$ given $(Z,Y)$

270: and uninformative.

271: \item[A2]: The total variation of $Z(\cdot)$ on $[0, \tau]$ is

272: $\le m_0<\infty$ almost surely.

273: \item[B1]: $\zeta_0\in(a,b)$, for some known $-\infty<a<b<\infty$

274: with $P[Y<a]>0$ and $P[Y>b]>0$.

275: \item[B2]: For some neighborhood $\tilde{V}(\zeta_0)$ of $\zeta_0$:

276: \begin{itemize}

277: \item[(i)] the density of $Y$, $\tilde{h}$, exists and

278: is strictly positive, bounded and continuous for all

279: $y\in\tilde{V}(\zeta_0)$; and

280: \item[(ii)] the conditional law of $(C,Z)$ given $Y=y$,

281: ${\cal L}_y$, is left-continuous with right-hand limits

282: over $\tilde{V}(\zeta_0)$.

283: \end{itemize}

284: \item[B3]: For some $t_1,t_2\in(0,\tau]$, both var$[Z(t_1)|Y=\zeta_0]$

285: and var$[Z(t_2)|Y=\zeta_0+]$ are positive definite.

286: \item[B4]: For some $t_3,t_4\in(0,\tau]$, both

287: var$[Z(t_3)|Y<a]$ and var$[Z(t_4)|Y>b]$ are positive definite.

288: \item[C1]: $\alpha_0\in\Upsilon\subset\re$, $\beta_0\in B_1\subset\re^d$,

289: $\eta_0\in B_2\subset\re^q$, where $d\geq q\geq 1$, and $\Upsilon$, $B_1$

290: and $B_2$ are open, convex, bounded and known.

291: \item[C2]: Either $\alpha_0\neq 0$ or $\eta_0\neq 0$.

292: \item[C3]: $A_0\in{\cal A}$, where ${\cal A}$ is the set of all

293: increasing functions $A:[0,\tau]\mapsto[0,\infty)$ with

294: $A(0)=0$ and $A(\tau)<\infty$; and $A_0$ has derivative $a_0$

295: satisfying $0<a_0(t)<\infty$ for all $t\in[0,\tau]$.

296: \item[D1]: $G:[0,\infty)\mapsto[0,\infty)$ is thrice continuously

297: differentiable, with $G(0)=0$, and, for each $u\in[0,\infty)$,

298: $0<\dot{G}(u),\ddot{\Lambda}(u)<\infty$ and

299: $\sup_{s\in[0,u]}|\dddot G(s)|<\infty$.

300: \item[D2]: For some $c_0>0$, both

301: $\sup_{u\geq 0}|u^{c_0}\Lambda(u)|<\infty$ and

302: $\sup_{u\geq 0}|u^{1+c_0}\dot{\Lambda}(u)|<\infty$.

303: \end{itemize}

304:

305: Conditions~A1, A2, C1 and~C3 are commonly used for NPMLE

306: consistency and identifiability in right-censored

307: transformation models, while conditions~B1, B2, B3 and~C2 are

308: needed for change-point identifiability. As pointed out by a

309: referee, the use of a time-dependent covariate will require

310: that $Z_i(V_j)$ be observed for each individual $i$ and for

311: every $j$ such that $\delta_j=1$ and $V_j\leq V_i$. While this

312: is often assumed in theoretical contexts, it can be unrealistic

313: in practice, where missing values of $Z_i(t)$ are not

314: unusual (see \cite{ly93}).  Frequently, data analysts will simply

315: carry the last observation of $Z_i(t)$ forward to avoid the missingness

316: problem. Unfortunately, this simple solution is not necessarily valid.

317: However, addressing this

318: issue thoroughly is beyond the scope of this paper, and we will

319: only mention it again briefly in section~9,

320: where we develop a test of the null hypothesis that there is no

321: change-point ($H_0:\alpha_0=0$ and $\eta_0=0$). Also in section~9,

322: we will relax condition~C2

323: to allow for a sequence of contiguous alternative hypotheses that

324: includes $H_0$. Condition~B2(ii) is also needed to obtain weak convergence

325: for the NPMLE of $\zeta_0$. The continuity requirements

326: at each point $y$ can be restated in the following way:

327: ${\cal L}_{\zeta}$ converges

328: weakly to ${\cal L}_y$, as $\zeta\uparrow y$; and

329: ${\cal L}_{\zeta}$ converges weakly to ${\cal L}_{y+}$,

330: as $\zeta\downarrow y$, for some law ${\cal L}_{y+}$.

331: It would require a fairly pathological relationship among

332: the variables $(C,Z,Y)$ for this not to hold. Condition~B4 will also be

333: needed for the change-point test developed in section~9.

334:

335: Conditions~D1 and~D2

336: are also needed for asymptotic normality. Condition~D1 is quite similar to

337: conditions~(G.1) through~(G.4) in \cite{sv04} who

338: use the condition for developing asymptotic theory

339: for transformation models without a change-point. Condition~D2

340: is slightly weaker than conditions~D2 and~D3 of \cite{klf04}

341: who use the condition to obtain asymptotic theory

342: for frailty regression models without a change-point.

343: The following are several instances that satisfy conditions~D1 and D2:

344: \begin{enumerate}

345: \item $\Lambda(u)=e^{-u}$ corresponds to the extreme value distribution

346: and results in the Cox model.

347: \item $\Lambda(u)=(1+c u)^{-1/c}$, for any $c\in(0,\infty)$,

348: corresponds to the family of log-Pareto distributions and results

349: in the odds-rate transformation family. Taking the limit as

350: $c\downarrow 0$ yields the Cox model, while $c=1$

351: yields the proportional odds model.

352: \item $\Lambda(u)=\Exp{e^{-Wu}}$, where $W$ is a positive frailty with

353: $\Exp{W^{-c}}<\infty$, for some $c>0$, and

354: $\Exp{W^4}<\infty$, corresponds to the family of frailty transformations.

355: In addition to the odds-rate family, these conditions are

356: satisfied by both the inverse Gaussian and log-normal families

357: (see \cite{klf04}), as well as many other frailty families.

358: \item $\Lambda(u)=[1+2cu+u^2]^{-1}$,

359: where $c\in(1/2,1)$. Because this is the Laplace transform of

360: $t\mapsto e^{-ct}$ $\times\sin\left(t\sqrt{1-c^2}\right)/\sqrt{1-c^2}$,

361: it is not the Laplace transform of a density. Hence this family

362: is not a member of the family of frailty transformations. Note, however, that

363: taking the limit as $c\uparrow 1$ results in the Laplace transform

364: of the frailty density $te^{-t}$.

365: \end{enumerate}

366:

367: Verification of these conditions is routine for examples~1, 2 and~4 above, but

368: verification for example~3 is slightly more involved:

369: \begin{lemma}\label{l.v1}

370: Conditions~D1 and~D2 are satisfied for example~3 above.

371: \end{lemma}
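
To illustrate the routine verification (a worked instance added for
concreteness), consider example~2 with fixed $c\in(0,\infty)$. Here
$G(u)=c^{-1}\log(1+cu)$, so that $G(0)=0$ and
\[\dot{G}(u)=\frac{1}{1+cu},\;\;\;
\ddot{G}(u)=\frac{-c}{(1+cu)^2},\;\;\;
\dddot{G}(u)=\frac{2c^2}{(1+cu)^3},\]
while $\dot{\Lambda}(u)=-(1+cu)^{-1/c-1}$ and
$\ddot{\Lambda}(u)=(1+c)(1+cu)^{-1/c-2}$. All of these are finite and
continuous on $[0,\infty)$, with $\dot{G}$ and $\ddot{\Lambda}$ strictly
positive, which gives condition~D1. Since
$u^{1/c}\Lambda(u)=\{u/(1+cu)\}^{1/c}\leq c^{-1/c}$ and
$|u^{1+1/c}\dot{\Lambda}(u)|=\{u/(1+cu)\}^{1+1/c}\leq c^{-1-1/c}$,
condition~D2 holds with $c_0=1/c$.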

372:

373: \section{Nonparametric maximum log-likelihood estimation} The

374: nonparametric log-likelihood has the form $L_n(\psi,\zeta)\equiv$

375: \begin{eqnarray}

376: &&\mathbb{P}_n \left\{

377: \delta\log(a(V))+l_1^{\psi}(V,\delta,Z)

378: \ind\{Y \le \zeta \} + l_2^{\psi}(V,\delta,Z)

379: \ind\{Y > \zeta \} \right \}, \label{s3.e1}

380: \end{eqnarray}

381: where

382: \begin{eqnarray*}

383: l_1^{\psi}(V,\delta,Z)&\equiv&\int_0^{\tau}\left[\log\dot{G}

384: \left(H^{\psi}_1(s)\right)+\beta' Z(s)\right]

385: dN(s)-G(H^{\psi}_1(V)),\\

386: l_2^{\psi}(V,\delta,Z)

387: &\equiv&\int_0^{\tau}\left[\log\dot{G}\left(H^{\psi}_2(s)

388: \right)+\beta'Z(s)+\alpha+\eta'Z_2(s)\right]dN(s)\\

389: &&-G(H^{\psi}_2(V)),

390: \end{eqnarray*}

391: where $N(t)\equiv\ind\{V\leq t\}\delta$, $\tilde{Y}(s)\equiv\ind\{V\geq s\}$,

392: $a \equiv dA/dt$,

393: $H^{\psi}_1(t)\equiv\int_0^t\tilde{Y}(s)e^{\beta' Z(s)}dA(s)$,

394: $H^{\psi}_2(t)\equiv\int_0^t\tilde{Y}(s)e^{\beta'Z(s)+\alpha + \eta'Z_2(s)}

395: dA(s)$, and $\mathbb{P}_n$ is the

396: empirical probability measure.

397:

398: As discussed by \cite{mrv97}, the

399: maximum likelihood estimator for $a$ does not exist, since any

400: unrestricted maximizer of~(\ref{s3.e1}) puts mass only at observed failure

401: times and is thus not a continuous hazard. We replace $a(u)$

402: in $L_n(\psi,\zeta)$ with $n\Delta A(u)$ as suggested in \cite{p98}

403: who remarked that

404: this form of the empirical log-likelihood function is asymptotically

405: equal to the true log-likelihood function in certain instances.

406: Let $\tilde{L}_n(\psi,\zeta)$ be this modified log-likelihood.

407: Note that the maximum likelihood

408: estimator for $\zeta$ is not unique, since the likelihood is constant

409: in $\zeta$ over the intervals $[Y_{(r)},Y_{(r+1)})$, where $Y_{(1)}

410: <\cdots<Y_{(r)}<\cdots<Y_{(n)}$ are the order statistics of~$Y$. For this

411: reason, we only need to consider $\zeta$ at the values of the

412: $Y$ order statistics.
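
For concreteness, the quantities entering $\tilde{L}_n(\psi,\zeta)$ can be
written out as follows. This is a direct expansion of the definitions
above, with the subscript $i$ introduced only for this display.
Restricting attention to step functions $A$ with jumps $\Delta A(V_j)$ at
the observed failure times, which is where the maximizer places its mass,
the integral defining $H_1^{\psi}$ becomes, for individual $i$, the finite sum
\[H_{1,i}^{\psi}(t)=\sum_{j:\,\delta_j=1,\,V_j\leq t\wedge V_i}
e^{\beta'Z_i(V_j)}\Delta A(V_j),\]
and similarly for $H_2^{\psi}$, with $\beta'Z_i(V_j)$ replaced by
$\beta'Z_i(V_j)+\alpha+\eta'Z_{2i}(V_j)$. The term $\delta\log(a(V))$
in~(\ref{s3.e1}) is replaced by $\delta\log(n\Delta A(V))$. The sum also
makes visible why $Z_i(V_j)$ must be observed at every failure time $V_j$
not exceeding $V_i$.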

413:

414: The estimators are obtained in the following way: For fixed

415: $\zeta$, we maximize the fully nonparametric log-likelihood over

416: $\psi$, to obtain the profile log-likelihood

417: $pL_n(\zeta)\equiv\sup_{\psi}\tilde{L}_n(\psi,\zeta)$. We then maximize

418: $pL_n(\zeta)$ over $\zeta$, to obtain $\hat{\zeta}_n$; and then

419: compute $\hat{\psi}_n=\argmax_{\psi}\tilde{L}_n(\psi,\hat{\zeta}_n)$.

420: This yields the NPMLE $\hat{\theta}_n=(\hat{\psi}_n,\hat{\zeta}_n)$

421: for $\theta_0$. Hence we obtain an estimator for $A_0$ but not for $a_0$.

422:

423: \section{Consistency}

424: To study consistency, we first

425: characterize the NPMLE $\hat\theta_n$.

426: Consider the following one-dimensional submodels for $A$:

427: \begin{eqnarray*}

428: t \mapsto A_t \equiv \int_0 ^{(\cdot)} (1 + tg(s) )dA(s),

429: \end{eqnarray*}

430: where $g$ is an arbitrary non-negative bounded function. A score

431: function for $A$, defined as the derivative of $\tilde{L}_n(\xi, A_t)$

432: with respect to $t$ at $t=0$, is

433: \begin{eqnarray}

434: &&\mathbb{P}_n \left \{ \delta g(V) - \left[

435: \dot{G}(H^{\theta}(V))-

436: \delta\frac{ \ddot{G}(H^{\theta}(V))}

437: { \dot{G}(H^{\theta}(V))}\right]

438: \int_0^{\tau}\tilde{Y}(s)

439: e^{r_{\xi}(s;Z,Y)}g(s)dA(s) \right\}, ~\label{c4:e1}

440: \end{eqnarray}

441: where $H^{\theta}(t)\equiv\int_0^t\tilde{Y}(s)

442: e^{r_{\xi}(s;Z,Y)}dA(s)$.

443: For any fixed $\xi$, let $\hat A_{\xi}$ denote the maximizer of

444: $A\mapsto\tilde{L}_n(\xi, A)$, and let $\hat \theta_{\xi} \equiv (\xi, \hat

445: A_{\xi})$. Then the score function~(\ref{c4:e1}) is equal to zero

446: when evaluated at $\hat \theta_{\xi}$. We select $g(s) = \ind

447: \{s \le u \}$, insert this into~(\ref{c4:e1}), and equate the

448: resulting expression to zero for each $u\in[0,\tau]$, to obtain $\hat{A}_{\xi}(u)=$

449: \begin{eqnarray}

450: \label{c4:e2}&&\\

451: \int_0^u\left(\pp_n \left[\tilde{Y}(s)e^{r_{\xi}(s;Z,Y)}\left(

452: \dot{G}\left\{H^{\hat{\theta}_{\xi}}(V)\right\}-\delta\frac{

453: \ddot{G}\left\{H^{\hat{\theta}_{\xi}}(V)\right\}}{\dot{G}

454: \left\{H^{\hat{\theta}_{\xi}}(V)\right\}}

455: \right)\right]\right)^{-1}\pp_n\{dN(s)\}&&\nonumber\\

456: \equiv\int_0^u\{\mathbb{P}_nW(s;

457: \hat{\theta}_{\xi})\}^{-1}\pp_n\{dN(s)\}.&&\nonumber

458: \end{eqnarray}
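
As a sanity check on~(\ref{c4:e2}) (a special case worked out here for
illustration), take $\Lambda(u)=e^{-u}$, so that $G(u)=u$,
$\dot{G}\equiv 1$ and $\ddot{G}\equiv 0$. Then
$W(s;\theta)=\tilde{Y}(s)e^{r_{\xi}(s;Z,Y)}$ does not depend on $A$,
and~(\ref{c4:e2}) reduces to the explicit Breslow-type estimator
\[\hat{A}_{\xi}(u)=\int_0^u\frac{\pp_n\{dN(s)\}}
{\pp_n\left\{\tilde{Y}(s)e^{r_{\xi}(s;Z,Y)}\right\}}.\]
For general $G$, $\hat{A}_{\xi}$ appears on both sides of~(\ref{c4:e2})
through $H^{\hat{\theta}_{\xi}}$, so the equation characterizes
$\hat{A}_{\xi}$ only implicitly.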

459: Now the profile log-likelihood has the form

460: $pL_n(\zeta)=\sup_{\gamma}\tilde{L}_n

461: \left((\gamma,\hat{A}_{(\gamma,\zeta)}),\zeta\right)$.

462:

463: The above characterization facilitates the following consistency

464: results for $\hat{\theta}_n$:

465: \begin{lemma}\label{l1}

466: Under the regularity conditions of section~2,

467: the transformation model with a change-point based on a covariate

468: threshold is identifiable.

469: \end{lemma}

470: \begin{lemma}\label{l2}

471: Under the regularity conditions of section~2,

472: $\hat{A}_n$ is asymptotically bounded, and thus the NPMLE

473: $\hat{\theta}_n$ exists.

474: \end{lemma}

475: Using these results, we can establish the uniform consistency of

476: $\hat \theta_n$:

477: \begin{theorem}\label{t1}

478: Under the regularity conditions of section~2,

479: $\hat {\theta}_n$ converges outer almost surely to $\theta_0$

480: in the uniform norm.

481: \end{theorem}

482:

483: \section{Score and information operators for regular parameters}

484: In this section, we derive the score and information operators

485: for the collected parameters $\psi$. We refer to these parameters

486: as the regular parameters because, as we will see in section~6,

487: their estimators converge at the $\sqrt{n}$ rate. On the other

488: hand, $\hat{\zeta}_n$ converges at the $n$ rate and thus

489: the parameter $\zeta$ is not regular. The score and

490: information operators for $\psi$ are needed for the convergence

491: rate and weak limit results of sections~6 and~7.

492:

493: Let $\mathcal{H}$ denote the space of the elements $h = (h_1, h_2, h_3, h_4)$

494: such that $h_1 \in \mathbb{R}$, $h_2\in\re^q$, $h_3 \in \mathbb{R}^d$,

495: and $h_4 \in D[0,\tau]$, where $D[0,\tau]$ is the space of cadlag

496: functions (right-continuous with left-hand limits) on $[0,\tau]$.

497: We denote by $BV$ the subspace of $D[0,\tau]$ consisting of functions

498: that are of bounded variation over the

499: interval $[0,\tau]$. Define, for future use,

500: the following linear functional for

501: each $\theta=(\psi,\zeta)$ and each $t \in [0,\tau]$:

502: \begin{eqnarray}

503: R^t_{\zeta,\psi}(f) \equiv \int_0^t f(u)\tilde{Y}(u)

504: e^{r_{\xi}(u;Z,Y)}dA(u),

505: \end{eqnarray}

506: where $f$ is an element or vector of elements in $BV$.

507: Also let $\rho_1 (h) \equiv ( | h_1 |^2 +

508: \|h_2 \|^2+ \| h_3\|^2+  \| h_4 \|_v^2)^{1/2}$ and  $\mathcal{H}_r

509: \equiv \{h \in \mathcal{H}: \rho_1(h)\leq r \}$, where $\|\cdot\|_v$

510: is the total variation norm on $BV$ and $r\in(0,\infty)$.

511:

512: The parameter

513: $\psi\in\Psi\equiv\Upsilon\times B_2\times B_1\times{\cal A}$

514: can be considered a linear functional on $\mathcal{H}_r$ by defining

515: $\psi(h) \equiv h_1\alpha + h_2' \eta + h_3' \beta + \int_0^{\tau}

516: h_4(u)dA(u)$, $h \in \mathcal{H}_r$.

517: Viewed this way, $\Psi$ is a subset of $\ell^{\infty}(\mathcal{H}_r)$ with uniform

518: norm $\|\psi\|_{(r)}\equiv\sup_{h\in{\cal H}_r}|\psi(h)|$, where

519: $\ell^{\infty}(B)$ is the space of bounded functionals on~$B$. Note that

520: ${\cal H}_1$ is rich enough to extract all components of $\psi$. This

521: is easy to see for the Euclidean components; and, for the $A$ component,

522: it works by using the elements $\{h:h_1=0,h_2=0,h_3=0,

523: h_4(u)=\ind\{u\leq t\},t\in[0,\tau]\}\subset{\cal H}_1$.
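
As a one-line illustration of this extraction, using only the definition
of $\psi(h)$ above: the choice $h=(1,0,0,0)$ gives $\psi(h)=\alpha$, while
the choice $h=(0,0,0,\ind\{\cdot\leq t\})$ gives
\[\psi(h)=\int_0^{\tau}\ind\{u\leq t\}dA(u)=A(t),\]
since $A(0)=0$ for every $A\in{\cal A}$.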

524:

525: In section~5.1, we derive the score operator; while in section~5.2

526: we derive the information operator and establish its continuous invertibility.

527:

528: \subsection{The score operator} Using the one-dimensional submodel

529: \begin{eqnarray*}

530:  t \rightarrow \psi_t \equiv \psi + t(h_1, h_2, h_3,

531: \int_0^{(\cdot)}h_4(u)dA(u)), ~~~ h \in \mathcal{H}_r,

532: \end{eqnarray*}

533: the score operator takes the form

534: \[U^{\tau}_{n\zeta}(\psi)(h) \equiv

535: \left.\frac{\partial}{\partial t}L_n(\psi_t, \zeta)  \right|_{t=0}

536: =\mathbb{P}_n U^{\tau}_{\zeta}(\psi)(h),\]

537: where $U_{\zeta}^{\tau}(\psi)(h)\equiv U^{\tau}_{\zeta,1}(\psi)(h_1) +

538: U^{\tau}_{\zeta,2}(\psi)(h_2)+

539: U^{\tau}_{\zeta,3}(\psi)(h_3)+U^{\tau}_{\zeta,4}(\psi)(h_4)$, and

540: \begin{eqnarray*}

541:  U^{\tau}_{\zeta,1}(\psi)(h_1)&\equiv&

542: \ind\{Y>\zeta\}\left\{\int_0^{\tau}h_1dN(u)-\hat{\Xi}_{\theta}^{(0)}(\tau)

543: R^{\tau}_{\zeta,\psi}(h_1)\right\},\\

544: U^{\tau}_{\zeta,2}(\psi)(h_2)&\equiv&\ind\{Y>\zeta\}\left\{

545: \int_0^{\tau}Z_2'(u)h_2dN(u)-\hat{\Xi}_{\theta}^{(0)}(\tau)

546: R^{\tau}_{\zeta,\psi}(Z_2'h_2)\right\},\\

547: U^{\tau}_{\zeta,3}(\psi)(h_3)&\equiv&

548: \int_0^{\tau}Z'(u)h_3dN(u)-\hat{\Xi}_{\theta}^{(0)}(\tau)

549: R^{\tau}_{\zeta,\psi}(Z'h_3),\\

550: U^{\tau}_{\zeta,4}(\psi)(h_4)&\equiv&\int_0^{\tau}h_4(u)dN(u)

551: -\hat{\Xi}_{\theta}^{(0)}(\tau)R^{\tau}_{\zeta,\psi}(h_4),\\

552: \hat{\Xi}_{\theta}^{(0)}(\tau)&\equiv&\ind\{Y\leq\zeta\}

553: \hat{\Xi}_{\psi,1}^{(0)}(\tau)+\ind\{Y>\zeta\}

554: \hat{\Xi}_{\psi,2}^{(0)}(\tau),

555: \end{eqnarray*}

556: and where, for $j=1,2$,

557: \[\hat{\Xi}_{\psi,j}^{(0)}(\tau)\equiv

558: \left[\dot{G}(H^{\psi}_j(V \wedge \tau))  -

559: \delta \frac{ \ddot{G}(H^{\psi}_j(V \wedge \tau))}

560: {\dot{G}(H^{\psi}_j(V \wedge \tau))}\right].\]

561: The dependence in the notation on $\tau$ will prove useful in later

562: developments.
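
As a consistency check (a remark added for illustration), the
$A$-component of the score operator agrees with the score
function~(\ref{c4:e1}) of section~4. Take $h=(0,0,0,g)$ and note that, for
$V\leq\tau$, $\int_0^{\tau}g(u)dN(u)=\delta g(V)$ and
$H^{\theta}(V)=\ind\{Y\leq\zeta\}H_1^{\psi}(V)+\ind\{Y>\zeta\}H_2^{\psi}(V)$.
Then
\[U^{\tau}_{\zeta,4}(\psi)(g)=\delta g(V)-\hat{\Xi}_{\theta}^{(0)}(\tau)
\int_0^{\tau}\tilde{Y}(s)e^{r_{\xi}(s;Z,Y)}g(s)dA(s),\]
where $\hat{\Xi}_{\theta}^{(0)}(\tau)=\dot{G}(H^{\theta}(V))-\delta
\ddot{G}(H^{\theta}(V))/\dot{G}(H^{\theta}(V))$. This is the quantity
averaged by $\mathbb{P}_n$ in~(\ref{c4:e1}).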

563:

564: \subsection{The information operator} To obtain the information

565: operator, we can differentiate the expectation of the score operator

566: using the map $t \rightarrow \psi+ t\psi_1$,

567: where $\psi,\psi_1\in\Psi$. The information operator,

568: $\sigma_{\theta}:\mathcal{H}_{\infty} \rightarrow

569: \mathcal{H}_{\infty}$, where $\mathcal{H}_{\infty}\equiv\{h:\mbox{$h\in

570: \mathcal{H}_r$ for some $r<\infty$}\}$, satisfies

571: \begin{eqnarray}

572: \psi_1(\sigma_{\theta}(h))

573: &=&\left. -\frac{\partial}{\partial t}

574: PU^{\tau}_{\zeta}(\psi+t\psi_1)(h) \right|_{t=0},\label{new.j12.e1}

575: \end{eqnarray}

576: for every $h\in{\cal H}_{\infty}$.

577: Taking the G\^{a}teaux derivative in~(\ref{new.j12.e1}), we obtain

578: $\sigma_{\theta}(h)=$

579: \begin{eqnarray}

580: \label{c5.e2}&&\\

581: \left(\begin{array}{cccc}\sigma_{\theta}^{11}&

582: \sigma_{\theta}^{12}& \sigma_{\theta}^{13} & \sigma_{\theta}^{14} \\

583: \sigma_{\theta}^{21}&\sigma_{\theta}^{22}&

584: \sigma_{\theta}^{23}& \sigma_{\theta}^{24} \\

585: \sigma_{\theta}^{31}&\sigma_{\theta}^{32}

586: &\sigma_{\theta}^{33}&\sigma_{\theta}^{34}\\

587: \sigma_{\theta}^{41}&\sigma_{\theta}^{42}&\sigma_{\theta}^{43}

588: &\sigma_{\theta}^{44}

589: \end{array}\right)

590: \left(\begin{array}{c}h_1\\ h_2\\ h_3\\ h_4\end{array}\right)

591: \equiv P\left(\begin{array}{cccc}\hat{\sigma}_{\theta}^{11}&

592: \hat{\sigma}_{\theta}^{12}& \hat{\sigma}_{\theta}^{13}&

593: \hat{\sigma}_{\theta}^{14} \\

594: \hat{\sigma}_{\theta}^{21}&

595: \hat{\sigma}_{\theta}^{22}&\hat{\sigma}_{\theta}^{23}&

596: \hat{\sigma}_{\theta}^{24} \\

597: \hat{\sigma}_{\theta}^{31}&\hat{\sigma}_{\theta}^{32}&

598: \hat{\sigma}_{\theta}^{33}& \hat{\sigma}_{\theta}^{34}\\

599: \hat{\sigma}_{\theta}^{41}&\hat{\sigma}_{\theta}^{42}&

600: \hat{\sigma}_{\theta}^{43}& \hat{\sigma}_{\theta}^{44}

601: \end{array}\right)

602: \left(\begin{array}{c}h_1\\ h_2\\ h_3\\ h_4\end{array}\right)&&

603: \nonumber

604: \end{eqnarray}

605: $\equiv P\hat{\sigma}_{\theta}(h)$, where

606: \begin{eqnarray*}

607: \hat{\sigma}_{\theta}^{11}(h_1)&\equiv&\ind\{Y>\zeta\}\left\{

608: \hat{\Xi}_{\theta}^{(0)}(\tau)+\hat{\Xi}_{\theta}^{(1)}(\tau)

609: H_2^{\psi}(V\wedge\tau)\right\}R_{\zeta,\psi}^{\tau}(h_1),\\

610: \hat{\sigma}_{\theta}^{12}(h_2)&\equiv&\ind\{Y>\zeta\}

611: \left\{\hat{\Xi}_{\theta}^{(0)}(\tau)+\hat{\Xi}_{\theta}^{(1)}(\tau)

612: H_2^{\psi}(V\wedge\tau)\right\}R_{\zeta,\psi}^{\tau}(Z_2'h_2),\\

613: \hat{\sigma}_{\theta}^{13}(h_3)&\equiv&\ind\{Y>\zeta\}

614: \left\{\hat{\Xi}_{\theta}^{(0)}(\tau)+\hat{\Xi}_{\theta}^{(1)}(\tau)

615: H_2^{\psi}(V\wedge\tau)\right\}R_{\zeta,\psi}^{\tau}(Z'h_3),\\

616: \hat{\sigma}_{\theta}^{14}(h_4)&\equiv&\ind\{Y>\zeta\}

617: \left\{\hat{\Xi}_{\theta}^{(0)}(\tau)+\hat{\Xi}_{\theta}^{(1)}(\tau)

618: H_2^{\psi}(V\wedge\tau)\right\}R_{\zeta,\psi}^{\tau}(h_4),\\

619: \hat{\sigma}_{\theta}^{21}(h_1)&\equiv&\ind\{Y>\zeta\}\left\{

620: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2h_1)

621: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2)

622: R_{\zeta,\psi}^{\tau}(h_1)\right\},\\

623: \hat{\sigma}_{\theta}^{22}(h_2)&\equiv&\ind\{Y>\zeta\}\left\{

624: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2Z_2'h_2)

625: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2)

626: R_{\zeta,\psi}^{\tau}(Z_2'h_2)\right\},\\

627: \hat{\sigma}_{\theta}^{23}(h_3)&\equiv&

628: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2Z'h_3)

629: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2)

630: R_{\zeta,\psi}^{\tau}(Z'h_3),\\

631: \hat{\sigma}_{\theta}^{24}(h_4)&\equiv&

632: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2h_4)

633: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2)

634: R_{\zeta,\psi}^{\tau}(h_4),\\

635: \hat{\sigma}_{\theta}^{31}(h_1)&\equiv&\ind\{Y>\zeta\}\left\{

636: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Zh_1)

637: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z)

638: R_{\zeta,\psi}^{\tau}(h_1)\right\},\\

639: \hat{\sigma}_{\theta}^{32}(h_2)&\equiv&\ind\{Y>\zeta\}\left\{

640: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(ZZ_2'h_2)

641: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z)

642: R_{\zeta,\psi}^{\tau}(Z_2'h_2)\right\},\\

643: \hat{\sigma}_{\theta}^{33}(h_3)&\equiv&

644: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(ZZ'h_3)

645: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z)

646: R_{\zeta,\psi}^{\tau}(Z'h_3),\\

647: \hat{\sigma}_{\theta}^{34}(h_4)&\equiv&

648: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Zh_4)

649: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z)

650: R_{\zeta,\psi}^{\tau}(h_4),

651: \end{eqnarray*}

652: \begin{eqnarray*}

653: \hat{\sigma}_{\theta}^{41}(h_1)(u)&\equiv&\ind\{Y>\zeta\}

654: \tilde{Y}(u)e^{r_{\xi}(u;Z,Y)}\left\{

655: \hat{\Xi}_{\theta}^{(0)}(\tau)h_1+

656: \hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(h_1)\right\},\\

657: \hat{\sigma}_{\theta}^{42}(h_2)(u)&\equiv&\ind\{Y>\zeta\}

658: \tilde{Y}(u)e^{r_{\xi}(u;Z,Y)}\left\{

659: \hat{\Xi}_{\theta}^{(0)}(\tau)Z_2'(u)h_2+

660: \hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2'h_2)\right\},\\

661: \hat{\sigma}_{\theta}^{43}(h_3)(u)&\equiv&

662: \tilde{Y}(u)e^{r_{\xi}(u;Z,Y)}\left\{

663: \hat{\Xi}_{\theta}^{(0)}(\tau)Z'(u)h_3+

664: \hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z'h_3)\right\},\\

665: \hat{\sigma}_{\theta}^{44}(h_4)(u)&\equiv&

666: \tilde{Y}(u)e^{r_{\xi}(u;Z,Y)}\left\{

667: \hat{\Xi}_{\theta}^{(0)}(\tau)h_4(u)+

668: \hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(h_4)\right\},

669: \end{eqnarray*}

670: and where

671: \begin{eqnarray*}

672: \hat{\Xi}_{\theta}^{(1)}(\tau)&\equiv&\ddot{G}(H^{\theta}(V\wedge\tau))

673: -\delta\left[\frac{\dddot{G}(H^{\theta}(V\wedge\tau))}

674: {\dot{G}(H^{\theta}(V\wedge\tau))}-\left\{

675: \frac{\ddot{G}(H^{\theta}(V\wedge\tau))}

676: {\dot{G}(H^{\theta}(V\wedge\tau))}\right\}^2\right].

677: \end{eqnarray*}

678: Note that all of the above operators are clearly bounded

679: whenever $\theta$ is bounded.

680:

681: The following lemma strengthens

682: the above G\^{a}teaux derivative to a Fr\'{e}chet derivative.  We will

683: need this strong differentiability to obtain weak convergence of

684: our estimators.

685: \begin{lemma}\label{l3}

686: Under the regularity conditions of section~2 and for any $\zeta\in[a,b]$

687: and $\psi_1\in\Psi$, the operator

688: $\psi\mapsto P U_{\zeta}^{\tau}(\psi)$

689: is Fr\'{e}chet differentiable at $\psi_1$,

690: with derivative $-\psi(\sigma_{\psi_1}(h))$, where

691: $h$ ranges over ${\cal H}_r$ and is the index

692: for $PU_{\zeta}^{\tau}(\psi)(\cdot)$, $\psi$ ranges over the linear span

693: $\mbox{lin}\,\Psi$ of $\Psi$, and $0<r<\infty$.

694: \end{lemma}

695:

696: The following lemma gives

697: us the desired continuous invertibility of both

698: $\sigma_{\theta_0}$ and the operator

699: $\psi\mapsto\psi(\sigma_{\theta_0}(\cdot))$. This last operator will

700: be needed for weak convergence of regular parameters.

701: \begin{lemma}\label{l4}

702: Under the regularity conditions of section~2,

703: the linear operator $\sigma_{\theta_0}: \mathcal{H}_{\infty}

704: \rightarrow \mathcal{H}_{\infty}$ is

705: continuously invertible and onto, with inverse $\sigma_{\theta_0}^{-1}$.

706: Moreover, the linear operator $\psi\mapsto \psi(\sigma_{\theta_0}(\cdot))$,

707: as a map from and to $\mbox{lin}\,\Psi$,

708: is also continuously invertible and onto, with inverse

709: $\psi\mapsto\psi(\sigma_{\theta_0}^{-1}(\cdot))$.

710: \end{lemma}

711:

712: \section{The convergence rates of the estimators}

713:

714: To determine the convergence rates of the estimators, we need to

715: study closely the log-likelihood process $\tilde{L}_n(\theta)$

716: near its maximizer. In the parametric setting, this process

717: can be approximated by its expectation, which can be shown to

718: be locally concave.  For the Cox model, as in \cite{p03}, this

719: same procedure can be applied to the partial likelihood, which

720: shares the local concavity features of a parametric likelihood.

721: Unfortunately, in our present set-up, studying the expectation

722: of $\tilde{L}_n(\theta)$ will lead to problems since $A_0$ has

723: a density and thus $\Delta A_0(t)=0$ for all $t\in[0,\tau]$.

724: Hence $\tilde{L}_n(\theta_0)=-\infty$, and

725: a new approach is needed. The approach we take involves a

726: careful reparameterization of $\hat{A}_n$.

727:

728: From section~4, we know that the maximizer satisfies $\hat{A}_n(t)=

729: \int_0^t\left\{\pp_n W(s;\hat{\theta}_n)\right\}^{-1}$

730: $\times d\tilde{G}_n(s)$,

731: where $\tilde{G}_n(t)\equiv\pp_n N(t)$ and $W(\cdot;\cdot)$ is

732: as defined in~(\ref{c4:e2}). It is easy to see that for all $n$

733: large enough and all $\theta$ sufficiently close to $\theta_0$,

734: $t\mapsto\pp_n W(t;\theta)$ is bounded below and above and in

735: total variation, with large probability.

736: Thus, if we use the reparameterization

737: $\Gamma(\cdot)\mapsto A^{(\Gamma)}_n(\cdot)\equiv\int_0^{(\cdot)}

738: \exp\{-\Gamma(s)\}d\tilde{G}_n(s)$, and

739: maximize $\tilde{L}_n(\xi,A^{(\Gamma)}_n)$ over $\xi$ and $\Gamma$, where

740: $\Gamma\in BV$, we will achieve the same NPMLE as before. Note that

741: the $\Gamma$ component of the maximizer of $\tilde{L}_n(\xi,A^{(\Gamma)}_n)$

742: is therefore just $\hat{\Gamma}_n(\cdot)\equiv\log\pp_n W(\cdot;

743: \hat{\theta}_n)$.

744:

745: Define $\Gamma_0(\cdot)\equiv\log(PW(\cdot;\theta_0))$ and

746: $\theta_n(\zeta,\gamma,\Gamma)\equiv(\zeta,\gamma,A^{(\Gamma)}_n)$,

747: and note that

748: the reparameterized NPMLE $(\hat{\zeta}_n,\hat{\gamma}_n,\hat{\Gamma}_n)$

749: is the maximizer of the process

750: \begin{eqnarray*}

751: \lefteqn{

752: (\zeta,\gamma,\Gamma)\mapsto\tilde{X}_n(\zeta,\gamma,\Gamma)

753: \;\equiv\;\tilde{L}_n(\zeta,\gamma,A^{(\Gamma)}_n)-\tilde{L}_n(\zeta_0,

754: \gamma_0,A^{(\Gamma_0)}_n)}&&\\

755: &&\mbox{\hspace{-0.1in}}=\pp_n\left\{\int_0^{\tau}\left[-\Gamma(t)+\Gamma_0(t)

756: +\log\frac{\dot{G}(H^{\theta_n(\zeta,\gamma,\Gamma)}(t))}

757: {\dot{G}(H^{\theta_n(\zeta_0,\gamma_0,\Gamma_0)}(t))} +

758: (r_{\xi}-r_{\xi_0})(t;Z,Y)\right]\right.\\

759: &&\mbox{\hspace{0.2in}}

760: \left.\rule[-0.3cm]{0cm}{1.0cm}\times dN(t)

761: -(G(H^{\theta_n(\zeta,\gamma,\Gamma)}(V))

762: -G(H^{\theta_n(\zeta_0,\gamma_0,\Gamma_0)}(V)))

763: \right\}.

764: \end{eqnarray*}

765: We will argue shortly that $\tilde{X}_n$ is uniformly consistent for

766: the function

767: \begin{eqnarray*}

768: \lefteqn{(\zeta,\gamma,\Gamma)\mapsto\tilde{X}(\zeta,\gamma,\Gamma)}&&\\

769: &\equiv&P\left\{\int_0^{\tau}\left[-\Gamma(t)+\Gamma_0(t)

770: +\log\frac{\dot{G}(H^{\theta_0(\zeta,\gamma,\Gamma)}(t))}

771: {\dot{G}(H^{\theta_0}(t))} +

772: (r_{\xi}-r_{\xi_0})(t;Z,Y)\right]dN(t)\right.\\

773: &&\left.\rule[-0.3cm]{0cm}{1.0cm}-(G(H^{\theta_0(\zeta,\gamma,\Gamma)}(V))

774: -G(H^{\theta_0}(V)))

775: \right\},

776: \end{eqnarray*}

777: where $\theta_0(\zeta,\gamma,\Gamma)\equiv(\zeta,\gamma,

778: A^{(\Gamma)}_0)$, $A^{(\Gamma)}_0(\cdot)\equiv\int_0^{(\cdot)}

779: \exp\{-\Gamma(s)\}d\tilde{G}_0(s)$, and $\tilde{G}_0(t)\equiv P N(t)$.

780: It will occasionally be useful to use the shorthand

781: $\lambda\equiv(\gamma,\Gamma)$,

782: $\hat{\lambda}_n\equiv(\hat{\gamma}_n,\hat{\Gamma}_n)$ and

783: $\lambda_0\equiv(\gamma_0,\Gamma_0)$.

784:

785: Define the modified parameter space

786: $\Theta^{\ast}\equiv (a,b)\times\Upsilon\times B_2\times B_1

787: \times BV$; and, for each

788: $h=(h_1,h_2,h_3,h_4,h_5)\in\re\times{\cal H}_{\infty}$, define

789: the metric $\rho_2(h)\equiv(|h_1|+|h_2|^2+\|h_3\|^2+\|h_4\|^2

790: +\|h_5\|_{\infty}^2)^{1/2}$, where $\|\cdot\|_{\infty}$ is the uniform

791: norm. Note that $|h_1|$ is deliberately not squared.

792: For each $\epsilon>0$ and $k<\infty$, define $B_{\epsilon}^{\ast k}

793: \equiv\{(\zeta,\lambda)\in\Theta^{\ast}:

794: \rho_2((\zeta,\lambda)-(\zeta_0,\lambda_0))<\epsilon,\|\Gamma\|_v\leq k\}$.

795: Note that for some $k_0<\infty$ and

796: any $\epsilon>0$, $(\hat{\zeta}_n,\hat{\lambda}_n)$

797: is in $B_{\epsilon}^{\ast k_0}$ for all

798: $n$ large enough by theorem~\ref{t1} above combined with

799: lemma~\ref{l5} below:

800: \begin{lemma}\label{l5}

801: There exists a $k_0<\infty$ such that

802: $\limsup_{n\rightarrow\infty}\|\hat{\Gamma}_n\|_v\leq k_0$ and

803: $\lim_{n\rightarrow\infty}\|\hat{\Gamma}_n-\Gamma_0\|_{\infty}=0$

804: outer almost surely.

805: \end{lemma}
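As a brief illustration of why $|h_1|$ enters $\rho_2$ unsquared (a sketch that uses only the definition of $\rho_2$ given above), note that each summand of $\rho_2(h)^2$ is bounded by the total, so that
\[\rho_2(h)<\epsilon\;\;\Longrightarrow\;\;|h_1|<\epsilon^2
\;\;\mbox{and}\;\;|h_2|\vee\|h_3\|\vee\|h_4\|\vee\|h_5\|_{\infty}<\epsilon.\]
Thus membership in $B_{\epsilon}^{\ast k}$ constrains $|\zeta-\zeta_0|$ on the scale $\epsilon^2$ and the remaining components on the scale $\epsilon$.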

806:

807: Now we study the local behavior of $\tilde{X}$. First

808: fix $\zeta\in(a,b)$. Since, for any $g\in BV$,

809: \[\left.\frac{\partial A^{(\Gamma+t g)}_0(\cdot)}{\partial t}\right|_{t=0}

810: =-\int_0^{(\cdot)}g(s)dA^{(\Gamma)}_0(s),\]

811: we obtain that the first derivative of $(\gamma,\Gamma)\mapsto

812: \tilde{X}(\zeta,\gamma,\Gamma)$ in the direction $h\in{\cal H}_{\infty}$,

813: is precisely $-PU_{\zeta}^{\tau}(\gamma,A^{(\Gamma)}_0)(h)$. Moreover,

814: by definition of the score and information operators,

815: the second derivative in the same direction is

816: $-\psi^h_{\Gamma}\left(\sigma_{\left(\zeta,\gamma,

817: A^{(\Gamma)}_0\right)}(h)\right)$,

818: where $\psi^h_{\Gamma}\equiv\left(h_1,h_2,h_3,\int_0^{(\cdot)}

819: h_4(s)dA^{(\Gamma)}_0(s)\right)$. At the point $(\zeta,\gamma,\Gamma)=

820: (\zeta_0,\gamma_0,\Gamma_0)$, the first derivative is $0$,

821: while the second derivative is $<0$, by lemma~\ref{l4}.

822: By the smoothness of the score

823: and information operators ensured by conditions~D1 and~D2,

824: and by the arbitrariness

825: of $h$, we now have that the function

826: $(\gamma,\Gamma)\mapsto\tilde{X}(\zeta,\gamma,\Gamma)$

827: is concave for every $(\zeta,\gamma,\Gamma)\in B_{\epsilon}^{\ast k_0}$, for

828: sufficiently small $\epsilon$.

829:

830: Now note that $\tilde{X}(\zeta,\gamma,\Gamma)=P l^{\ast}(\zeta,\gamma,

831: \Gamma)-P l^{\ast}(\zeta_0,\gamma_0,\Gamma_0)$, where

832: $l^{\ast}(\zeta,\gamma,\Gamma)\equiv$

833: \begin{eqnarray}

834: &&\;\;-\int_0^{\tau}\Gamma(t)dN(t)

835: +l_1^{\psi(\gamma,\Gamma)}(V,\delta,Z)\ind\{Y\leq\zeta\}

836: +l_2^{\psi(\gamma,\Gamma)}(V,\delta,Z)

837: \ind\{Y>\zeta\},\label{new.j14.e1}

838: \end{eqnarray}

839: and where $l_j^{\psi}$, $j=1,2$, are as defined in section~3, and

840: $\psi(\gamma,\Gamma)\equiv(\gamma,A^{(\Gamma)}_0)$.

841: By condition~B2, we now have that for small enough $\epsilon>0$,

842: $\zeta\mapsto\tilde{X}(\zeta,\gamma,\Gamma)$

843: is right and left continuously differentiable for all

844: $(\zeta,\gamma,\Gamma)\in B_{\epsilon}^{\ast k_0}$,

845: with left partial derivative

846: \[\dot{X}_{\zeta}^{-}(\gamma,\Gamma)

847: \equiv P\left\{\left.l_1^{\psi(\gamma,\Gamma)}(V,\delta,Z)

848: -l_2^{\psi(\gamma,\Gamma)}(V,\delta,Z)\right|Y=\zeta\right\}\]

849: and right partial derivative

850: \[\dot{X}_{\zeta}^{+}(\gamma,\Gamma)

851: \equiv P\left\{\left.l_1^{\psi(\gamma,\Gamma)}(V,\delta,Z)

852: -l_2^{\psi(\gamma,\Gamma)}(V,\delta,Z)\right|Y=\zeta+\right\}.\]

853:

854: We now have the following lemmas on the local behavior of $\tilde{X}$

855: with respect to $\zeta$:

856: \begin{lemma}\label{l6}

857: Under the conditions of section~2,

858: $\dot{X}_{\zeta_0}^{-}(\gamma_0,\Gamma_0)>0$ and

859: $\dot{X}_{\zeta_0}^{+}(\gamma_0,\Gamma_0)<0$.

860: \end{lemma}

861: \begin{lemma}\label{l7}

862: There exist $\epsilon_1,k_1>0$ such that

863: $\tilde{X}(\zeta,\gamma,\Gamma)\leq -k_1|\zeta-\zeta_0|$

864: for all $(\zeta,\gamma,\Gamma)\in B_{\epsilon_1}^{\ast k_0}$.

865: \end{lemma}

866:

867: The two previous lemmas can be combined with the next lemma,

868: lemma~\ref{l8}, to yield $\sqrt{n}$ rates for all of the parameters

869: (theorem~\ref{t.l9}):

870: \begin{lemma}\label{l8}

871: There exists an $\epsilon_2>0$ such that $D_n\equiv\sqrt{n}(\tilde{X}_n

872: -\tilde{X})$ converges weakly to a tight mean zero Gaussian process

873: $D_0$, in $\ell^{\infty}(B_{\epsilon_2}^{\ast k_0})$, for which

874: $D_0(\zeta,\gamma,\Gamma)\rightarrow 0$ in probability, as

875: $\rho_2((\zeta,\gamma,\Gamma)-(\zeta_0,\gamma_0,\Gamma_0))

876: \rightarrow 0$.$\Box$

877: \end{lemma}

878:

879: \begin{theorem}\label{t.l9}

880: Under the conditions of section~2,

881: $\sqrt{n}|\hat{\zeta}_n-\zeta_0|=O_P(1)$,

882: $\sqrt{n}\|\hat{\psi}_n-\psi_0\|_{\infty}=O_P(1)$, and

883: $\sqrt{n}\|\hat{\Gamma}_n-\Gamma_0\|_{\infty}=O_P(1)$.

884: \end{theorem}

885:

886: To refine the rate for $\hat{\zeta}_n$, we need two more lemmas,

887: lemmas~\ref{l10} and~\ref{l11} below. We will

888: also need to define the process $\zeta\mapsto\tilde{X}_n^{\ast}(\zeta)

889: \equiv$

890: \begin{eqnarray*}

891: &&\pp_n\left\{\int_0^{\tau}\left[

892: \log\frac{\dot{G}(H^{\theta_0(\zeta,\gamma_0,\Gamma_0)}(t))}

893: {\dot{G}(H^{\theta_0}(t))} +

894: (r_{(\zeta,\gamma_0)}-r_{\xi_0})(t;Z,Y)\right]dN(t)\right.\\

895: &&\mbox{\hspace{0.4in}}

896: \left.\rule[-0.3cm]{0cm}{1.0cm}-(G(H^{\theta_0(\zeta,\gamma_0,\Gamma_0)}(V))

897: -G(H^{\theta_0}(V)))\right\}.

898: \end{eqnarray*}

899: \begin{lemma}\label{l10}

900: $0\leq \tilde{X}_n(\hat{\zeta}_n,\hat{\lambda}_n)-

901: \tilde{X}_n^{\ast}(\hat{\zeta}_n)\leq O_P(n^{-1})$.

902: \end{lemma}

903: \begin{lemma}\label{l11}

904: There exist an $\epsilon_3>0$ and a $k_2<\infty$ such that,

905: for all $0\leq\epsilon\leq\epsilon_3$ and $n\geq 1$,

906: $\Exp{\sup_{|\zeta-\zeta_0|\leq\epsilon}|\tilde{D}_n(\zeta)|}

907: \leq k_2\sqrt{\epsilon}$,

908: where $\tilde{D}_n(\zeta)\equiv\sqrt{n}(\tilde{X}_n^{\ast}(\zeta)

909: -\tilde{X}(\zeta,\lambda_0))$.

910: \end{lemma}

911:

912: We now have the following theorem about the convergence rate for

913: $\hat{\zeta}_n$:

914: \begin{theorem}\label{t2}

915: Under the conditions of section~2, $n|\hat{\zeta}_n-\zeta_0|=O_P(1)$.

916: \end{theorem}

917:

918: {\it Proof.} The method of proof involves a ``peeling device'' (see,

919: for example, the proof of theorem~5.1 of \cite{ih81},

920: or the proof of theorem~2 of \cite{p03}).

921: Fix $\epsilon>0$. By consistency and lemma~\ref{l5},

922: $P((\hat{\zeta}_n,\hat{\lambda}_n)\in B_{\epsilon_4}^{\ast k_0})

923: \geq 1-\epsilon$ for

924: all $n$ large enough, where $\epsilon_4=

925: \epsilon_1\wedge \epsilon_2\wedge\epsilon_3$.

926: By lemma~\ref{l10}, there exists an $M_1^{\ast}<\infty$ such that

927: $P(\tilde{X}_n(\hat{\zeta}_n,\hat{\lambda}_n)-

928: \tilde{X}_n^{\ast}(\hat{\zeta}_n)>M_1^{\ast}/n)\leq\epsilon$.

929: For integers $k\geq 1$, let $m_k\equiv k^4$. We now have, for any

930: integer $k\geq 1$, that

931: $\limsup_{n\rightarrow\infty}P\left(n|\hat{\zeta}_n-\zeta_0|>m_k\right)$

932: \begin{eqnarray}

933: &\leq&\limsup_{n\rightarrow\infty}P\left(

934: n|\hat{\zeta}_n-\zeta_0|>m_k,\;(\hat{\zeta}_n,\hat{\lambda}_n)

935: \in B_{\epsilon_4}^{\ast k_0},\right.\nonumber\\

936: &&\left.\tilde{X}_n(\hat{\zeta}_n,\hat{\lambda}_n)

937: -\tilde{X}_n^{\ast}(\hat{\zeta}_n)

938: \leq \frac{M_1^{\ast}}{n}\right)+2\epsilon\nonumber\\

939: &\leq&\limsup_{n\rightarrow\infty}P\left(\sup_{\zeta:\,m_k/n<|\zeta-\zeta_0|

940: \leq\epsilon_4}\tilde{X}_n^{\ast}(\zeta)\geq -\frac{M_1^{\ast}}{n}\right)

941: +2\epsilon\nonumber\\

942: &\leq&\limsup_{n\rightarrow\infty}\sum_{j=k}^{k_{\epsilon_4}}

943: P\left(\sup_{\zeta:\,m_j/n<|\zeta-\zeta_0|\leq (m_{j+1}/n)

944: \wedge\epsilon_4}\tilde{D}_n(\zeta)\right.\label{t2.e1}\\

945: &&\left.\mbox{\hspace{0.4in}}\geq\sqrt{n}\left(\frac{k_1m_j}{n}

946: -\frac{M_1^{\ast}}{n}\right)\right)+2\epsilon,\nonumber

947: \end{eqnarray}

948: by lemma~\ref{l7}, where $k_{\epsilon_4}=

949: \min\{k:\,m_{k+1}\geq n\epsilon_4\}$. But, by lemma~\ref{l11},

950: \[\mbox{(\ref{t2.e1})}\leq\limsup_{n\rightarrow\infty}

951: \sum_{j=k}^{k_{\epsilon_4}}\frac{k_2\sqrt{m_{j+1}}}{k_1m_j-M_1^{\ast}}

952: +2\epsilon\leq\sum_{j=k}^{\infty}\frac{k_2(j+1)^2}

953: {k_1j^4-M_1^{\ast}}+2\epsilon.\]

954: We can now choose $k<\infty$ large enough so that this last term

955: $\leq 3\epsilon$. Since $\epsilon>0$ was arbitrary, we now have that

956: $\lim_{m\rightarrow\infty}\limsup_{n\rightarrow\infty}

957: P(n|\hat{\zeta}_n-\zeta_0|>m)=0$, and the desired conclusion follows.$\Box$
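For the reader's convenience, the passage from~(\ref{t2.e1}) to the bound displayed after it can be spelled out as follows. This is only a sketch of one way to fill in the step: it uses Markov's inequality together with lemma~\ref{l11} applied with $\epsilon=(m_{j+1}/n)\wedge\epsilon_4\leq\epsilon_3$, and it assumes $j$ is large enough that $k_1m_j>M_1^{\ast}$:
\begin{eqnarray*}
P\left(\sup_{|\zeta-\zeta_0|\leq (m_{j+1}/n)\wedge\epsilon_4}
|\tilde{D}_n(\zeta)|\geq\frac{k_1m_j-M_1^{\ast}}{\sqrt{n}}\right)
&\leq&\frac{\sqrt{n}}{k_1m_j-M_1^{\ast}}\;k_2\sqrt{\frac{m_{j+1}}{n}}\\
&=&\frac{k_2\sqrt{m_{j+1}}}{k_1m_j-M_1^{\ast}}.
\end{eqnarray*}
With $m_j=j^4$, the summand equals $k_2(j+1)^2/(k_1j^4-M_1^{\ast})$. Whenever $k_1j^4\geq 2M_1^{\ast}$, this is at most $2k_2(j+1)^2/(k_1j^4)\leq 8k_2/(k_1j^2)$, since $(j+1)^2\leq 4j^2$ for $j\geq 1$. Because $\sum_{j\geq k}j^{-2}\leq 1/(k-1)$ for $k\geq 2$, the tail sum over $j\geq k$ is then at most $8k_2/\{k_1(k-1)\}$, which tends to zero as $k\rightarrow\infty$.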

958:

959: \section{Weak convergence of the estimators}

960:

961: \subsection{The asymptotic distribution of the change-point

962: estimator}

963:

964: Denote $\mathbb{U}_{n,M}\equiv\{u=n(\zeta-\zeta_0):\zeta\in[a,b],|u|\leq M\}$

965: and $\zeta_{n,u}\equiv\zeta_0+u/n$.

966: The limiting distribution of $n(\hat{\zeta}_n-\zeta_0)$

967: will be deduced from the behavior of the restriction of

968: the process $u \mapsto n[\tilde{L}_n(\hat{\psi}_n,\zeta_{n,u})

969: -\tilde{L}_n(\hat{\psi}_n,\zeta_0)]$ to

970: the compact set $\mathbb{U}_{n,M}$, for $M$ sufficiently large.

971: \begin{theorem}\label{t3}

972: The following approximation holds for all $M > 0$, as $n \rightarrow \infty$:

973: \[u\mapsto n[\tilde{L}_n(\hat{\psi}_n,\zeta_{n,u})

974: -\tilde{L}_n (\hat{\psi}_n,\zeta_0)]=

975: Q_n(u)+o_P^{\mathbb{U}_{n,M}}(1),\]

976: where $o_P^B(1)$ denotes a term going to zero in probability uniformly

977: over the set $B$ and $Q_n(u)=$

978: \begin{eqnarray*}

979: n\mathbb{P}_n

980: \left\{\left(\ind\{\zeta_{n,u}<Y \le \zeta_0\}

981: - \ind\{\zeta_0<Y \le \zeta_{n,u}\}\right)

982: \left[l_2^{\psi_0}(V,\delta,Z)-l_1^{\psi_0}(V,\delta,Z)\right]\right\}.

983: \end{eqnarray*}

984: \end{theorem}

985:

986: Let $Q_n(u)=Q_n^+(u)\ind\{u>0\}-Q_n^-(u)\ind\{u<0\}$.

987: We now study the weak convergence of $Q_n$ as a random variable on the

988: space of cadlag functions $D$ with the Skorohod topology, and on

989: its restriction to the space $D_M$ of cadlag functions on $[-M,

990: M]$, for any $M > 0$, similar to the approach taken in \cite{p03}.

991: In order to describe the asymptotic distribution of $Q_n$,

992: let $\nu^+$ and $\nu^-$ be two independent jump processes on~$\mathbb{R}$

993: such that $\nu^+ (s)$ is a Poisson variable with parameter

994: $s^+\tilde{h}(\zeta_0)$ and $\nu^- (s)$ is a Poisson variable with

995: parameter $(-s)^+\tilde{h}(\zeta_0)$. Here,

996: $s^+$ denotes $s\vee 0$. Let $(\check{V}_k^+)_{k\ge 1}$

997: and $(\check{V}_k^-)_{k \ge 1}$ be independent sequences of i.i.d.

998: random variables with characteristic functions

999: \[\phi^+(t)=P\left[e^{it\check{V}_k^+}\right]

1000: =P\left[\left.e^{it\left\{l_1^{\psi_0}(V,\delta,Z)-l_2^{\psi_0}(V,\delta,Z)

1001: \right\}}\right|Y=\zeta_0^+\right],\]

1002: and

1003: \[\phi^-(t)=P\left[e^{it\check{V}_k^-}\right]

1004: =P\left[\left.e^{it\left\{l_1^{\psi_0}(V,\delta,Z)-l_2^{\psi_0}(V,\delta,Z)

1005: \right\}}\right|Y=\zeta_0\right],\]

1006: respectively, where $(\check{V}_k^+)_{k \ge 1}$ and

1007: $(\check{V}_k^-)_{k \ge 1}$ are independent of $\nu^+$ and $\nu^-$.

1008:

1009: Let $Q(s) = Q^+(s)\ind\{s > 0\} - Q^-(s)\ind\{s < 0\}$ be the

1010: right-continuous jump process defined by

1011: \[Q^+(s)=\sum_{0 \le k \le {\nu}^+(s)}\check{V}_k^+, ~ ~ ~

1012: Q^-(s)=\sum_{0 \le k \le {\nu}^-(s+)}\check{V}_k^-, \]

1013: where $\check{V}_0^+=\check{V}_0^-=0$.

1014: Using a modification of the arguments in \cite{p03}, we obtain:

1015: \begin{theorem}\label{t4}

1016: Under the regularity conditions of section~2,

1017: the process $Q_n$ converges weakly to $Q$ in $D_M$, for every $M > 0$;

1018: $n(\hat{\zeta}_n - \zeta_0) = \argmax_{u}Q_n(u) + o_p(1)$ which

1019: converges weakly to $\hat{v}_Q \equiv \argmin\{|v| : Q(v) =

1020: \argmax\,Q\}$; and $n(\hat{\zeta}_n-\zeta_0)$ and

1021: $\sqrt{n}\pp_nU_{\zeta_0}^{\tau}(\psi_0)(h)$ are asymptotically

1022: independent for all $h\in{\cal H}_{\infty}$.

1023: \end{theorem}
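To see informally why the maximizer of $Q$ should be finite, one can compute the mean of the limiting process. The following is a sketch under assumptions not stated explicitly above: that the jumps $\check{V}_1^{+}$ and $\check{V}_1^{-}$ are integrable, that $\tilde{h}(\zeta_0)>0$, and that $\psi(\gamma_0,\Gamma_0)=\psi_0$, so that $E[\check{V}_1^{+}]=\dot{X}_{\zeta_0}^{+}(\gamma_0,\Gamma_0)$ and $E[\check{V}_1^{-}]=\dot{X}_{\zeta_0}^{-}(\gamma_0,\Gamma_0)$. Since the jumps are independent of the Poisson counts,
\begin{eqnarray*}
E[Q(s)]&=&E[\nu^+(s)]\,E[\check{V}_1^{+}]
\;=\;s\,\tilde{h}(\zeta_0)\,\dot{X}_{\zeta_0}^{+}(\gamma_0,\Gamma_0),
\;\;\;s>0,\\
E[Q(s)]&=&-E[\nu^-(s+)]\,E[\check{V}_1^{-}]
\;=\;s\,\tilde{h}(\zeta_0)\,\dot{X}_{\zeta_0}^{-}(\gamma_0,\Gamma_0),
\;\;\;s<0.
\end{eqnarray*}
By lemma~\ref{l6}, both expressions are strictly negative, which suggests that $Q$ drifts downward as $|s|$ grows in either direction.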

1024:

1025: \subsection{Asymptotic normality of the regular parameters} We use

1026: Hoffmann-J{\o}rgensen weak convergence as described in \cite{vw96}.

1027: We have the

1028: following result:

1029: \begin{theorem}\label{t5}

1030: Under the conditions of theorem~\ref{t1}, $\sqrt{n}(\hat \psi_n -

1031: \psi_0)$ is asymptotically linear, with influence function $\tilde

1032: l(h) = U_{\zeta_0}^{\tau}(\psi_0)(\sigma_{\theta_0}^{-1}(h))$, $h

1033: \in{\cal H}_1$, converging weakly in the uniform norm to a tight, mean

1034: zero Gaussian process $\mathbb{Z}$ with covariance

1035: $E[\tilde l(g) \tilde l(h)]$, for all $g, h \in {\cal H}_1$. Thus

1036: $n(\hat{\zeta}_n-\zeta_0)$ and $\sqrt{n}(\hat{\psi}_n-\psi_0)$

1037: are asymptotically independent.

1038: \end{theorem}

1039:

1040: \begin{remark}\label{r1}

1041: Since $\sqrt{n}(\hat \psi_n - \psi_0)$ is asymptotically linear,

1042: with influence function contained in

1043: the closed linear span of the tangent space (since

1044: $\sigma_{\theta_0}$ is continuously invertible), $\hat\psi_n$ is

1045: regular and hence as efficient as if $\zeta_0$ were known, by

1046: Theorem 5.2.3 and Theorem 5.2.1 of \cite{bkrw98}.

1047: \end{remark}

1048:

1049: \section{Inference when $\alpha_0\neq 0$ or $\eta_0\neq 0$}

1050: In this section we develop Monte Carlo methods for inference for the parameter estimators when

1051: it is known that either $\alpha_0\neq 0$ or $\eta_0\neq 0$, i.e., it is known that condition~C2

1052: is satisfied. In section~9,

1053: we develop a hypothesis testing procedure to assess whether

1054: $H_0:\alpha_0=0=\eta_0$ holds (i.e., that~C2 does not hold). When it is known that $H_0$ holds,

1055: the model reduces to the usual transformation model

1056: (see \cite{sv04}),

1057: and thus validity of the bootstrap will follow from arguments

1058: similar to those used in the proof of

1059: corollary~1 of \cite{klf04}.

1060:

1061: \subsection{Inference for the change-point} One possibility for

1062: inference for $\zeta$ is to use the subsampling bootstrap \cite{pr94}

1063: which is guaranteed to work, provided the subsample

1064: sizes $\ell_n$ satisfy $\ell_n\rightarrow\infty$ and $\ell_n/n\rightarrow 0$.

1065: However, this approach is computationally intensive since,

1066: for each subsample, the likelihood must be maximized over the entire

1067: parameter space. To ameliorate the computational strain, we propose as

1068: an alternative the following specialized parametric bootstrap.

1069: Let $\tilde{F}_+$ and $\tilde{F}_-$ be the

1070: distribution functions corresponding to the characteristic functions

1071: $\phi^+$ and $\phi^-$, respectively. We need to make the following

1072: additional assumption:

1073: \begin{enumerate}

1074: \item[B5:] Both $\tilde{F}_+$ and $\tilde{F}_-$ are continuous.

1075: \end{enumerate}

1076: Now let $\tilde{m}_n$ be the minimum of the number of $Y$ observations

1077: in the sample $>\hat{\zeta}_n$ and the number of $Y$ observations $<\hat{\zeta}_n$. Now

1078: choose sequences of possibly data dependent integers $1\leq C_{1,n}<C_{2,n}\leq \tilde{m}_n$

1079: such that $C_{1,n}\rightarrow\infty$,

1080: $C_{2,n}-C_{1,n}\rightarrow\infty$, and

1081: $C_{2,n}/n\rightarrow 0$, in probability, as $n\rightarrow\infty$.

1082: Note that if one

1083: chooses $C_{1,n}$ to be the closest integer to $\tilde{m}_n^{1/4}$ and $C_{2,n}$ to be

1084: the closest integer to $\tilde{m}_n^{3/4}$,

1085: the given requirements will be satisfied since

1086: $\tilde{m}_n\rightarrow\infty$, in probability, by assumption~B1. Let

1087: $X_{(1)},\ldots,X_{(n)}$ be the complete data observations corresponding

1088: to the order statistics $Y_{(1)},\ldots,Y_{(n)}$ of the $Y$ observations.

1089: Also let $\tilde{k}_n\equiv C_{2,n}-C_{1,n}+1$, and define $\tilde{l}_n$

1090: to be the integer satisfying $\hat{\zeta}_n=Y_{(\tilde{l}_n)}$.

1091: The existence of this integer follows from the form of the MLE.
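As a purely illustrative numerical example (the value of $\tilde{m}_n$ used here is hypothetical), suppose $\tilde{m}_n=10^4$. The choices suggested above then give
\[C_{1,n}=\tilde{m}_n^{1/4}=10,\;\;\;C_{2,n}=\tilde{m}_n^{3/4}=1000,\;\;\;
\tilde{k}_n=C_{2,n}-C_{1,n}+1=991.\]
More generally, since $\tilde{m}_n\leq n$, rounding to the nearest integer gives $C_{2,n}/n\leq n^{-1/4}+(2n)^{-1}$ and $C_{2,n}-C_{1,n}\geq\tilde{m}_n^{3/4}-\tilde{m}_n^{1/4}-1$, which is one way to check the stated requirements once $\tilde{m}_n\rightarrow\infty$.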

1092:

1093: Now, for $j=1,\ldots,\tilde{k}_n$, and any $\psi\in\Psi$, define

1094: \begin{eqnarray*}

1095: \check{V}_{j,\psi}^+&\equiv& l_1^{\psi}(

1096: V_{(\tilde{l}_n+C_{1,n}+j-1)},\delta_{(\tilde{l}_n+C_{1,n}+j-1)},Z_{(\tilde{l}_n+C_{1,n}+j-1)})\\

1097: &&-l_2^{\psi}(V_{(\tilde{l}_n+C_{1,n}+j-1)},\delta_{(\tilde{l}_n+C_{1,n}+j-1)},

1098: Z_{(\tilde{l}_n+C_{1,n}+j-1)}),\\

1099: \check{V}_{j,\psi}^-&\equiv&l_1^{\psi}(

1100: V_{(\tilde{l}_n-C_{1,n}-j)},\delta_{(\tilde{l}_n-C_{1,n}-j)},Z_{(\tilde{l}_n-C_{1,n}-j)})\\

1101: &&-l_2^{\psi}(V_{(\tilde{l}_n-C_{1,n}-j)},\delta_{(\tilde{l}_n-C_{1,n}-j)},

1102: Z_{(\tilde{l}_n-C_{1,n}-j)}),

1103: \end{eqnarray*}

1104: $Y^+_j\equiv Y_{(\tilde{l}_n+C_{1,n}+j-1)}$, and $Y^-_j\equiv Y_{(\tilde{l}_n-C_{1,n}-j)}$. Also let $\hat{F}_+^n$ be

1105: the data-dependent distribution function

1106: for a random variable drawn with replacement from

1107: $\{\check{V}_{1,\hat{\psi}_n}^+,\ldots,\check{V}_{\tilde{k}_n,

1108: \hat{\psi}_n}^+\}$, and let $\hat{F}_-^n$ be

1109: the data-dependent distribution function

1110: for a random variable drawn with replacement from

1111: $\{\check{V}_{1,\hat{\psi}_n}^-,\ldots,$ $\check{V}_{\tilde{k}_n,

1112: \hat{\psi}_n}^-\}$. By the smoothness

1113: of the terms involved, it is easy to verify

1114: that both $\sup_{1\leq j\leq\tilde{k}_n}$ $\left|\check{V}_{j,\hat{\psi}_n}^+

1115: -\check{V}_{j,\psi_0}^+\right|=o_P(1)$ and

1116: $\sup_{1\leq j\leq\tilde{k}_n}\left|\check{V}_{j,\hat{\psi}_n}^-

1117: -\check{V}_{j,\psi_0}^-\right|=o_P(1)$. Moreover, by assumption~B2(i), the

1118: fact that $n(\hat{\zeta}_n-\zeta_0)=O_P(1)$, and the conditions on $C_{1,n}$

1119: and $C_{2,n}$, we have that both $P(Y^-_{1}<\zeta_0<Y^+_{1})\rightarrow 1$ and

1120: $Y^+_{\tilde{k}_n}-Y^-_{\tilde{k}_n}=o_P(1)$. Thus, by assumption~B2(ii),

1121: the collection $\{\check{V}^+_{1,\psi_0},\ldots,$

1122: $\check{V}^+_{\tilde{k}_n,\psi_0}\}$ converges

1123: in distribution to an i.i.d. sample of random

1124: variables with characteristic function

1125: $\phi^+$, while the collection $\{\check{V}^-_{1,\psi_0},\ldots,

1126: \check{V}^-_{\tilde{k}_n,\psi_0}\}$ is

1127: independent of the first collection and converges

1128: in distribution to an i.i.d. sample of random

1129: variables with characteristic function $\phi^-$.  By assumption~B5

1130: and the fact that $\tilde{k}_n\rightarrow\infty$, in probability,

1131: we now have that both

1132: $\sup_{v\in\re}|\hat{F}_+^n(v)-\tilde{F}_+(v)|=o_P(1)$ and

1133: $\sup_{v\in\re}|\hat{F}_-^n(v)-\tilde{F}_-(v)|=o_P(1)$.

1134:

1135: Now let $\hat{h}_n$ be a consistent estimator of $\tilde{h}(\zeta_0)$.

1136: Such an estimator can be obtained from a kernel density estimator of

1137: $\tilde{h}$ based on the $Y$ observations and evaluated at $\hat{\zeta}_n$.

1138: The basic idea of our parametric bootstrap is to create a stochastic

1139: process $\hat{Q}_n$ defined similarly to the process $Q$ described

1140: in section~7.1. To this end,

1141: let $\hat{\nu}^+$ and $\hat{\nu}^-$ be two independent jump processes

1142: defined on the interval $\tilde{B}_n\equiv

1143: [-n(\hat{\zeta}_n-a),n(b-\hat{\zeta}_n)]$

1144: such that $\hat{\nu}^+(s)$ is Poisson with parameter $s^+\hat{h}_n$

1145: and $\hat{\nu}^-(s)$ is Poisson with parameter $(-s)^+\hat{h}_n$.

1146: Also let $(\check{V}_{\ast,k}^+)_{k\geq 1}$ and

1147: $(\check{V}_{\ast,k}^-)_{k\geq 1}$ be two independent sequences of

1148: i.i.d. random variables drawn from $\hat{F}_+^n$ and $\hat{F}_-^n$

1149: and independent of the Poisson processes. Now construct

1150: $u\mapsto\hat{Q}_n(u)\equiv\hat{Q}_n^+(u)\ind\{u>0\}

1151: -\hat{Q}_n^-(u)\ind\{u<0\}$ on the interval $\tilde{B}_n$,

1152: where $\hat{Q}_n^+(u)\equiv

1153: \sum_{0\leq k\leq\hat{\nu}^+(u)}\check{V}_{\ast,k}^+$ and

1154: $\hat{Q}_n^-(u)\equiv\sum_{0\leq k\leq\hat{\nu}^-(u+)}\check{V}_{\ast,k}^-$.

1155: Finally, we compute $\hat{v}_{\ast}\equiv\argmin_{\tilde{B}_n}\left\{|v|:

1156: \hat{Q}_n(v)=\argmax_{\tilde{B}_n}\hat{Q}_n\right\}$.

1157: The following proposition now follows from

1158: the fact that $P(K\subset\tilde{B}_n)\rightarrow 1$ for all

1159: compact $K\subset\re$:

1160: \begin{proposition}\label{p1}

1161: The conditional distribution of $\hat{v}_{\ast}$ given the data is

1162: asymptotically equal to the distribution of $\hat{v}_Q$ defined

1163: in theorem~\ref{t4}.

1164: \end{proposition}
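The construction above can be summarized as the following resampling scheme. This is a schematic restatement rather than additional methodology, and the number of replications $B$ is a hypothetical tuning constant that is not specified in the text.
\begin{enumerate}
\item[1.] From the data, compute $\hat{\zeta}_n$, $\hat{\psi}_n$, the estimate $\hat{h}_n$, and the two pools
$\{\check{V}_{j,\hat{\psi}_n}^+\}$ and $\{\check{V}_{j,\hat{\psi}_n}^-\}$, $j=1,\ldots,\tilde{k}_n$.
\item[2.] For each $b=1,\ldots,B$, generate independent jump processes $\hat{\nu}^+$ and $\hat{\nu}^-$ on $\tilde{B}_n$, draw the jumps $\check{V}_{\ast,k}^+$ and $\check{V}_{\ast,k}^-$ with replacement from the two pools (that is, from $\hat{F}_+^n$ and $\hat{F}_-^n$), form $\hat{Q}_n$, and record the maximizer of $\hat{Q}_n$ over $\tilde{B}_n$ of smallest absolute value as $\hat{v}_{\ast}^{(b)}$.
\item[3.] Use the empirical distribution of $\hat{v}_{\ast}^{(1)},\ldots,\hat{v}_{\ast}^{(B)}$ to approximate the distribution of $\hat{v}_Q$.
\end{enumerate}
Only step~2 is repeated, so no likelihood maximization is needed within the resampling loop.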

1165:

1166: Hence for any $\pi\in(0,1)$, we can consistently estimate the

1167: $\pi/2$ and $1-\pi/2$ quantiles of $\hat{v}_Q$ based

1168: on a large number of independent draws from $\hat{v}_{\ast}$;

1169: we denote these estimates by $\hat{q}_{\pi/2}$ and

1170: $\hat{q}_{1-\pi/2}$, respectively. Since $n(\hat{\zeta}_n-\zeta_0)$ converges weakly to $\hat{v}_Q$ by theorem~\ref{t4}, an asymptotically

1171: valid $1-\pi$ confidence interval for $\zeta_0$ is

1172: $[\hat{\zeta}_n-\hat{q}_{1-\pi/2}/n,\,\hat{\zeta}_n-\hat{q}_{\pi/2}/n]$.

1173:

1174: \subsection{Inference for regular parameters} Because $\hat{\zeta}_n$

1175: is $n$-consistent for $\zeta_0$, $\zeta_0$ can be treated as known

1176: in constructing inference for the regular parameters.

1177: Accordingly, we propose bootstrapping the likelihood and maximizing

1178: over $\psi$ while holding $\zeta$ fixed at $\hat{\zeta}_n$. This will

1179: significantly reduce the computational demands of the bootstrap.

1180: Also, to avoid the occurrence of ties during resampling,

1181: we suggest the following weighted bootstrap alternative

1182: to the usual nonparametric bootstrap. First generate

1183: $n$ i.i.d. positive random variables $\kappa_1,\ldots,\kappa_n$,

1184: with mean $0<\mu_{\kappa}<\infty$, variance

1185: $0<\sigma_{\kappa}^2<\infty$, and with

1186: $\int_0^{\infty}\sqrt{P(\kappa_1>u)}du<\infty$. Divide each weight

1187: by the sample average of the weights $\bar{\kappa}$, to obtain

1188: ``standardized weights'' $\kappa_1^{\circ},\ldots,\kappa_n^{\circ}$

1189: which sum to~$n$. For a real, measurable function $f$, define the

1190: weighted empirical measure $\pp_n^{\circ}f\equiv n^{-1}

1191: \sum_{i=1}^n\kappa_i^{\circ}f(X_i)$. Recall that the nonparametric bootstrap

1192: empirical measure $\pp_n^{\bullet}f\equiv n^{-1}\sum_{i=1}^n

1193: \kappa_i^{\bullet}f(X_i)$ uses multinomial weights

1194: $\kappa_1^{\bullet},\ldots,\kappa_n^{\bullet}$,

1195: where $\Exp{\kappa_i^{\bullet}}=1$, $i=1,\ldots,n$, and

1196: $\sum_{i=1}^n\kappa_i^{\bullet}=n$ almost surely.
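As an illustration of weights meeting these requirements (an example chosen here for concreteness, not a recommendation taken from the text), let $\kappa_1,\ldots,\kappa_n$ be i.i.d. unit exponential random variables. Then $\mu_{\kappa}=\sigma_{\kappa}=1$ and
\[\int_0^{\infty}\sqrt{P(\kappa_1>u)}\,du=\int_0^{\infty}e^{-u/2}\,du=2<\infty.\]
For any admissible choice of weights, the standardization gives
\[\sum_{i=1}^n\kappa_i^{\circ}=\sum_{i=1}^n\frac{\kappa_i}{\bar{\kappa}}
=\frac{n\bar{\kappa}}{\bar{\kappa}}=n,\]
which matches the constraint satisfied by the multinomial weights.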

1197:

1198: The proposed weighted bootstrap estimate $\hat{\psi}_n^{\circ}$

1199: is obtained by maximizing $\tilde{L}_n^{\circ}(\psi,\hat{\zeta}_n)$ over

1200: $\psi\in\Psi$, where $\tilde{L}_n^{\circ}$ is obtained by replacing

1201: $\pp_n$ with $\pp_n^{\circ}$ in the definition of $\tilde{L}_n$

1202: from section~3. We can similarly define a modified nonparametric

1203: bootstrap $\hat{\psi}_n^{\bullet}$ as the $\argmax$ of

1204: $\psi\mapsto\tilde{L}_n^{\bullet}(\psi,\hat{\zeta}_n)$, where

1205: $\tilde{L}_n^{\bullet}$ is obtained by replacing $\pp_n$ with

1206: $\pp_n^{\bullet}$ in the definition of $\tilde{L}_n$. The following

1207: corollary establishes the validity of both kinds

1208: of bootstraps:

1209: \begin{corollary}\label{c1}

1210: Under the conditions of theorem~\ref{t5}, the conditional

1211: bootstrap of $\hat{\psi}_n$, based on either

1212: $\hat{\psi}_n^{\bullet}$ or $\hat{\psi}_n^{\circ}$,

1213: is asymptotically consistent for

1214: the limiting distribution $\mathbb{Z}$ in the following sense:

1215: Both $\sqrt{n}(\hat{\psi}_n^{\bullet}-\hat{\psi}_n)$ and

1216: $\sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})(\hat{\psi}_n^{\circ}

1217: -\hat{\psi}_n)$ are asymptotically measurable, and both

1218: \begin{enumerate}

1219: \item[(i)] $\sup_{g\in BL_1}\left|E_{\bullet}g\left(

1220: \sqrt{n}(\hat{\psi}_n^{\bullet}-\hat{\psi}_n)\right)

1221: -Eg(\mathbb{Z})\right|\rightarrow 0$ in outer probability and

1222: \item[(ii)] $\sup_{g\in BL_1}\left|E_{\circ}g\left(

1223: \sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})

1224: (\hat{\psi}_n^{\circ}-\hat{\psi}_n)\right)

1225: -Eg(\mathbb{Z})\right|\rightarrow 0$ in outer probability,

1226: \end{enumerate}

1227: where $BL_1$ is the space of functions mapping

1228: $\re^{d+q+1}\times\ell^{\infty}[0,\tau]\mapsto\re$ which are

1229: bounded in absolute value by~1 and have Lipschitz norm $\leq 1$.

1230: Here, $E_{\bullet}$ and $E_{\circ}$ are expectations that are

1231: taken over the multinomial and standardized weights, respectively,

1232: conditional on the data.

1233: \end{corollary}

1234:

1235: \begin{remark}\label{r2}

1236: As discussed in remark~15 of \cite{klf04}, the

1237: choice of weights $\kappa_1,\ldots,\kappa_n$ in this kind of

1238: setting does not affect the first order asymptotics. However,

1239: it may have an effect on finite samples. In our experience, we

1240: have found that both exponential and truncated exponential weights

1241: perform quite well.

1242: \end{remark}
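
As a purely illustrative check of the weight conditions stated above, assume
(for this example only) that $\kappa_1$ is standard exponential, one of the
choices mentioned in remark~\ref{r2}. Then $P(\kappa_1>u)=e^{-u}$ for
$u\geq 0$, so that
\[\mu_{\kappa}=1,\;\;\;\;\sigma_{\kappa}^2=1,\;\;\;\;
\int_0^{\infty}\sqrt{P(\kappa_1>u)}\,du=\int_0^{\infty}e^{-u/2}du=2<\infty.\]
Hence all three requirements hold, and the scaling factor
$\mu_{\kappa}/\sigma_{\kappa}$ in part~(ii) of corollary~\ref{c1} equals~1
for this choice.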

1243:

1244: \section{Test for the presence of a change-point}

1245: Constructing a valid

1246: test of the null hypothesis that there is no change-point,

1247: $H_0:\alpha_0=0=\eta_0$, poses an interesting challenge.

1248: Since the location of the change-point is no longer identifiable

1249: under $H_0$, this is an example of the issue studied in

1250: \cite{a01}. The test statistic we propose is

1251: a functional of the $\alpha$ and $\eta$ components of the score process,

1252: $\zeta\mapsto

1253: \hat{S}_{1}(\zeta)\equiv\sqrt{n}\pp_n

1254: (U^{\tau}_{\zeta, 1}(\hat \psi_0),

1255: U^{\tau}_{\zeta,2}(\hat \psi_0)')'$, where $\zeta\in[a,b]$,

1256: $\hat\psi_0 \equiv (0,0, \hat\beta_0,\hat A_0)$,

1257: and where $(\hat{\beta}_0,\hat{A}_0)$

1258: is the restricted MLE of $(\beta_0, A_0)$ under the

1259: assumption that $\alpha=0$ and $\eta=0$. This MLE is relatively easy to

1260: compute since estimation of $\zeta$ is not needed. Specifically,

1261: we have from section~3, that $\hat{\psi}_0$ is the maximizer of

1262: \begin{eqnarray}

1263: \psi&\mapsto&\pp_n\left\{\delta\log(n\Delta A(V))+l_1^{\psi}(V,\delta,Z)

1264: \right\}.\label{s9.e1}

1265: \end{eqnarray}

1266: We also define for future use

1267: $h\mapsto\hat{S}_{2}(h)\equiv\sqrt{n}\pp_n

1268: (U^{\tau}_{\zeta,3}(\hat\psi_0)(h_3),U^{\tau}_{\zeta,4}(\hat\psi_0)(h_4))'$,

1269: where $h\in{\cal H}_1$. The statistic we propose using is $\hat T_n\equiv\sup

1270: _{\zeta \in [a, b]}\left\{\hat{S}_{1}'(\zeta)\hat V_n^{-1}(\zeta)\right.$

1271: $\left.\times\hat{S}_{1} (\zeta)\right\}$, where

1272: $\hat V_n (\zeta)$ is a consistent

1273: estimator of the covariance of $\hat{S}_{1}(\zeta)$.

1274:

1275: There are several reasons for us to consider the sup functional of score

1276: statistics instead of Wald or likelihood ratio statistics. Firstly, the score

1277: statistic is much less computationally intensive, which makes the bootstrap

1278: implementation feasible. Secondly, we choose the sup functional because of

1279: its guarantee to have some power under local alternatives, as argued in

1280: \cite{d87} and which we prove below. We note, however, that \cite{ap94}

1281: argue that certain weighted averages of score statistics are optimal

1282: tests in some settings. A careful analysis of the relative merits of

1283: the two approaches in our setting is beyond the scope of the current paper

1284: but is an interesting topic for future research. However, as a step in

1285: this direction, we will compare $\hat{T}_n$ with the integrated statistic

1286: $\tilde{T}_n\equiv\int_{[a,b]}\left\{\hat{S}_{1} ' (\zeta)

1287: \hat V_n^{-1}(\zeta) \hat{S}_{1} (\zeta)\right\}d\zeta$.

1288:

1289: In this section, we first discuss a Monte Carlo technique which

1290: enables computation of $\hat{V}_n(\zeta)$, so that

1291: $\hat{T}_n$ and $\tilde{T}_n$ can be calculated in the first place,

1292: as well as computation of critical values for hypothesis testing.

1293: We then discuss the asymptotic properties of the statistics

1294: under a sequence of contiguous alternatives so that power can be

1295: verified. Specifically, we assume that all the conditions of section~2

1296: hold except for C2 which we replace with

1297: \begin{enumerate}

1298: \item[C2':] For each $n\geq 1$, $\alpha_0=\alpha_{\ast}/\sqrt{n}$ and

1299: $\eta_0=\eta_{\ast}/\sqrt{n}$, for some fixed $\alpha_{\ast}\in\re$

1300: and $\eta_{\ast}\in\re^q$. The joint distribution of $(C,Z,Y)$ does

1301: not change with $n$.

1302: \end{enumerate}

1303: Note that when $\alpha_{\ast}\neq 0$ or $\eta_{\ast}\neq 0$,

1304: condition~C2' will cause the distribution of the failure time $T$,

1305: given the covariates $(Z,Y)$, to change with $n$, and the

1306: value of $\zeta_0$ will affect this distribution.

1307:

1308: \subsection{Monte Carlo computation and inference}

1309: While the nonparametric bootstrap may be a reasonable approach,

1310: it is unclear how to verify its theoretical properties in this context.

1311: We will use instead the weighted bootstrap, based on the multipliers

1312: $\kappa_1^{\circ},\ldots,\kappa_n^{\circ}$ defined in section~8.2.

1313: Let $\pp_n^{\circ}$ be the corresponding weighted empirical measure,

1314: and define $\hat{\psi}_0^{\circ}$ to be the maximizer of~(\ref{s9.e1})

1315: after replacing $\pp_n$ with $\pp_n^{\circ}$. Also let

1316: $\hat{S}_1^{\circ}(\zeta)\equiv\sqrt{n}\pp_n^{\circ}(U_{\zeta,1}^{\tau}

1317: (\hat{\psi}_0^{\circ}),U_{\zeta,2}^{\tau}(\hat{\psi}_0^{\circ})')'$.

1318: Note that the same sample of weights

1319: $\kappa_1^{\circ},\ldots,\kappa_n^{\circ}$ is used for computing

1320: both $\hat{\psi}_0^{\circ}$ and the process

1321: $\{\hat{S}_1^{\circ}(\zeta),\zeta\in[a,b]\}$, so that the proper dependence

1322: between the score statistic and $\hat{\psi}_0$ will be captured.

1323: The structure of the set-up

1324: only requires considering values of $\zeta$ in the set

1325: $\{Y_{(1)},\ldots,Y_{(n)}\}\cap[a,b]$,

1326: since $\zeta\mapsto\hat{S}_{1}^{\circ}(\zeta)$

1327: does not change over the intervals $[Y_{(j)},Y_{(j+1)})$, $1\leq j\leq n-1$.

1328: Now repeat the bootstrap

1329: procedure a large number of times $\tilde{M}_n$, to obtain

1330: the bootstrapped score processes $\hat{S}_{1,1}^{\circ},\ldots,

1331: \hat{S}_{1,\tilde{M}_n}^{\circ}$. Note that we are allowing

1332: the number of bootstraps to depend on~$n$. Define

1333: $\zeta\mapsto\hat{\mu}_n(\zeta)\equiv\tilde{M}_n^{-1}\sum_{k=1}^{\tilde{M}_n}

1334: \hat{S}_{1,k}^{\circ}(\zeta)$ and let

1335: \[\zeta\mapsto\hat{V}_n(\zeta)

1336: =\tilde{M}_n^{-1}\sum_{k=1}^{\tilde{M}_n}\left\{

1337: \hat{S}_{1,k}^{\circ}(\zeta)

1338: -\hat{\mu}_n(\zeta)\right\}\left\{

1339: \hat{S}_{1,k}^{\circ}(\zeta)

1340: -\hat{\mu}_n(\zeta)\right\}'.\]

1341: Now we can compute the test statistics $\hat{T}_n$ and $\tilde{T}_n$

1342: with this choice for $\hat{V}_n$.

1343:

1344: To estimate critical values,

1345: we compute the standardized bootstrap test statistics

1346: $\hat{T}_{n,k}^{\circ}\equiv\sup_{\zeta\in[a,b]}\left\{

1347: \left[\hat{S}_{1,k}^{\circ}(\zeta)-\hat{\mu}_n(\zeta)\right]'

1348: \hat{V}_n^{-1}(\zeta)\left[

1349: \hat{S}_{1,k}^{\circ}(\zeta)-\hat{\mu}_n(\zeta)\right]\right\}$ and

1350: $\tilde{T}_{n,k}^{\circ}\equiv\int_{[a,b]}\left\{

1351: \left[\hat{S}_{1,k}^{\circ}(\zeta)-\hat{\mu}_n(\zeta)\right]'

1352: \hat{V}_n^{-1}(\zeta)\left[

1353: \hat{S}_{1,k}^{\circ}(\zeta)-\hat{\mu}_n(\zeta)\right]\right\}d\zeta$,

1354: for $1\leq k\leq\tilde{M}_n$. For a test of size $\pi$, we compare

1355: the test statistics with the $(1-\pi)$th quantile of the

1356: corresponding $\tilde{M}_n$ standardized bootstrap statistics.

1357: The reason we subtract off the sample mean when computing

1358: the bootstrapped test statistics is to make sure that we

1359: are approximating the null distribution even when the

1360: null hypothesis may not be true. What is a little unusual about this

1361: procedure is that the bootstrap must be performed before the

1362: statistics $\hat{T}_n$ and $\tilde{T}_n$ can be calculated in

1363: the first place. We also reiterate that we are assuming the

1364: covariates $Z_i(\cdot)$ are observed at all time points

1365: $V_j\leq V_i$ for which $\delta_j=1$. As noted in section~2, we are

1366: aware that this is not necessarily valid in practice. As pointed out by

1367: a referee, this is an important issue, and it would be worth investigating

1368: whether the bootstrap weighting scheme could be

1369: modified to perform and account

1370: for imputation of the missing covariate values. Nevertheless, this issue

1371: is beyond the scope of this paper and we do not pursue it further here.
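
For convenience, the Monte Carlo procedure of this subsection can be
summarized as follows. This is only a restatement of the steps described
above, with the choice of $\tilde{M}_n$ left unspecified:
\begin{enumerate}
\item[(i)] For each $k=1,\ldots,\tilde{M}_n$, generate weights
$\kappa_1,\ldots,\kappa_n$, standardize them to obtain
$\kappa_1^{\circ},\ldots,\kappa_n^{\circ}$, compute $\hat{\psi}_0^{\circ}$
by maximizing~(\ref{s9.e1}) with $\pp_n$ replaced by $\pp_n^{\circ}$, and
evaluate $\hat{S}_{1,k}^{\circ}(\zeta)$ over
$\zeta\in\{Y_{(1)},\ldots,Y_{(n)}\}\cap[a,b]$ using the same weights.
\item[(ii)] Compute $\hat{\mu}_n(\zeta)$ and $\hat{V}_n(\zeta)$ from
$\hat{S}_{1,1}^{\circ},\ldots,\hat{S}_{1,\tilde{M}_n}^{\circ}$.
\item[(iii)] Compute $\hat{T}_n$ and $\tilde{T}_n$ from $\hat{S}_1$ and
$\hat{V}_n$.
\item[(iv)] Compute $\hat{T}_{n,k}^{\circ}$ and $\tilde{T}_{n,k}^{\circ}$,
$1\leq k\leq\tilde{M}_n$, and reject $H_0$ at size $\pi$ when the
corresponding test statistic exceeds the $(1-\pi)$th quantile of these
bootstrap statistics.
\end{enumerate}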

1372:

1373: \subsection{Asymptotic properties} In this section we establish

1374: the asymptotic validity of the proposed test procedure.

1375: Let $P$ denote the fixed probability distribution under the null

1376: hypothesis $H_0$, and let $P_n$ be the sequence of probability

1377: distributions under the contiguous sequence of

1378: alternatives $H_1^n$ defined in C2'.  Note that $P$ and $P_n$

1379: can be equal if $\alpha_{\ast}=0=\eta_{\ast}$.

1380: We need to study the

1381: proposed procedure under general $P_n$ to determine both its size under

1382: the null and its power under the alternative.

1383: We will use the notation $\weakpn$ to denote

1384: weak convergence under $P_n$. We need the following

1385: lemmas and theorem:

1386:

1387: \begin{lemma}\label{s9.l1}

1388: The sequence of probability measures $P_n$ satisfies

1389: \begin{eqnarray}

1390: \label{c8.e1}\\

1391: \int \left[ \sqrt{n}(d P_n ^{1/2} - d P ^{1/2})-

1392: \frac{1}{2}\left(U_{\zeta_0,1}^{\tau}(\psi_0^{\ast})(\alpha_{\ast})

1393: +U_{\zeta_0,2}^{\tau}(\psi_0^{\ast})(\eta_{\ast})\right)dP^{1/2}

1394: \right]^2 \rightarrow 0,\nonumber

1395: \end{eqnarray}

1396: where $\psi_0^{\ast}\equiv(0,0,\beta_0,A_0)$.

1397: \end{lemma}

1398:

1399: \begin{lemma}\label{s9.l2}

1400: $\|\hat{\psi}_0-\psi_0^{\ast}\|_{\infty}

1401: \rightarrow 0$ in probability under $P_n$.

1402: \end{lemma}

1403:

1404: \begin{theorem}\label{s9.t1}

1405: Under the conditions of section~2, with condition C2 replaced by

1406: C2', $\hat{S}_1$ converges

1407: under $P_n$ in distribution in $\ell^{\infty}([a,b]^{q+1})$ to the

1408: $(q+1)$-vector process $\zeta\mapsto\mathbb{Z}_{\ast}(\zeta)

1409: +\nu_{\ast}(\zeta)$,

1410: where $\mathbb{Z}_{\ast}$ is a tight, mean zero Gaussian

1411: $(q+1)$-vector process with

1412: $\mbox{cov}[\mathbb{Z}_{\ast}(\zeta_1),\mathbb{Z}_{\ast}(\zeta_2)]

1413: =\Sigma_{\ast}(\zeta_1,\zeta_2)\equiv

1414: \sigma_{\ast}^{11}(\zeta_1\vee\zeta_2)-\sigma_{\ast}^{12}(\zeta_1)

1415: [\sigma_{\ast}^{22}]^{-1}\sigma_{\ast}^{21}(\zeta_2)$, for all

1416: $\zeta_1,\zeta_2\in[a,b]$, where, for each $\zeta\in[a,b]$,

1417: \begin{eqnarray*}

1418: \nu_{\ast}(\zeta)&\equiv&\left\{\sigma_{\ast}^{11}

1419: (\zeta\vee\zeta_0)

1420: -\sigma_{\ast}^{12}(\zeta)[\sigma_{\ast}^{22}]^{-1}

1421: \sigma_{\ast}^{21}(\zeta_0)\right\}

1422: \left(\begin{array}{c}\alpha_{\ast}\\ \eta_{\ast}\end{array}\right),\\

1423: \sigma_{\ast}^{11}(\zeta)

1424: &\equiv&\left(\begin{array}{cc}\sigma_{\psi_0^{\ast},\zeta}^{11}

1425: &\sigma_{\psi_0^{\ast},\zeta}^{12}\\ \\ \sigma_{\psi_0^{\ast},\zeta}^{21}

1426: &\sigma_{\psi_0^{\ast},\zeta}^{22}\end{array}\right),\;\;\;\;

1427: \sigma_{\ast}^{12}(\zeta)\;\;\equiv\;\;\left(

1428: \begin{array}{cc}\sigma_{\psi_0^{\ast},\zeta}^{13}

1429: &\sigma_{\psi_0^{\ast},\zeta}^{14}\\ \\ \sigma_{\psi_0^{\ast},\zeta}^{23}

1430: &\sigma_{\psi_0^{\ast},\zeta}^{24}\end{array}\right),\\

1431: \sigma_{\ast}^{21}(\zeta)&\equiv&\left(

1432: \begin{array}{cc}\sigma_{\psi_0^{\ast},\zeta}^{31}

1433: &\sigma_{\psi_0^{\ast},\zeta}^{32}\\ \\ \sigma_{\psi_0^{\ast},\zeta}^{41}

1434: &\sigma_{\psi_0^{\ast},\zeta}^{42}\end{array}\right),\;\;\;\;

1435: \sigma_{\ast}^{22}\;\;\equiv\;\;\left(

1436: \begin{array}{cc}\sigma_{\psi_0^{\ast},\zeta_0}^{33}

1437: &\sigma_{\psi_0^{\ast},\zeta_0}^{34}\\ \\ \sigma_{\psi_0^{\ast},\zeta_0}^{43}

1438: &\sigma_{\psi_0^{\ast},\zeta_0}^{44}\end{array}\right),

1439: \end{eqnarray*}

1440: and where $\sigma_{\theta}^{jk}$, for $1\leq j,k\leq 4$, is as defined

1441: in section~5.2.

1442: \end{theorem}

1443:

1444: The following is the main result on the limiting distribution of the

1445: test statistics. For the remainder of this section, we require

1446: condition~B4 to hold.  As will be shown in the proof of corollary~\ref{c2},

1447: condition~B4 implies that $V_{\ast}(\zeta)\equiv\Sigma_{\ast}(\zeta,\zeta)$

1448: is positive definite

1449: for all $\zeta\in[a,b]$. Note that we will establish consistency of

1450: $\hat{V}_n$ after we verify the validity of the proposed bootstrap.

1451: \begin{corollary}\label{c2}

1452: Assume~B4 holds and $\hat{V}_n(\zeta)\rightarrow V_{\ast}(\zeta)$

1453: in probability under $P_n$, uniformly over $\zeta\in[a,b]$.

1454: Then $\hat{T}_n\weakpn\sup_{\zeta\in[a,b]}\left\{

1455: \left[\mathbb{Z}_{\ast}(\zeta)+\nu_{\ast}(\zeta)\right]'\right.$

1456: $\left.\times V_{\ast}^{-1}(\zeta)

1457: \left[\mathbb{Z}_{\ast}(\zeta)+\nu_{\ast}(\zeta)\right]\right\}$

1458: and $\tilde{T}_n\weakpn\int_{[a,b]}\left\{

1459: \left[\mathbb{Z}_{\ast}(\zeta)+\nu_{\ast}(\zeta)\right]'

1460: V_{\ast}^{-1}(\zeta)

1461: \left[\mathbb{Z}_{\ast}(\zeta)\right.\right.$\newline

1462: $\left.\left.+\nu_{\ast}(\zeta)\right]\right\}d\zeta$.

1463: Thus the limiting null distributions of $\hat{T}_n$ and

1464: $\tilde{T}_n$ are

1465: $\hat{\mathbb{T}}_{\ast}\equiv\sup_{\zeta\in[a,b]}$ $\left\{

1466: \mathbb{Z}_{\ast}'(\zeta)V_{\ast}^{-1}(\zeta)

1467: \mathbb{Z}_{\ast}(\zeta)\right\}$ and

1468: $\tilde{\mathbb{T}}_{\ast}\equiv\int_{[a,b]}\left\{

1469: \mathbb{Z}_{\ast}'(\zeta)V_{\ast}^{-1}(\zeta)

1470: \mathbb{Z}_{\ast}(\zeta)\right\}d\zeta$, respectively.

1471: \end{corollary}

1472:

1473: \begin{remark}\label{r3}

1474: Note that $\nu_{\ast}(\zeta_0)$ equals the matrix

1475: $\Sigma_{\ast}(\zeta_0,\zeta_0)$ times $(\alpha_{\ast},\eta_{\ast}')'$.

1476: By arguments in the proof of lemma~\ref{l4}, we know that

1477: $\Sigma_{\ast}(\zeta_0,\zeta_0)$ is positive definite. Thus

1478: $\nu_{\ast}(\zeta_0)$ will be strictly nonzero whenever

1479: $(\alpha_{\ast},\eta_{\ast}')'\neq 0$. Thus both $\hat{T}_n$

1480: and $\tilde{T}_n$ will have power to reject~$H_0$ under

1481: strictly non-null contiguous alternatives $H_1^n$.

1482: \end{remark}
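
To make the mechanism behind remark~\ref{r3} slightly more explicit, note
that setting $\zeta=\zeta_0$ in the definitions of theorem~\ref{s9.t1}, and
using $V_{\ast}(\zeta_0)=\Sigma_{\ast}(\zeta_0,\zeta_0)$ together with the
symmetry of this matrix, the quadratic form in the drift at $\zeta_0$ is
\[\nu_{\ast}'(\zeta_0)V_{\ast}^{-1}(\zeta_0)\nu_{\ast}(\zeta_0)
=(\alpha_{\ast},\eta_{\ast}')\Sigma_{\ast}(\zeta_0,\zeta_0)
(\alpha_{\ast},\eta_{\ast}')',\]
which is strictly positive whenever $(\alpha_{\ast},\eta_{\ast}')'\neq 0$,
by positive definiteness. This is intended only as a sketch of why the
limiting processes are shifted away from the null limit at $\zeta_0$.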

1483:

1484: The following theorem is the first step in establishing the validity

1485: of the bootstrap. For brevity, we will use the notation

1486: $\weakpnboot$ to denote conditional convergence of the bootstrap,

1487: either weakly in the sense of corollary~\ref{c1} or in probability,

1488: but under $P_n$ rather than $P$.

1489: \begin{theorem}\label{s9.t2}

1490: Under the conditions of theorem~\ref{s9.t1},

1491: $\hat{S}_1^{\circ}-\hat{S}_1\;\weakpnboot\;\mathbb{Z}_{\ast}$

1492: in $\ell^{\infty}([a,b]^{q+1})$.

1493: \end{theorem}

1494:

1495: The following corollary yields the desired consistency of $\hat{V}_n$

1496: and the validity of the proposed bootstrap for obtaining

1497: critical values. Define

1498: $\hat{\mathbb{F}}(u)\equiv\tilde{M}_n^{-1}\sum_{k=1}^{\tilde{M}_n}

1499: \ind\left\{\hat{T}_{n,k}^{\circ}\leq u\right\}$ and

1500: $\tilde{\mathbb{F}}(u)\equiv\tilde{M}_n^{-1}\sum_{k=1}^{\tilde{M}_n}

1501: \ind\left\{\tilde{T}_{n,k}^{\circ}\leq u\right\}$.

1502: \begin{corollary}\label{c3}

1503: There exists a sequence $\tilde{M}_n\rightarrow\infty$, as

1504: $n\rightarrow\infty$, such that

1505: $\hat{V}_n\weakpn\Sigma_{\ast}$, $\hat{V}_n\weakpnboot\Sigma_{\ast}$,

1506: and both $\sup_{u\in\re}\left|

1507: \hat{\mathbb{F}}(u)-P\left\{\hat{\mathbb{T}}_{\ast}\leq u\right\}\right|

1508: \weakpnboot 0$ and $\sup_{u\in\re}$ $\left|

1509: \tilde{\mathbb{F}}(u)-P\left\{\tilde{\mathbb{T}}_{\ast}\leq u\right\}\right|

1510: \weakpnboot 0$.

1511: \end{corollary}

1512:

1513: \section{Implementation and simulation study}

1514: We have implemented the proposed estimation

1515: and inference procedures for both the proportional hazards and proportional

1516: odds models. The maximum likelihood estimates were computed using the

1517: profile likelihood $pL_{n}(\zeta)$ defined in section~4. A line search

1518: over the order statistics of $Y$ is used to maximize over $\zeta$,

1519: while Newton's method is used to maximize over $\psi$.

1520: The stationary point equation~(\ref{c4:e2}) can be

1521: used to profile over $A$ for each value of $\zeta$ and $\gamma$. In

1522: our experience, the computational time of the entire procedure is

1523: reasonable. A thorough simulation study to validate the moderate

1524: sample size performance of this procedure and the proposed bootstrap

1525: procedures of section~8 is underway and will be presented elsewhere.

1526:

1527: Because of the unusual form of

1528: the statistical tests proposed in section~9, we feel it is worthwhile

1529: at this point to present a small simulation study evaluating their

1530: moderate sample size performance. Both the proportional hazards and

1531: proportional odds models were considered. A single time-independent

1532: covariate with

1533: a standard normal distribution was used, so that $d=q=1$, and the

1534: change-point $Y$ also had a standard normal distribution. The

1535: parameter values were set at $\zeta_0=0$, $\alpha_0=0$, $\beta_0=1$,

1536: $\eta_0\in\{0,-0.5,-1,-2,-3\}$, and $A_0(t)=t$. The range of

1537: $\eta_0$ values includes the null hypothesis $H_0$ (when $\eta_0=0$)

1538: and several alternative hypotheses. The censoring time was

1539: exponentially distributed with rate $0.1$ and truncated at 10.

1540: This resulted in a censoring rate of about 25\%. The sample

1541: size for each simulated data set was 300. For each simulated

1542: data set, 250 bootstraps were generated with standard exponential

1543: weights truncated at 5, to compute $\hat{V}_n$ and the critical

1544: values for the two test statistics, $\hat{T}_n$ (the ``sup score

1545: test'') and $\tilde{T}_n$ (the ``mean score test''). The range

1546: for $\zeta$ was restricted to the inner 80\% of the $Y$ values.

1547: Each scenario was replicated 250 times.

1548:

1549: The results of the simulation study are presented in table~\ref{table1} on

1550: page~\pageref{table1}.

1551: The type~I error (the $\eta_0=0$ column) is quite close to

1552: the targeted 0.05 level, and the power increases with the magnitude

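1553: of $\eta_0$. Also, the sup test is notably more powerful than the mean

1554: test for the larger alternatives ($\eta_0=-2$ and $\eta_0=-3$), while the two tests are comparable, within Monte Carlo error, at $\eta_0=-0.5$ and $\eta_0=-1$.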

1555: We also tried the nonparametric bootstrap and found that

1556: it did not work nearly as well. While it is difficult to make sweeping

1557: generalizations from such a small numerical study, it appears that the

1558: proposed test statistics match the theoretical predictions and have

1559: reasonable power. More simulation studies into the properties of these

1560: statistics would be worthwhile, especially studies of the impact of

1561: time-dependent covariates.
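
As a rough guide to the Monte Carlo precision of the rejection rates (a
standard binomial calculation, not an additional simulation result), an
estimated rejection probability based on $250$ independent replicates with
true rejection probability $p$ has standard error
\[\sqrt{\frac{p(1-p)}{250}}\leq\frac{0.50}{\sqrt{250}}\approx 0.032,\]
with the bound attained at $p=0.5$. At the nominal level $p=0.05$, the same
formula gives $\sqrt{0.05\times 0.95/250}\approx 0.014$.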

1562:

1563: \begin{table*}

1564: \caption{Results from the simulation study of the

1565: sup and mean score test statistics in

1566: the proportional hazards and proportional odds models.

1567: The sample size is 300, the level of

1568: censoring approximately 25\%, and the nominal type~I error

1569: is 0.05. 250 replicates were generated for each

1570: configuration. The parameters were set at

1571: $\zeta_0=0$, $\alpha_0=0$, $\beta_0=1$, and $A_0(t)=t$, with

1572: the value of $\eta_0$ varying. The worst-case Monte Carlo standard error

1573: for the power estimates is $0.50/\sqrt{250}\approx 0.03$.}

1574: \label{table1}

1575: \begin{tabular}{c|c|c|c|c|c}

1576: \hline\hline

1577: \multicolumn{6}{c}{Proportional hazards model}\\ \hline

1578: \scriptsize{Sup score test statistic}& Null $\eta_0=0$ & $\eta_0=-0.5$

1579: &  $\eta_0=-1$ & $\eta_0=-2 $ & $\eta_0=-3$ \\ \hline

1580: \scriptsize{mean}&5.078&5.590 &7.874 &13.524&35.507 \\ \hline

1581: \scriptsize{Standard Deviation}&2.728&2.859 &3.919 &6.992& 11.337 \\\hline

1582: \scriptsize{power}&0.044&0.076 &0.180 &0.536&0.980\\ \hline\hline

1583: \scriptsize{Mean score test statistic}& Null $\eta_0=0$

1584: & $\eta_0=-0.5$ & $\eta_0=-1$&  $\eta_0=-2$ &  $\eta_0=-3$ \\\hline

1585: \scriptsize{mean}&1.403 &1.694 &2.560 &5.412 & 5.529 \\ \hline

1586: \scriptsize{Standard Deviation} &1.206 &1.104 &1.597 &2.492  & 2.683\\ \hline

1587: \scriptsize{power}&0.040 &0.050 &0.120 &0.236  &0.304 \\ \hline \hline

1588: \multicolumn{6}{c}{Proportional odds model}\\ \hline

1589: \scriptsize{Sup score test statistic}& Null $\eta_0=0$ & $\eta_0=-0.5$

1590: &  $\eta_0=-1$ & $\eta_0=-2 $ & $\eta_0=-3$ \\ \hline

1591: \scriptsize{mean}

1592: &3.950&4.762 &5.693 & 8.327  &13.956 \\\hline \scriptsize{Standard

1593: Deviation}&2.390&1.610 & 1.255 &2.901 & 4.244\\ \hline

1594: \scriptsize{power}&0.043&0.068 &0.112 &0.364 &0.660 \\ \hline\hline

1595: \scriptsize{Mean score test statistic}& Null

1596: $\eta_0=0$ & $\eta_0=-0.5$ & $\eta_0=-1$&  $\eta_0=-2$

1597: &  $\eta_0=-3$ \\ \hline

1598: \scriptsize{mean}&1.177 &1.912 &2.848 &3.265 &4.349 \\ \hline

1599: \scriptsize{Standard Deviation} &0.946 & 1.078 &1.360 &1.498  &1.718 \\ \hline

1600: \scriptsize{power}&0.048 &0.056 &0.116 &0.167  &0.285 \\ \hline\hline

1601: \end{tabular}

1602: \end{table*}

1603:

1604: \section{Proofs}

1605:

1606: {\it Proof of lemma~\ref{l.v1}.} Verification of~D1 is straightforward.

1607: For~D2, we have for all $u\geq 0$,

1608: \[\left|\frac{\ddot{\Lambda}(u)}{\dot{\Lambda}(u)}\right|=

1609: \frac{\Exp{W^2e^{-uW}}}{\Exp{We^{-uW}}}\leq\frac{\Exp{W^2}}{\Exp{W}}<\infty.\]

1610: The second-to-last inequality requires some justification. Note that the

1611: probability measure $Qf(W)\equiv\Exp{f(W)W}/\Exp{W}$ is well-defined

1612: for functions $f$ bounded by $O(W^3)$ by the positivity of $W$ and the

1613: existence of a fourth moment. Now we have

1614: \[\frac{\Exp{W^2e^{-uW}}}{\Exp{We^{-uW}}}=\frac{Q[We^{-uW}]}

1615: {Q[e^{-uW}]}\leq Q[W]=\frac{\Exp{W^2}}{\Exp{W}},\]

1616: since $e^{-uW}$ uniformly down-weights larger values of $W$ and thus forces

1617: the left term of the inequality to be decreasing in~$u$. This proves

1618: the first part.
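
The monotonicity claim can also be checked directly. The following is a
sketch, assuming that differentiation under the expectation is permitted:
\[\frac{d}{du}\left\{\frac{\Exp{W^2e^{-uW}}}{\Exp{We^{-uW}}}\right\}
=-\frac{\Exp{W^3e^{-uW}}\Exp{We^{-uW}}-\left(\Exp{W^2e^{-uW}}\right)^2}
{\left(\Exp{We^{-uW}}\right)^2}\leq 0,\]
where the inequality follows from the Cauchy-Schwarz inequality applied to
$W^{3/2}e^{-uW/2}$ and $W^{1/2}e^{-uW/2}$. Thus the ratio is maximized at
$u=0$, where it equals $\Exp{W^2}/\Exp{W}$.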

1619:

1620: For the second part, take $c_0=c$, and note that

1621: \[|u^c\Lambda(u)|=\Exp{u^ce^{-uW}}=\Exp{W^{-c}(uW)^ce^{-uW}}

1622: \leq k\Exp{W^{-c}},\]

1623: where $k=\sup_{x\geq 0}x^ce^{-x}=c^ce^{-c}<\infty$. Similarly,

1624: \[|u^{1+c}\dot{\Lambda}(u)|=\Exp{u^{1+c}We^{-uW}}

1625: =\Exp{W^{-c}(uW)^{1+c}e^{-uW}}\leq k'\Exp{W^{-c}},\]

1626: where $k'=\sup_{x\geq 0}x^{1+c}e^{-x}=(1+c)^{1+c}e^{-1-c}<\infty$.

1627: This concludes the proof.$\Box$
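
For completeness, the constants $k$ and $k'$ can be verified by elementary
calculus (assuming $c>0$): the function $x\mapsto c\log x-x$ has derivative
$c/x-1$, which is positive for $x<c$ and negative for $x>c$, so that
\[\sup_{x\geq 0}x^ce^{-x}=c^ce^{-c},\]
and replacing $c$ with $1+c$ gives the expression for $k'$.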

1628:

1629: {\it Proof of lemma~\ref{l1}.} Suppose that

1630: \begin{eqnarray}

1631: \;\;\;\;G\left(\int_0^t \tilde{Y}(s)e^{r_{\xi}(u;Z,Y)}dA(u) \right) = G

1632:   \left(\int_0^t \tilde{Y}(s)e^{r_{\xi_0}(u;Z,Y)}dA_0(u)

1633:   \right)~\label{c12:e1}

1634: \end{eqnarray}

1635: for all $t \in [0, \tau]$ almost surely under $P$. The target is

1636: to show that~(\ref{c12:e1}) implies that $\xi = \xi_0$ and $A =

1637: A_0$ on $[0, \tau]$. By condition~A1, (\ref{c12:e1})~implies

1638: \[\int_0^t e^{r_{\xi}(u;Z,Y)}dA(u)=

1639: \int_0^t e^{r_{\xi_0}(u;Z,Y)}dA_0(u)\]

1640: for all $t \in [0, \tau]$ almost surely.

1641: Taking the Radon-Nikodym derivative of both

1642: sides with respect to $A_0$, and taking logarithms, we obtain

1643: \begin{eqnarray}

1644: &&\beta'Z(t)+(\alpha+\eta'Z_2(t))\ind\{Y>\zeta\}-\beta_0'Z(t)\label{c12:e2}\\

1645: &&\mbox{\hspace{1.5in}}

1646: -(\alpha_0+\eta_0'Z_2(t))\ind\{Y>\zeta_0\} +\log(\tilde{a}(t))=0,\nonumber

1647: \end{eqnarray}

1648: almost surely, where $\tilde{a} \equiv dA/dA_0$.

1649:

1650: Assume that $\zeta>\zeta_0$. Now choose $y<\zeta_0$ such

1651: that $y\in\tilde{V}(\zeta_0)$ and

1652: $\mbox{var}[Z(t_1)|Y=y]$ is positive definite, where $t_1$ is

1653: as defined in~B3.  Note that this is possible by assumptions~B2

1654: and~B3. Conditioning the left-hand side of~(\ref{c12:e2})

1655: on $Y=y$ and evaluating at $t=t_1$ yields that $\beta=\beta_0$.

1656: Now choose $\zeta_0<y<\zeta$ such that $y\in\tilde{V}(\zeta_0)$ and

1657: $\mbox{var}[Z(t_2)|Y=y]$ is positive definite. Conditioning

1658: the left-hand side of~(\ref{c12:e2}) on $Y=y$, and evaluating

1659: at $t=t_2$ yields that $\eta_0=0$. Because the density of $Y$ is

1660: positive in $\tilde{V}(\zeta_0)$, we also see that $\alpha_0=0$.

1661: But this is not possible by condition~C2. A similar argument can

1662: be used to show that $\zeta<\zeta_0$ is impossible. Thus $\zeta=\zeta_0$.

1663: Now it is not hard to argue that condition~B3 forces

1664: $\beta=\beta_0$, $\eta=\eta_0$ and $\alpha=\alpha_0$.

1665: Hence $\log(\tilde{a}(t))=0$ for all $t\in[0,\tau]$,

1666: and the proof is complete.$\Box$

1667:

1668: {\it Proof of lemma~\ref{l2}.} Note that for each $n$, maximizing

1669: the log-likelihood

1670: over $A$ is equivalent to maximizing over a fixed number of parameters

1671: since the number of jumps $K\leq n$.  Thus maximizing over the

1672: whole parameter $\theta$ involves maximizing an empirical average of

1673: functions that are smooth over $\psi$ and cadlag over $\zeta$.

1674: Note also that

1675: \[\|\hat{A}_n-A_0\|_{[0,\tau]}=\sum_{j=1}^{K}

1676: \left(\left|\hat{A}_n(T_j-)-A_0(T_j)\right|\vee

1677: \left|\hat{A}_n(T_j)-A_0(T_j)\right|\right),\]

1678: where $\|\cdot\|_B$ is the uniform norm over the set $B$,

1679: and thus $\|\hat{A}_n-A_0\|_{[0,\tau]}$ is measurable.

1680: Hence the uniform distance between

1681: $\hat{\theta}_n$ and $\theta_0$ is also measurable.

1682: Thus almost sure convergence of

1683: $\hat{\theta}_n$ is equivalent to outer almost sure convergence.

1684: Now we return to the proof. Assume

1685: \begin{eqnarray}

1686: \limsup_{n \rightarrow \infty} \hat{A}_n(\tau) =

1687: \infty,\label{c12:e4}

1688: \end{eqnarray}

1689: with probability $>0$.

1690: We will show that this leads to a contradiction. It is now possible to

1691: choose a data sequence such that~(\ref{c12:e4}) holds and

1692: $\tilde{G}_n\equiv\pp_n N\rightarrow \tilde{G}_0\equiv P_0 N$

1693: uniformly, since the latter happens with probability~1. Fix one such

1694: sequence $\{n\}$, and define $\theta_n=(\xi_0,A_n)$, where $A_n=\tilde{G}_n$.

1695: Note that the log-likelihood difference,

1696: $\tilde{L}_n(\hat\theta_n)-\tilde{L}_n(\theta_n)$,

1697: should be non-negative for all $n$, since $\hat{\theta}_n$

1698: maximizes the log-likelihood. We are going to show that the

1699: difference is asymptotically negative under the

1700: assumption~(\ref{c12:e4}).

1701:

1702: Now choose a subsequence $\{n_k\}$ such that $\hat{A}_{n_k}(\tau)

1703: \rightarrow \infty$, as $k\rightarrow\infty$. We now have,

1704: for $c_0>0$ from assumption~D2, that

1705: $\tilde{L}_{n_k}(\hat{\theta}_{n_k})-\tilde{L}_{n_k}(\theta_{n_k})$

1706: \begin{eqnarray}

1707: \mbox{\hspace{0.7in}} &\le& O(1)

1708: +\mathbb{P}_{n_k} \delta \left[\log

1709: \left(n_k \Delta \hat{A}_{n_k}(V) \right)+\log\left(

1710: -\dot{\Lambda}(H^{\hat{\theta}_{n_k}}(V))\right)\right]\nonumber\\

1711: &&-\mathbb{P}_{n_k}(1-\delta)G(H^{\hat{\theta}_{n_k}}(V))\nonumber\\

1712: &\leq& O(1)+\mathbb{P}_{n_k} \delta\log

1713: \left(n_k \Delta \hat{A}_{n_k}(V) \right)

1714: -\mathbb{P}_{n_k}(\delta+c_0)\log\hat{A}_{n_k}(V),\label{c12:e5}

1715: \end{eqnarray}

1716: since, for all $u>0$,

1717: $\log\dot{G}(u)=\log[-\dot{\Lambda}(u)]-\log[\Lambda(u)]$;

1718: $\log[-\dot{\Lambda}(u)]=\log[-u^{1+c_0}$

1719: $\dot{\Lambda}(u)]-(1+c_0)\log(u)

1720: \leq O(1)-(1+c_0)\log(u)$ by condition~D2; and since

1721: $\log\Lambda(u)=\log[u^{c_0}\Lambda(u)]-c_0\log(u)\leq O(1)-c_0\log(u)$

1722: also by condition~D2.
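
The coefficient $\delta+c_0$ in~(\ref{c12:e5}) can be traced as follows.
This is a sketch, assuming (as the identity for $\log\dot{G}$ above
indicates) that $G=-\log\Lambda$. Writing $H$ for
$H^{\hat{\theta}_{n_k}}(V)$, the two bounds from condition~D2 give
\begin{eqnarray*}
\delta\log[-\dot{\Lambda}(H)]-(1-\delta)G(H)
&=&\delta\log[-\dot{\Lambda}(H)]+(1-\delta)\log\Lambda(H)\\
&\leq&O(1)-\left[\delta(1+c_0)+(1-\delta)c_0\right]\log H\\
&=&O(1)-(\delta+c_0)\log H.
\end{eqnarray*}
The passage from $\log H$ to $\log\hat{A}_{n_k}(V)$ presumably relies on
bounds for the regression term that are not reproduced here.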

1723:

1724: Next we take a partition of $[0, \tau]$, $0=v_0<v_1<\cdots<v_M=\tau$,

1725: for some finite $M$. The right hand side of~(\ref{c12:e5}) is now

1726: dominated by

1727: \begin{eqnarray}\label{c12:e6}

1728: O(1)+\log \hat{A}_{n_k}(\tau) \mathbb{P}_{n_k} \left(\delta

1729: \ind\{V \in [v_{M-1}, \infty] \} - (\delta + c_0)\ind\{V \in

1730: [\tau, \infty] \}  \right)&&\\

1731: +\sum_{m=1}^{M-1}\log \hat{A}_{n_k}(v_m) \mathbb{P}_{n_k}

1732: \left(\delta \ind\{V \in [v_{m-1}, v_m] \} - (\delta + c_0)

1733: \ind\{V \in [v_{m}, v_{m+1}] \}  \right).&&\nonumber

1734: \end{eqnarray}

1735: For a fixed constant $c > 1$, we can choose this partition such that

1736: \begin{eqnarray*}

1737: P_0N(\tau)\ind\{V \in [v_{M-1}, \infty]\} = P_0[N(\tau) +

1738: c_0/c]\ind\{V \in [\tau, \infty] \},

1739: \end{eqnarray*}

1740: and, for $m = 1,\ldots,M-1$,

1741: \begin{eqnarray*}

1742: P_0N(\tau)\ind\{V \in [v_{m-1}, v_m]\} = P_0[N(\tau) +

1743: c_0/c]\ind\{V \in [v_m, v_{m+1}] \}.

1744: \end{eqnarray*}

1745: Recalling that $\tilde{G}_n\rightarrow \tilde{G}_0$ uniformly, we

1746: obtain that~(\ref{c12:e6}) tends to $-\infty$ as $k \rightarrow \infty$,

1747: which is the intended contradiction. Thus,  $\limsup_{n

1748: \rightarrow \infty} \hat{A}_n(\tau) < \infty$ almost surely.$\Box$

1749:

1750: {\it Proof of theorem~\ref{t1}.}

1751: By the opening arguments in the proof of lemma~\ref{l2}, we have that

1752: outer almost sure convergence is equivalent to the usual almost

1753: sure convergence in this instance. Note that $\{\hat{A}_n(\tau)\}$ is bounded

1754: almost surely, $\tilde{G}_n\rightarrow\tilde{G}_0$ almost

1755: surely, and the class

1756: \[{\cal F}_{(k)}\equiv\left\{W(t;\theta):t\in[0,\tau],\xi\in{\cal X},

1757: A\in{\cal A}_{(k)}\right\},\]

1758: where ${\cal A}_{(k)}\equiv\{A\in{\cal A}:A(\tau)\leq k\}$,

1759: is Donsker (and hence also Glivenko-Cantelli) for every $k<\infty$

1760: by lemma~\ref{l.t1.1} below. By similar arguments to those used

1761: in lemma~\ref{l.t1.1}, we have that the class

1762: $\{G(H^{\theta}(V)):\xi\in{\cal X},A\in{\cal A}_{(k)}\}$ is also

1763: Glivenko-Cantelli for all $k<\infty$. We therefore

1764: have the following with probability~1:

1765: $\{\hat{A}_n(\tau)\}$ is bounded asymptotically, $\tilde{G}_n\rightarrow

1766: \tilde{G}_0$ uniformly, $(\mathbb{P}_n-P)W(\cdot;\hat{\theta}_n)

1767: \rightarrow 0$ uniformly, and $(\mathbb{P}_n-P)\left[

1768: G(H^{\hat{\theta}_n}(V))-G(H^{\theta_n}(V))\right]\rightarrow 0$.

1769: Now, fix a sequence $\{n\}$ for which these last four asymptotic events

1770: hold. We can now use the Helly selection theorem to find a

1771: subsequence $\{n_k\}$ and a function $A$ such that

1772: $\hat{A}_{n_k}(t) \rightarrow A(t)$ for all $t \in [0, \tau]$ at

1773: which $A$ is continuous. From~(\ref{c4:e2}), we obtain

1774: \[|\hat{A}_{n_k}(s)-\hat{A}_{n_k}(t)| \le

1775: O(1)\mathbb{P}_{n_k}|N(s)-N(t)|\rightarrow O(1)|\tilde{G}_0(s)

1776: -\tilde{G}_0(t)|,\]

1777: for all $s,t \in [0, \tau]$. Since $\tilde{G}_0$ is continuous by

1778: condition~C3, we know that $A$ must be continuous on all of $[0,\tau]$.

1779: Thus $\hat{A}_{n_k} \rightarrow A$ uniformly.

1780: Without loss of generality, we can also assume that along this

1781: subsequence $\hat{\xi}_{n_k} \rightarrow \xi$ for some $\xi \in

1782: {\cal X}\equiv\Upsilon\times B_1\times B_2\times(a,b)$. Denote

1783: $\theta=(\xi,A)$.

1784:

1785: Consider now $\theta_n \equiv (\xi_0, A_n)$, where

1786: \begin{eqnarray*}

1787: A_n(t) \equiv \int_0^t \frac{d\tilde{G}_n(u)}{PW(u;

1788: \theta_0)}.

1789: \end{eqnarray*}

1790: We can use the same technique as in the derivation

1791: of~(\ref{c4:e2}) to show that $A_0$ satisfies

1792: \begin{eqnarray*}

1793: A_0(t)=\int_0^t \frac{d\tilde{G}_0(u)}{PW(u;\theta_0)},

1794: \end{eqnarray*}

1795: for all $t\in[0,\tau]$. Thus $A_{n_k}\rightarrow A_0$ uniformly,

1796: as $k\rightarrow\infty$. At this point, we have

1797: \begin{eqnarray*}

1798: 0&\leq&\tilde{L}_{n_k}(\hat{\theta}_{n_k})-\tilde{L}_{n_k}(\theta_{n_k})\\

1799: &=&\int_0^{\tau}\log\left[\frac{PW(u;\theta_0)}

1800: {\mathbb{P}_{n_k}W(u;\hat{\theta}_{n_k})}\right]

1801: d\tilde{G}_{n_k}(u)-\mathbb{P}_{n_k}\left[G(H^{\hat{\theta}_{n_k}}(V))-

1802: G(H^{\theta_{n_k}}(V))\right]\\

1803: &\rightarrow&\int_0^{\tau}\log\frac{dA(u)}{dA_0(u)}d\tilde{G}_0(u)

1804: -P\left[G(H^{\theta}(V))-G(H^{\theta_0}(V))\right]\\

1805: &=&\int\log\frac{dP_{\theta}}{dP}dP\\

1806: &\leq&0.

1807: \end{eqnarray*}

1808: But this forces $\theta=\theta_0$ by the identifiability of

1809: the model as given in lemma~\ref{l1}. Thus all convergent

1810: subsequences of $\hat{\theta}_n$,

1811: on a set of probability~1, converge to $\theta_0$. The desired

1812: result now follows.$\Box$

1813:

1814: \begin{lemma}\label{l.t1.1}

1815: $\forall k<\infty$, the class

1816: ${\cal F}_{(k)}\equiv\left\{W(t;\theta):t\in[0,\tau],\xi\in{\cal X},

1817: \right.$ $\left.A\in{\cal A}_{(k)}\right\}$,

1818: is $P$-Donsker.

1819: \end{lemma}

1820:

1821: {\it Proof.} Routine arguments can be used to establish that

1822: the class ${\cal F}_1\equiv

1823: \{e^{r_{\xi}(t;Z,Y)}:t\in[0,\tau],\xi\in{\cal X}\}$ is

1824: Donsker. Consider the map

1825: \[h\in D[0,\tau]\mapsto

1826: \left\{\int_0^th(s)dA(s):t\in[0,\tau],A\in{\cal A}_{(k)}\right\}

1827: \in\ell^{\infty}([0,\tau]\times{\cal A}_{(k)}),\]

1828: and note that it is uniformly equicontinuous and linear.

1829: Thus the class

1830: \[{\cal F}_2\equiv\left\{\int_0^t e^{r_{\xi}(s;Z,Y)}dA(s):

1831: t\in[0,\tau],\xi\in{\cal X},A\in{\cal A}_{(k)}\right\}\]

1832: is Donsker by the continuous mapping theorem.

1833: Now condition~D1 ensures that both $\dot{G}$ and

1834: $\ddot{G}/\dot{G}$ are Lipschitz on compacts.  This fact,

1835: combined with the facts that sums of Donsker classes

1836: are Donsker and products of bounded Donsker classes are

1837: Donsker, yields the desired results.$\Box$

1838:

1839: {\it Proof of lemma~\ref{l3}.} By the smoothness assumed in~D1 of the involved

1840: derivatives, we have for each $\zeta\in[a,b]$ and $\psi^{\ast}\in\Psi$,

1841: \[\lim_{t\downarrow 0}\sup_{h^{\ast}\in\mbox{lin}\,\Psi:

1842: \rho_1(h^{\ast})\leq 1}

1843: \sup_{h\in{\cal H}_r}\left|\int_0^1 h^{\ast}

1844: \left(\sigma_{\psi^{\ast}+st h^{\ast}}(h)-\sigma_{\psi^{\ast}}(h)\right)ds

1845: \right|=0.\]

1846: Thus, $\sup_{h\in{\cal H}_r}\left|PU_{\zeta}^{\tau}(\psi^{\ast}+h^{\ast})(h)

1847: -PU_{\zeta}^{\tau}(\psi^{\ast})(h)+h^{\ast}\left(\sigma_{\psi^{\ast}}(h)

1848: \right)\right|=o(\rho_1(h^{\ast}))$, as $\rho_1(h^{\ast})\rightarrow 0$.$\Box$

1849:

1850: {\it Proof of lemma~\ref{l4}.}

1851: First note that for any $h=(h_1,h_2,h_3,h_4)\in{\cal H}_{\infty}$,

1852: $\sigma_{\theta_0}(h)=\mathbb{A}(h)+\mathbb{B}(h)$, where

1853: $\mathbb{A}(h)=\left(h_1,h_2,h_3,g_0h_4\right)$,

1854: $\mathbb{B}(h)=\sigma_{\theta_0}(h)-\mathbb{A}(h)$,

1855: and $g_0(u)=P\left[\tilde{Y}(u)e^{r_{\xi_0}(u;Z,Y)}

1856: \hat{\Xi}_{\theta_0}^{(0)}(\tau)\right]$. It is not hard to verify that

1857: since $g_0$ is bounded below, $\mathbb{A}$

1858: is one-to-one and onto with continuous

1859: inverse defined by $\mathbb{A}^{-1}(h)=(h_1,h_2,h_3,h_4/g_0)$.

1860: It is also not hard to

1861: verify that the operator $\mathbb{B}$

1862: is compact as an operator on ${\cal H}_r$

1863: for any $0<r<\infty$. Thus the first part of the lemma is proved

1864: by lemma~25.93 of \cite{v98}, if

1865: we can show that $\sigma_{\theta_0}$ is one-to-one. This will then imply

1866: that for each $r>0$, there is an $s>0$ with

1867: $\sigma_{\theta_0}^{-1}({\cal H}_s)\subset{\cal H}_r$. Now we have

1868: \[\inf_{\psi\in\mbox{lin}\,\Psi}

1869: \frac{\|\psi(\sigma_{\theta_0}(\cdot))\|_{(r)}}

1870: {\|\psi\|_{(r)}}\geq \inf_{\psi\in\mbox{lin}\,\Psi}

1871: \frac{\sup_{h\in\sigma_{\theta_0}^{-1}({\cal H}_s)}

1872: |\psi(\sigma_{\theta_0}(h))|}{\|\psi\|_{(r)}}=

1873: \inf_{\psi\in\mbox{lin}\,\Psi}\frac{\|\psi\|_{(s)}}{\|\psi\|_{(r)}}\]

1874: $\geq s/(4r)$, since $\|\psi\|_{(r)}\leq 4(r/s)\|\psi\|_{(s)}$.

1875: Thus $\psi\mapsto\psi(\sigma_{\theta_0}(\cdot))$ is continuously invertible

1876: on its range by proposition~A.1.7 of \cite{bkrw98}. That it is also onto with

1877: inverse $\psi\mapsto\psi(\sigma_{\theta_0}^{-1})$ follows from

1878: $\sigma_{\theta_0}$ being onto. All that remains is verifying that

1879: $\sigma_{\theta_0}$ is one-to-one.

1880:

1881: Let $h \in \mathcal{H}_{\infty}$ such that $\sigma_{\theta_0}(h)=0$.

1882: For the one-dimensional submodel defined by the map $s \rightarrow\psi_{0s}

1883: \equiv \psi_0 + s(h_1, h_2, h_3, \int_0^{(\cdot)}h_4(u)dA_0(u))$, we have

1884: \begin{eqnarray}

1885: P \{ \frac{ \partial}{\partial s}L_1(\psi_{0s},\zeta_0)|_{s=0}\}^2 = P

1886: \{U^{\tau}_{\zeta_0}(\psi_0)(h)\}^2=0.~\label{c12:e9}

1887: \end{eqnarray}

1888: Define the random set

1889: $\mathcal{S}(n,\tilde{y},t) \equiv \{(N,\tilde{Y}): N(u) = n(u),

1890: \tilde{Y}(u) = \tilde{y}(u), u \in [t, \tau] \}$.

1891: The equality~(\ref{c12:e9}) implies that

1892: $P\{U_{\zeta_0}^{\tau}(\psi_0)(h)|\mathcal{S}(n,\tilde{y},t)\}^2=0$

1893: for all $\mathcal{S}$ such that $P\{\mathcal{S}(n,\tilde{y},t)\} > 0$, which

1894: implies that $U^t_{\zeta_0}(\psi_0)(h)=0$ almost surely for all $t \in [0,

1895: \tau]$. Consider the set on which the observation $(X, \delta, Z,

1896: Y)$ is censored at a time $t \in [0, \tau]$. From (\ref{c12:e9})

1897: and the preceding argument,

1898: \begin{eqnarray}

1899: R_{\zeta_0,\psi_0}^t(h_1\ind(Y > \zeta_0) +

1900: h_2'Z_2(t)\ind(Y > \zeta_0)+h_3'Z(t)+h_4)=0.~\label{c12:e11}

1901: \end{eqnarray}

1902: Taking the Radon-Nikodym derivative of (\ref{c12:e11}) with respect to $A_0$

1903: and dividing throughout by $e^{r_{\xi_0}(t;Z,Y)}$ yields

1904: \begin{eqnarray}

1905: \tilde{Y}(t)(h_1\ind(Y > \zeta_0) +

1906: h_2'Z_2(t)\ind(Y > \zeta_0)+h_3'Z(t)+h_4(t))=0.~\label{c12:e12}

1907: \end{eqnarray}

1908: Arguments quite similar to those used in the proof of lemma~\ref{l1}

1909: can now be used to verify that~(\ref{c12:e12}) forces $h=0$.

1910: Hence $\sigma_{\theta_0}(h)=0$ implies $h=0$, and thus

1911: $\sigma_{\theta_0}$ is one-to-one.$\Box$

1912:

1913: {\it Proof of lemma~\ref{l5}.} For the first part, note that

1914: $t\mapsto\tilde{Y}(t)$ has total variation bounded by~1;

1915: and, by the model assumptions, the total variation of

1916: $t\mapsto e^{r_{\xi}(t;Z,Y)}$ is bounded by a universal constant that does not

1917: depend on $\theta$. Thus there exists a universal constant $k_{\ast}$

1918: such that $\|\mathbb{P}_n W(\cdot;\hat{\theta}_n)\|_v\leq

1919: k_{\ast}\mathbb{P}_n|\hat{\Xi}^{(0)}_{\hat{\theta}_n}|$.  By the smoothness of

1920: the functions involved, and the fact that $u\mapsto\log(u)$ is Lipschitz

1921: on compacts bounded above zero, we obtain the first result of the lemma.

1922: The consistency part follows from

1923: lemma~\ref{l.t1.1} combined with theorem~\ref{t1}, the

1924: continuity of $\theta\mapsto PW(\cdot;\theta)$, and reapplication

1925: of the Lipschitz continuity of $u\mapsto\log(u)$.$\Box$

1926:

1927: {\it Proof of lemma~\ref{l6}.} The right-hand derivative of

1928: $P(L_1(\psi,\zeta))$ with respect to $\zeta$ at $\zeta=\zeta_0$ is:

1929: $\left.(\partial^{+}/(\partial\zeta))

1930: P(L_1(\psi, \zeta))\right|_{\zeta=\zeta_0}$

1931: \begin{eqnarray*}

1932: &=&\int\left

1933: \{P[l_1^{\psi}(V,\delta,Z)|Y=y+]-P[l_2^{\psi}(V,\delta,Z)|Y=y+] \right \}

1934: \tilde{\delta}_{\zeta_0}(y)\tilde{h}(y)dy \\

1935: &=&\left(P[l_1^{\psi}(V,\delta,Z) |Y=\zeta_0+]-

1936: P[l_2^{\psi}(V,\delta,Z)|Y=\zeta_0+]\right)\tilde{h}(\zeta_0),

1937: \end{eqnarray*}

1938: where the superscript~$+$ denotes differentiating from the right

1939: and $\tilde{\delta}_{\zeta_0}(y)$ is the Dirac delta function

1940: assigning counting measure~1 to the event $\{y=\zeta_0\}$. Now,

1941: $P[l_1^{\psi}(V,\delta,Z)|Y=\zeta_0+]-

1942: P[l_2^{\psi}(V,\delta,Z)|Y=\zeta_0+]$

1943: \[=\int\left[ l_1^{\psi}(v,d,z)-l_2^{\psi}(v,d,z)\right]

1944: \ell_2(v,d,z)\ell_0^{+}(v,d,z)d\mu(v,d,z)\]

1945: $\equiv\tilde{R}^{+}(\psi)$,

1946: where $\ell_j(v,d,z)\equiv\exp\{l_j^{\psi_0}(v,d,z)\}$, for $j=1,2$;

1947: $\mu(v,d,z)$ is the dominating measure; and $\ell_0^{+}(v,d,z)$

1948: consists of the remaining components of the conditional distribution

1949: of $(V,\delta,Z)$ given $Y=\zeta_0+$. Note that under the model

1950: assumptions, $\ell_0^{+}$ does not depend on the parameters. Thus

1951: \begin{eqnarray*}

1952: \tilde{R}^{+}(\psi_0)&=&\int\left[ l_1^{\psi_0}(v,d,z)

1953: -l_2^{\psi_0}(v,d,z)\right]\ell_2(v,d,z)\ell_0^{+}(v,d,z)d\mu(v,d,z)\\

1954: &=&\int\log\left[\frac{\ell_1\ell_0^{+}}{\ell_2\ell_0^{+}}\right]

1955: \ell_2\ell_0^{+}d\mu

1956: \;\;<\;\;\log\int\left[\frac{\ell_1\ell_0^{+}}

1957: {\ell_2\ell_0^{+}}\right]\ell_2\ell_0^{+}d\mu\\

1958: &=&\log\int\ell_1(v,d,z)\ell_0^{+}(v,d,z)d\mu(v,d,z)\;\;=\;\;0,

1959: \end{eqnarray*}

1960: since the integral of a density is~1. Thus

1961: $\dot{X}_{\zeta_0}^{+}(\gamma_0,\Gamma_0)<0$.

1962:

1963: A similar argument is

1964: used for the left-hand derivative. In this case, the true density

1965: of $(V,\delta,Z)$ given $Y=\zeta_0$ is $\ell_1^{\psi_0}(v,d,z)

1966: \ell_0^{-}(v,d,z)$, where $\ell_0^{-}$ does not involve the parameters.

1967: We now have

1968: \begin{eqnarray*}

1969: \lefteqn{P[l_1^{\psi}(V,\delta,Z)|Y=\zeta_0]-

1970: P[l_2^{\psi}(V,\delta,Z)|Y=\zeta_0]}\mbox{\hspace{1.0cm}}&&\\

1971: &=&\int\left[ l_1^{\psi_0}(v,d,z)

1972: -l_2^{\psi_0}(v,d,z)\right]\ell_1(v,d,z)\ell_0^{-}(v,d,z)d\mu(v,d,z)\\

1973: &=&-\int\log\left[\frac{\ell_2\ell_0^{-}}{\ell_1\ell_0^{-}}\right]

1974: \ell_1\ell_0^{-}d\mu

1975: \;\;>\;\;-\log\int\left[\frac{\ell_2\ell_0^{-}}{\ell_1\ell_0^{-}}\right]

1976: \ell_1\ell_0^{-}d\mu\\

1977: &=&-\log\int\ell_2(v,d,z)\ell_0^{-}(v,d,z)d\mu(v,d,z)\;\;=\;\;0,

1978: \end{eqnarray*}

1979: and thus we conclude that $\dot{X}_{\zeta_0}^{-}(\gamma_0,\Gamma_0)>0$.$\Box$
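
Both strict inequalities in this proof are instances of Jensen's inequality
for the strictly concave function $u\mapsto\log(u)$. As a sketch in generic
notation (the densities $p$ and $q$ are placeholders introduced here only for
illustration), suppose $p$ and $q$ are probability densities with respect
to~$\mu$ having common support. Then
\[\int\log\left[\frac{p}{q}\right]q\,d\mu\;\leq\;
\log\int\left[\frac{p}{q}\right]q\,d\mu\;=\;\log\int p\,d\mu\;=\;0,\]
with strict inequality unless $p/q$ is constant almost surely under~$q$.
Taking $p=\ell_1\ell_0^{+}$ and $q=\ell_2\ell_0^{+}$ gives the right-hand case,
and exchanging the roles of $\ell_1$ and $\ell_2$ (with $\ell_0^{-}$ in place of
$\ell_0^{+}$) gives the left-hand case. Strictness relies on $\ell_1$ and
$\ell_2$ differing on a set of positive measure, which we take to be the
situation when a change-point is present.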

1980:

1981: {\it Proof of lemma~\ref{l7}.} This follows from lemma~\ref{l6}, the local

1982: concavity of $\tilde{X}$, and the

1983: smoothness of the derivatives involved.$\Box$

1984:

1985: {\it Proof of lemma~\ref{l8}.}

1986: Note that $\tilde{X}_n(\zeta,\eta,\Gamma)$

1987: \[=\mathbb{P}_n\left[-\int_0^{\tau}\left\{\Gamma(t)-\Gamma_0(t)\right\}dN(t)

1988: +\tilde{W}(\zeta,\eta,A_n^{(\Gamma)})

1989: -\tilde{W}(\zeta_0,\eta_0,A_n^{(\Gamma_0)})\right],\]

1990: where

1991: $\tilde{W}(\zeta,\gamma,A)\equiv l_1^{\psi}(V,\delta,Z)\ind\{Y\leq\zeta\}

1992: +l_2^{\psi}(V,\delta,Z)\ind\{Y>\zeta\}$. The classes

1993: \[\left\{\int_0^{\tau}\left\{\Gamma(t)-\Gamma_0(t)\right\}dN(t):

1994: \|\Gamma-\Gamma_0\|_{\infty}\leq\epsilon,\|\Gamma\|_v\leq k_0\right\},\]

1995: for any $\epsilon>0$, and

1996: $\left\{\tilde{W}(\zeta,\lambda):(\zeta,\lambda)

1997: \in B_{\epsilon_2}^{\ast k_0}\right\}$, for some $\epsilon_2>0$,

1998: can be shown to be Donsker. That this holds for the second class

1999: follows from arguments similar to those used in the proof

2000: of lemma~\ref{l.t1.1}. For the first class, note that

2001: $\int_0^{\tau}\Gamma(t)dN(t)=\delta\Gamma(V)$. Since $\|\Gamma\|_v\leq k_0$,

2002: $\Gamma$ can be written as the

2003: difference between two monotone increasing functions,

2004: each with total variation bounded by $k_0$. By theorem~2.7.5 of \cite{vw96},

2005: the class of all monotone functions with a given compact range is universally

2006: Donsker. Since sums of Donsker classes are Donsker, we have that the class

2007: $\{\Gamma(V):\|\Gamma\|_v\leq k_0\}$ is Donsker. That the first class

2008: is Donsker now follows since products of bounded Donsker classes are Donsker.

2009: Since we also have that

2010: $\sqrt{n}(\tilde{G}_n-\tilde{G}_0)$ converges to a Gaussian process,

2011: we have that

2012: \[\sqrt{n}(\mathbb{P}_n-P)

2013: \left[-\int_0^{\tau}\left\{\Gamma(t)-\Gamma_0(t)\right\}dN(t)

2014: +\tilde{W}(\zeta,\eta,A_n^{(\Gamma)})

2015: -\tilde{W}(\zeta_0,\eta_0,A_n^{(\Gamma_0)})\right]\]

2016: converges weakly in $\ell^{\infty}(B_{\epsilon_2}^{\ast k_0})$

2017: to the tight Gaussian process

2018: \[\mathbb{G}\left[-\int_0^{\tau}\left\{\Gamma(t)-\Gamma_0(t)\right\}dN(t)

2019: +\tilde{W}(\zeta,\eta,A_0^{(\Gamma)})

2020: -\tilde{W}(\zeta_0,\eta_0,A_0^{(\Gamma_0)})\right],\]

2021: where $\mathbb{G}$ is the Brownian bridge measure.

2022:

2023: By the smoothness of the functions and derivatives involved,

2024: we also have

2025: $\sqrt{n}\left\{P\left[-\int_0^{\tau}\left\{\Gamma(t)-\Gamma_0(t)\right\}dN(t)

2026: +\tilde{W}(\zeta,\eta,A_n^{(\Gamma)})

2027: -\tilde{W}(\zeta_0,\eta_0,A_n^{(\Gamma_0)})\right]-\right.$

2028: \newline $\left.\tilde{X}(\zeta,\eta,\Gamma)\right\}\;\;=\;\;

2029: \sqrt{n}P\left[\tilde{W}(\zeta,\eta,A_n^{(\Gamma)})

2030: -\tilde{W}(\zeta_0,\eta_0,A_n^{(\Gamma_0)})

2031: -\tilde{W}(\zeta,\eta,A_0^{(\Gamma)})\right.$ \newline

2032: $\left.+\tilde{W}(\zeta_0,\eta_0,A_0)\right]\;\;=\;\;$

2033: $-\sqrt{n}\int_0^{\tau}\left\{P[W(t;\theta_0(\zeta,\lambda))]e^{-\Gamma(t)}

2034: -P[W(t;\theta_0)]e^{-\Gamma_0(t)}\right\}$ $\times

2035: \left[d\tilde{G}_n(t)-d\tilde{G}_0(t)\right]+\epsilon_n(\zeta,\lambda)

2036: \equiv-\int_0^{\tau}\tilde{C}(t;\zeta,\lambda)d{\cal Z}_n(t)+

2037: \epsilon_n(\zeta,\lambda)$,

2038: where $\|\epsilon_n\|_{\infty}$ $=o_P(1)$. The fact that the class

2039: of functions $\{\tilde{C}(\cdot;\zeta,\lambda):(\zeta,\lambda)\in

2040: B_{\epsilon_2}^{\ast k_0}\}$ has uniformly bounded total variation yields

2041: asymptotic linearity and normality of $\left\{

2042: \int_0^{\tau}\tilde{C}(t;\zeta,\lambda)d{\cal Z}_n(t):(\zeta,\lambda)

2043: \in B_{\epsilon_2}^{\ast k_0}\right\}$,

2044: and the desired result follows.$\Box$

2045:

2046: {\it Proof of theorem~\ref{t.l9}.}  By lemma~\ref{l8},

2047: \[-\tilde{X}(\hat{\zeta}_n,\hat{\gamma}_n,\hat{\Gamma}_n)

2048: =(\tilde{X}_n-\tilde{X})(\hat{\zeta}_n,\hat{\gamma}_n,\hat{\Gamma}_n)

2049: -\tilde{X}_n(\hat{\zeta}_n,\hat{\gamma}_n,\hat{\Gamma}_n)\leq O_P(n^{-1/2}).\]

2050: Combining this with lemma~\ref{l7}, we obtain

2051: $\sqrt{n}|\hat{\zeta}_n-\zeta_0|$

2052: \begin{eqnarray*}

2053: &=&\sqrt{n}|\hat{\zeta}_n-\zeta_0|\ind\{(\hat{\zeta}_n,\hat{\gamma}_n,

2054: \hat{\Gamma}_n)\in B_{\epsilon_1}^{\ast k_0}\}+

2055: \sqrt{n}|\hat{\zeta}_n-\zeta_0|\ind\{(\hat{\zeta}_n,\hat{\gamma}_n,

2056: \hat{\Gamma}_n)\not\in B_{\epsilon_1}^{\ast k_0}\}\\

2057: &\leq&-\sqrt{n}k_1^{-1}\tilde{X}(\hat{\zeta}_n,\hat{\gamma}_n,

2058: \hat{\Gamma}_n)+o_P(1)\\

2059: &\leq& O_P(1).

2060: \end{eqnarray*}

2061: Thus the first part of the theorem is proved.

2062:

2063: For the second part, denote $U_{0\zeta}^{\tau}(\psi)\equiv

2064: P U_{\zeta}^{\tau}(\psi)$. By arguments similar to those used

2065: in the proof of lemma~\ref{l.t1.1}, we can verify that for some $e_1>0$,

2066: ${\cal F}\equiv

2067: \{U_{\zeta}^{\tau}(\psi)(h):\|\theta-\theta_0\|_{\infty}\leq e_1,

2068: h\in{\cal H}_1\}$ is Donsker. Moreover, the continuity of the

2069: functions involved also yields that, as

2070: $\|\theta-\theta_0\|_{\infty}\rightarrow 0$,

2071: $\sup_{h\in{\cal H}_1}

2072: P\left(U_{\zeta}^{\tau}(\psi)(h)-U_{\zeta_0}^{\tau}(\psi_0)(h)\right)^2

2073: \rightarrow 0$. Thus

2074: \begin{eqnarray}

2075: \sqrt{n}\left(U_{n\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)

2076: -U_{0\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)-U_{n\zeta_0}^{\tau}(\psi_0)

2077: +U_{0\zeta_0}^{\tau}(\psi_0)\right)&=&o_P^{{\cal H}_1}(1).\label{l9.e1}

2078: \end{eqnarray}

2079: Note also that $\sqrt{n}|\hat{\zeta}_n-\zeta_0|=O_P(1)$ implies that

2080: $\sqrt{n}\left(U_{0\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)

2081: -U_{0\zeta_0}^{\tau}(\hat{\psi}_n)\right)=o_P^{{\cal H}_1}(1)$.

2082: Thus, since

2083: $U_{n\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)=0$, (\ref{l9.e1})~implies

2084: $\sqrt{n}U_{0\zeta_0}^{\tau}(\hat{\psi}_n)=$

2085: \[\sqrt{n}U_{0\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)+o_P^{{\cal H}_1}(1)

2086: =-\sqrt{n}\left(U_{n\zeta_0}^{\tau}(\psi_0)-U_{0\zeta_0}^{\tau}(\psi_0)\right)

2087: +o_P^{{\cal H}_1}(1)=O_P^{{\cal H}_1}(1),\]

2088: where $O_P^{B}(1)$ denotes a term bounded in probability

2089: uniformly over the set $B$. By lemma~\ref{l4}, we know that there

2090: exists a constant $e_2>0$ such that

2091: \[\|U_{0\zeta_0}^{\tau}(\psi)-

2092: U_{0\zeta_0}^{\tau}(\psi_0)\|_{{\cal H}_1}\geq

2093: e_2\|\psi-\psi_0\|_{\infty}

2094: +o(\|\psi-\psi_0\|_{\infty}),\]

2095: as $\|\psi-\psi_0\|_{\infty}\rightarrow 0$.

2096: Hence $\sqrt{n}\|\hat{\psi}_n-\psi_0\|_{\infty}(e_2-o_P(1))\leq O_P(1)$,

2097: and we obtain the second conclusion of the theorem.

2098:

2099: For the third part, we have

2100: \[\sqrt{n}\sup_{t\in[0,\tau]}\left|

2101: \pp_n W(t;\hat{\theta}_n)-PW(t;\hat{\theta}_n)

2102: \right|=\sqrt{n}\sup_{t\in[0,\tau]}|(\pp_n-P)W(t;\theta_0)|+o_P(1)\]

2103: $=O_P(1)$ and $\sqrt{n}\sup_{t\in[0,\tau]}|PW(t;\hat{\theta}_n)-

2104: PW(t;\theta_0)|=O_P(1)$ by the first two parts of this theorem.

2105: Hence $\sqrt{n}\sup_{t\in[0,\tau]}

2106: \left|\pp_n W(t;\hat{\theta}_n)-PW(t;\theta_0)\right|=O_P(1)$.

2107: The result now follows by the Lipschitz continuity of $\log(u)$

2108: over strictly positive compact intervals.$\Box$
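
The Lipschitz property invoked in the last step is elementary. As a brief
sketch, with generic constants $0<c\leq C<\infty$ introduced here only for
illustration, the mean value theorem gives, for all $u,v\in[c,C]$,
\[|\log(u)-\log(v)|=\frac{|u-v|}{u^{\ast}}\leq\frac{|u-v|}{c},\]
for some $u^{\ast}$ between $u$ and $v$. A $\sqrt{n}$ rate for the argument
therefore transfers directly to its logarithm whenever both arguments remain
in $[c,C]$.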

2109:

2110: {\it Proof of lemma~\ref{l10}.}

2111: The first inequality follows from the definitions.

2112: For the second inequality, we use a Taylor expansion around

2113: $(\hat{\zeta}_n,\hat{\gamma}_n,\hat{\Gamma}_n)$ to obtain

2114: $\tilde{X}_n(\hat{\zeta}_n,\hat{\lambda}_n)-\tilde{X}_n(\hat{\zeta}_n,

2115: \lambda_0)=$

2116: \[-\pp_n U_{\hat{\zeta}_n}^{\tau}(\hat{\gamma}_n,

2117: A_n^{(\hat{\Gamma}_n)})(\lambda_0-\hat{\lambda}_n)

2118: -\psi_{n,t}^{(\lambda_0-\hat{\lambda}_n)}\left(\pp_n

2119: \hat{\sigma}_{

2120: \left(\hat{\zeta}_n,\hat{\gamma}_{n,t},A_n^{(\hat{\Gamma}_{n,t})}\right)}

2121: \right)(\lambda_0-\hat{\lambda}_n),\]

2122: for some $t\in[0,1]$, where $\hat{\lambda}_{n,t}\equiv

2123: (\hat{\gamma}_{n,t},\hat{\Gamma}_{n,t})$; $\hat{\gamma}_{n,t}\equiv

2124: t\hat{\gamma}_n+(1-t)\gamma_0$; $\hat{\Gamma}_{n,t}\equiv

2125: t\hat{\Gamma}_n+(1-t)\Gamma_0$; and,

2126: for any $h\in{\cal H}_{\infty}$,

2127: $\psi_{n,t}^{(h)}\equiv\left(h_1,h_2,h_3,\int_0^{(\cdot)}h_4(s)

2128: dA_n^{(\hat{\Gamma}_{n,t})}(s)\right)$.

2129: The score term is zero by definition of the NPMLE, and the second term

2130: has absolute value bounded by $\hat{K}_n

2131: \|\hat{\lambda}_n-\lambda_0\|_{\infty}^2$,

2132: where $\hat{K}_n$ is bounded in probability

2133: by the uniform consistency of $\hat{\lambda}_n$ and by

2134: the form of the information terms listed in section~5.2.

2135:

2136: Now, letting $\psi_n(\gamma,\Gamma)\equiv(\gamma,A_n^{(\Gamma)})$, we have

2137: $\tilde{X}_n(\hat{\zeta}_n,\lambda_0)-\tilde{X}_n^{\ast}(\hat{\zeta}_n)$

2138: \begin{eqnarray}

2139: &&\label{l10.e1}\\

2140: &&=\pp_n\left\{\left(\ind\{Y\leq\hat{\zeta}_n\}-\ind\{Y\leq\zeta_0\}\right)

2141: \right.\nonumber\\

2142: &&\left.\mbox{\hspace{0.1in}}

2143: \times\left[l_1^{\psi_n(\gamma_0,\Gamma_0)}(V,\delta,Z)

2144: -l_2^{\psi_n(\gamma_0,\Gamma_0)}(V,\delta,Z)

2145: -l_1^{\psi_0}(V,\delta,Z)+l_2^{\psi_0}(V,\delta,Z)\right]\right\}\nonumber

2146: \end{eqnarray}

2147: $=\int_0^{\tau}\pp_n\left\{

2148: \left(\ind\{Y\leq\hat{\zeta}_n\}-\ind\{Y\leq\zeta_0\}\right)

2149: \tilde{Y}(s)\tilde{K}_n(s)\right\}e^{-\Gamma_0(s)}

2150: \left[d\tilde{G}_n(s)-d\tilde{G}_0(s)\right]$,

2151: where

2152: \begin{eqnarray*}

2153: \tilde{K}_n(s)&=&\left[\dot{G}(H_1^{\psi_{n,t}}(V))

2154: -\delta\frac{\ddot{G}(H_1^{\psi_{n,t}}(V))}{\dot{G}(H_1^{\psi_{n,t}}(V))}

2155: \right]e^{\beta_0'Z(s)}\\

2156: &&-\left[\dot{G}(H_2^{\psi_{n,t}}(V))

2157: -\delta\frac{\ddot{G}(H_2^{\psi_{n,t}}(V))}{\dot{G}(H_2^{\psi_{n,t}}(V))}

2158: \right]e^{\beta_0'Z(s)+\alpha_0+\eta_0'Z_2(s)}

2159: \end{eqnarray*}

2160: and $\psi_{n,t}\equiv\left(\gamma,\int_0^{(\cdot)}\Gamma_0(u)\left[

2161: td\tilde{G}_n(u)+(1-t)d\tilde{G}_0(u)\right]\right)$, for

2162: some $t\in[0,1]$, by the mean value theorem.

2163: By the conditions given in section~2, we have that there is a

2164: constant $k^{\ast}<\infty$ such that

2165: $\|\tilde{K}_n(s)\Gamma_0(s)\|_{v}\leq k^{\ast}$

2166: with probability~1 for all $n\geq 1$. Thus the absolute value

2167: of~(\ref{l10.e1}) is bounded above by

2168: $k^{\ast}\|\tilde{G}_n-\tilde{G}_0\|_{\infty}\times\pp_n

2169: \left|\ind\{Y\leq\hat{\zeta}_n\}-\ind\{Y\leq\zeta_0\}\right|

2170: =O_P(n^{-1})$.

2171: This last statement follows because $\|\tilde{G}_n-\tilde{G}_0\|_{\infty}

2172: =O_P(n^{-1/2})$,

2173: $(\pp_n-P)\left|\ind\{Y\leq\hat{\zeta}_n\}-\ind\{Y\leq\zeta_0\}\right|

2174: =o_P(n^{-1/2})$, and $P\left|

2175: \ind\{Y\leq\hat{\zeta}_n\}-\ind\{Y\leq\zeta_0\}\right|=O_P(n^{-1/2})$

2176: by theorem~\ref{t.l9}. Now the desired result follows.$\Box$
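
For convenience, the rate bookkeeping in the last step can be laid out as
follows, where the shorthand $\Delta_n$ is introduced here only for
illustration. With $\Delta_n\equiv\left|\ind\{Y\leq\hat{\zeta}_n\}
-\ind\{Y\leq\zeta_0\}\right|$,
\[\pp_n\Delta_n=(\pp_n-P)\Delta_n+P\Delta_n
=o_P(n^{-1/2})+O_P(n^{-1/2})=O_P(n^{-1/2}),\]
so that $k^{\ast}\|\tilde{G}_n-\tilde{G}_0\|_{\infty}\times\pp_n\Delta_n
=O_P(n^{-1/2})\times O_P(n^{-1/2})=O_P(n^{-1})$.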

2177:

2178: {\it Proof of lemma~\ref{l11}.} Note first that

2179: \[\tilde{D}_n(\zeta)=\sqrt{n}(\mathbb{P}_n-P)\left\{\left[

2180: \ind\{Y\leq \zeta\}-\ind\{Y\leq\zeta_0\}\right]\times

2181: \left[l_1^{\psi_0}-l_2^{\psi_0}\right](V,\delta,Z)\right\}.\]

2182: Denote $\tilde{H}\equiv[l_1^{\psi_0}-l_2^{\psi_0}](V,\delta,Z)$,

2183: and note that $|\tilde{H}|\leq c_{\ast}$ almost surely

2184: for a fixed constant $c_{\ast}<\infty$. Thus

2185: $F_{\epsilon}\equiv\ind\{\zeta_0-\epsilon\leq Y

2186: \leq\zeta_0+\epsilon\}c_{\ast}$

2187: serves as an envelope for

2188: the class of functions

2189: \[{\cal F}_{\epsilon}\equiv\{\left[\ind\{Y\leq\zeta\}

2190: -\ind\{Y\leq\zeta_0\}\right]\tilde{H}:|\zeta-\zeta_0|\leq\epsilon\},\]

2191: for each $\epsilon>0$.

2192: Note that by the assumptions on the density $\tilde{h}$ in a neighborhood

2193: of $\zeta_0$, we have for some $\epsilon_3>0$ that there exist

2194: $0<k_{\ast},k_{\ast\ast}<\infty$ such that $k_{\ast}\epsilon\leq

2195: \tilde{p}(\epsilon)\equiv

2196: P[\zeta_0-\epsilon\leq Y\leq\zeta_0+\epsilon]\leq k_{\ast\ast}\epsilon$

2197: for all $0\leq\epsilon\leq\epsilon_3$.

2198: Thus the bracketing entropy

2199: \[N_{[]}(u\|F_{\epsilon}\|_{P,2},{\cal F}_{\epsilon},L_2(P))\leq

2200: O\left(\frac{\epsilon}{u^2\tilde{p}(\epsilon)}\right)\leq

2201: O\left(\frac{1}{c_{\ast}u^2}\right),\]

2202: for all $u>0$ and $0\leq\epsilon\leq\epsilon_3$;

2203: and thus, by theorem~2.14.2 of \cite{vw96},

2204: there exists a $c_{\ast\ast}<\infty$

2205: such that

2206: \[E\left[\sup_{|\zeta-\zeta_0|\leq\epsilon}

2207: |\tilde{D}_n(\zeta)|\right]\leq c_{\ast\ast}\|F_{\epsilon}\|_{P,2}

2208: \leq c_{\ast\ast}c_{\ast}\sqrt{k_{\ast\ast}\epsilon},\]

2209: for all $0\leq\epsilon\leq\epsilon_3$.

2210: The result now follows for $k_2=c_{\ast\ast}c_{\ast}

2211: \sqrt{k_{\ast\ast}}$.$\Box$
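
The final bound on the envelope norm uses only the form of $F_{\epsilon}$.
Spelled out as a short check,
\[\|F_{\epsilon}\|_{P,2}^2=c_{\ast}^2
P\left[\zeta_0-\epsilon\leq Y\leq\zeta_0+\epsilon\right]
=c_{\ast}^2\tilde{p}(\epsilon)\leq c_{\ast}^2k_{\ast\ast}\epsilon,\]
for all $0\leq\epsilon\leq\epsilon_3$, and taking square roots gives
$\|F_{\epsilon}\|_{P,2}\leq c_{\ast}\sqrt{k_{\ast\ast}\epsilon}$.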

2212:

2213: {\it Proof of theorem~\ref{t3}.} We can deduce from section~3 that

2214: \begin{eqnarray*}

2215: \lefteqn{\tilde{L}_n(\hat{\psi}_n,\zeta_{n,u})

2216: -\tilde{L}_n(\hat{\psi}_n,\zeta_0)}

2217: &&\\

2218: &\mbox{\hspace{0.5cm}}=&

2219: \mathbb{P}_n\left\{\left(\ind\{\zeta_{n,u}<Y\leq\zeta_0\}

2220: -\ind\{\zeta_0<Y\leq\zeta_{n,u}\}\right)

2221: \left[l_2^{\hat{\psi}_n}-l_1^{\hat{\psi}_n}\right](V,\delta,Z)\right\}\\

2222: &\mbox{\hspace{0.5cm}}=&n^{-1}Q_n(u)+\hat{E}_n(u),\;\;\;\;\mbox{where}

2223: \end{eqnarray*}

2224: $\hat{E}_n(u)\equiv\mathbb{P}_n\left\{\left(\ind\{Y\leq\zeta_0\}

2225: -\ind\{Y\leq\zeta_{n,u}\}\right)\left[l_2^{\hat{\psi}_n}-l_2^{\psi_0}

2226: -l_1^{\hat{\psi}_n}+l_1^{\psi_0}\right](V,\delta,Z)\right\}$.

2227: By arguments similar to those used in the proof of lemma~\ref{l10},

2228: we can obtain constants $0<F_1,F_2<\infty$ such that

2229: $\left|l_j^{\hat{\psi}_n}(V,\delta,Z)-l_j^{\psi_0}(V,\delta,Z)\right|

2230: \leq F_j\|\hat{\psi}_n-\psi_0\|_{\infty}$ almost surely, for $j=1,2$. Hence

2231: \[|\hat{E}_n(u)|\leq\pp_n\left|\ind\{Y\leq\zeta_0\}

2232: -\ind\{Y\leq\zeta_{n,u}\}\right|O_P(n^{-1/2}).\]

2233: By arguments given in the proof of lemma~\ref{l11}, we know that

2234: \[(\pp_n-P)\left|\ind\{Y\leq\zeta_0\}

2235: -\ind\{Y\leq\zeta_{n,u}\}\right|=O_P^{\mathbb{U}_{n,M}}(n^{-1}).\]

2236: Since also $\sup_{u\in\mathbb{U}_{n,M}}P\left|\ind\{Y\leq\zeta_0\}

2237: -\ind\{Y\leq\zeta_{n,u}\}\right|=O(n^{-1})$ by condition B2(i),

2238: we now have that $\hat{E}_n=O_P^{\mathbb{U}_{n,M}}(n^{-3/2})$.

2239: The desired result now follows.$\Box$
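
To see why the remainder is negligible, the three rates above can be combined
as in the following sketch, where $\Delta_{n,u}$ is shorthand introduced here
only for illustration. With $\Delta_{n,u}\equiv\left|\ind\{Y\leq\zeta_0\}
-\ind\{Y\leq\zeta_{n,u}\}\right|$, we have, uniformly over
$u\in\mathbb{U}_{n,M}$,
\[|\hat{E}_n(u)|\leq\left\{(\pp_n-P)\Delta_{n,u}+P\Delta_{n,u}\right\}
O_P(n^{-1/2})=\left\{O_P(n^{-1})+O(n^{-1})\right\}O_P(n^{-1/2})
=O_P(n^{-3/2}).\]
Consequently $n\hat{E}_n(u)=O_P(n^{-1/2})$, which vanishes relative to the
leading term $Q_n(u)$ after the likelihood difference is multiplied by~$n$.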

2240:

2241: {\it Proof of theorem~\ref{t4}.} Fix $h\in{\cal H}_{\infty}$.

2242: We first establish that

2243: $\left(Q_n^+,{\cal Z}^n(h)\equiv\right.$

2244: $\left.\sqrt{n}\pp_n U_{\zeta_0}^{\tau}(\psi_0)(h)\right)$

2245: converges weakly to $(Q^+,{\cal Z}(h))$, on $D_M\times\re$,

2246: where $Q^+$ and ${\cal Z}(h)$ are independent, for each

2247: fixed $M<\infty$, and ${\cal Z}(h)$ is mean zero Gaussian with

2248: variance $\tilde{\sigma}_h^2\equiv\mbox{var}[U_{\zeta_0}^{\tau}(\psi_0)(h)]$.

2249: Accordingly, fix $M$, and

2250: let $0=u_0<u_1<u_2<\cdots<u_J\leq M$ be a finite collection of

2251: points and $q_1,\ldots,q_J,\tilde{q}$ be arbitrary real numbers. Our plan

2252: is to first show that the characteristic function of

2253: $(Q_n^+(u_1),\ldots,Q_n^+(u_J),{\cal Z}^n(h))$ converges to that of

2254: $(Q^+(u_1),\ldots,Q^+(u_J))$ times that of ${\cal Z}(h)$.

2255: Since the choice of points

2256: $u_1,\ldots,u_J$ is arbitrary, this will imply convergence

2257: of all finite-dimensional distributions. We will then show

2258: that $Q_n^+$ is asymptotically tight, and this will imply

2259: the desired weak convergence.

2260:

2261: Let $y\mapsto I_{nj}(y)\equiv\ind\{\zeta_0+u_{j-1}/n<y\leq\zeta_0+u_j/n\}$,

2262: $j=1,\ldots,J$; and

2263: $F_i\equiv[l_1^{\psi_0}-l_2^{\psi_0}](V_i,\delta_i,Z_i)$ and

2264: ${\cal Z}_i\equiv U_{\zeta_0}^{\tau}(\psi_0)(h)(X_i)$,

2265: $i=1,\ldots,n$. In other words, ${\cal Z}_i$ is the score contribution

2266: from the $i$th observation. Thus

2267: \begin{eqnarray}

2268: \lefteqn{P\exp\left[i\left\{\sum_{j=1}^Jq_j[Q_n^+(u_j)-Q_n^+(u_{j-1})]

2269: +\tilde{q}{\cal Z}^n(h)\right\}\right]}\mbox{\hspace{1.0in}}&&\label{t4.e1}\\

2270: &=&\prod_{k=1}^n P\left[\exp\left\{\sum_{j=1}^J i q_jI_{nj}(Y_k)F_k\right\}

2271: e^{i\tilde{q}{\cal Z}_k/\sqrt{n}}\right].\nonumber

2272: \end{eqnarray}

2273: However, using the facts that

2274: $e^{\sum_j w_j}-1=\sum_j(e^{w_j}-1)$ when

2275: only one of the $w_j$'s differs from zero and

2276: $e^{uv}-1=u(e^{v}-1)$ when $u$ is dichotomous, we have

2277: $\exp\left\{\sum_{j=1}^J iq_jI_{nj}(Y_k)F_k\right\}

2278: =1+\sum_{j=1}^J\left(e^{iq_j I_{nj}(Y_k)F_k}-1\right)

2279: =1+\sum_{j=1}^JI_{nj}(Y_k)\left(e^{iq_jF_k}-1\right)$.

2280: Combining this with condition~B2 and the boundedness of

2281: $F_k$ and ${\cal Z}_k$, we obtain

2282: $P\left[\exp\left\{\sum_{j=1}^J i q_jI_{nj}(Y_k)F_k\right\}

2283: e^{i\tilde{q}{\cal Z}_k/\sqrt{n}}\right]$

2284: \begin{eqnarray*}

2285: &=&Pe^{i\tilde{q}{\cal Z}_k/\sqrt{n}}+

2286: \sum_{j=1}^J\frac{(u_j-u_{j-1})\tilde{h}(\zeta_0)}{n}

2287: P\left[\left.\left(e^{iq_jF_k}-1\right)e^{i\tilde{q}{\cal Z}_k/\sqrt{n}}

2288: \right|Y=\zeta_0+\right]\\

2289: &&+o(n^{-1})\\

2290: &=&1+n^{-1}\left[-\frac{\tilde{q}^2\tilde{\sigma}_h^2}{2}

2291: +\tilde{h}(\zeta_0)\sum_{j=1}^J(u_j-u_{j-1})\{\phi^+(q_j)-1\}\right]

2292: +o(n^{-1}),

2293: \end{eqnarray*}

2294: where $o(n^{-1})$ denotes a quantity that goes to zero uniformly

2295: over $k=1,\ldots,n$ after multiplication by~$n$. Thus the right-hand side of~(\ref{t4.e1}) converges to

2296: \[\exp\left[\frac{-\tilde{q}^2\tilde{\sigma}_h^2}{2}+

2297: \tilde{h}(\zeta_0)\sum_{j=1}^J(u_j-u_{j-1})\{\phi^+(q_j)-1\}\right],\]

2298: which is precisely

2299: \[P\exp\left[i\tilde{q}{\cal Z}(h)+i\sum_{j=1}^Jq_j\left\{

2300: Q^+(u_j)-Q^+(u_{j-1})\right\}\right].\]

2301: Thus the finite dimensional distributions converge as desired.
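
The passage from the per-observation expansion to the exponential limit is
routine, and a sketch is recorded here for completeness. The shorthand $a$ is
introduced only for illustration, and we assume, as the product form
of~(\ref{t4.e1}) suggests, that the observations are i.i.d. Write
\[a\equiv-\frac{\tilde{q}^2\tilde{\sigma}_h^2}{2}
+\tilde{h}(\zeta_0)\sum_{j=1}^J(u_j-u_{j-1})\{\phi^+(q_j)-1\}.\]
Each of the $n$ factors in the product then equals $1+a/n+o(n^{-1})$, so that
\[\prod_{k=1}^n\left[1+\frac{a}{n}+o(n^{-1})\right]
=\exp\left\{n\log\left(1+\frac{a}{n}+o(n^{-1})\right)\right\}
=\exp\left\{a+o(1)\right\}\rightarrow e^{a},\]
where $\log$ is the principal branch of the complex logarithm, which is well
defined here because each factor is close to~1 for large~$n$.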

2302:

2303: We next need to verify that $Q_n^+$ is asymptotically tight

2304: on~$[0,M]$. Since there exists

2305: a constant $c_{\ast}<\infty$ such that $\max_{1\leq i\leq n}

2306: |F_i|\leq c_{\ast}<\infty$ almost surely, we have that

2307: $|Q_n^+(u_2)-Q_n^+(u_1)|\leq c_{\ast}n\pp_n

2308: \ind\{\zeta_0+u_1/n<Y\leq\zeta_0+u_2/n\}$,

2309: for all $0\leq u_1<u_2\leq M$. Thus we are done if we can show

2310: that $u\mapsto\tilde{R}_n(u)\equiv n\pp_n\ind\{\zeta_0<Y\leq\zeta_0+u/n\}$ is

2311: tight on $[0,M]$. To this end, fix $0\leq u_1<u_2\leq M$. Now,

2312: the expectation of $|\tilde{R}_n(u_2)-\tilde{R}_n(u_1)|$ is

2313: $nP\{\zeta_0+u_1/n<Y\leq\zeta_0+u_2/n\}\rightarrow

2314: |u_2-u_1|\tilde{h}(\zeta_0)$, as $n\rightarrow\infty$.

2315: This implies the desired tightness since

2316: $u\mapsto\tilde{R}_n(u)$ is monotone. We have now established that

2317: $\left(Q_n^+,{\cal Z}^n(h)\right)$

2318: converges weakly to $(Q^+,{\cal Z}(h))$, on $D_M\times\re$,

2319: where $Q^+$ and ${\cal Z}(h)$ are independent, for each

2320: fixed $M<\infty$. Similar arguments also yield the weak convergence

2321: of $\left(Q_n^-,{\cal Z}^n(h)\right)$ to

2322: $(Q^-,{\cal Z}(h))$, on $D_M\times\re$,

2323: where $Q^-$ and ${\cal Z}(h)$ are again independent, for each

2324: fixed $M<\infty$. Thus also $\left(Q_n,{\cal Z}^n(h)\right)$ converges

2325: weakly to $(Q,{\cal Z}(h))$, on $D_M\times\re$,

2326: where $Q$ and ${\cal Z}(h)$ are independent, for each

2327: fixed $M<\infty$. Since $n(\hat{\zeta}_n-\zeta_0)=O_P(1)$, the

2328: argmax continuous mapping theorem (theorem~3.2.2 of \cite{vw96}) now yields

2329: that $\left(n(\hat{\zeta}_n-\zeta_0),{\cal Z}^n(h)\right)$

2330: converges weakly to $\left(\argmax\,Q,{\cal Z}(h)\right)$, with

2331: the desired asymptotic independence. The remaining

2332: results follow.$\Box$

2333:

2334: {\it Proof of theorem~\ref{t5}.} We have

2335: \begin{eqnarray*}

2336: 0&=&\sqrt{n}\pp_n U_{\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)\\

2337: &=&\sqrt{n}\pp_n U_{\zeta_0}^{\tau}(\hat{\psi}_n)+

2338: \sqrt{n}(\pp_n-P)\left(U_{\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)

2339: -U_{\zeta_0}^{\tau}(\hat{\psi}_n)\right)\\

2340: &&+\sqrt{n}

2341: P\left(U_{\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)

2342: -U_{\zeta_0}^{\tau}(\hat{\psi}_n)\right)\\

2343: &\equiv&\sqrt{n}\pp_n U_{\zeta_0}^{\tau}(\hat{\psi}_n)+B_{1,n}+B_{2,n},

2344: \end{eqnarray*}

2345: where the index set for the score terms is ${\cal H}_1$.

2346: By arguments similar to those used in the proof of theorem~\ref{t.l9},

2347: combined with the fact that $n(\hat{\zeta}_n-\zeta_0)=O_P(1)$, we have

2348: that both $B_{1,n}=o_P^{{\cal H}_1}(1)$ and $B_{2,n}=o_P^{{\cal H}_1}(1)$.

2349: Thus $\sqrt{n}\pp_n U_{\zeta_0}^{\tau}(\hat{\psi}_n)=o_P^{{\cal H}_1}(1)$.

2350: We also have that

2351: \[\sqrt{n}(\pp_n-P)U_{\zeta_0}^{\tau}(\hat{\psi}_n)-

2352: \sqrt{n}(\pp_n-P)U_{\zeta_0}^{\tau}(\psi_0)=o_P^{{\cal H}_1}(1).\]

2353: Combining this with lemma~\ref{l4}, the Z-estimator master theorem

2354: (theorem~3.3.1 of \cite{vw96}) now yields the desired results.$\Box$

2355:

2356: {\it Proof of corollary~\ref{c1}.} We first derive the unconditional

2357: limiting distribution of $\sqrt{n}(\hat{\psi}_n^{\circ}-\psi_0)$.

2358: If a class of measurable functions

2359: ${\cal F}$ is $P$-Glivenko-Cantelli with $\|P\|_{\cal F}<\infty$, then

2360: the class $\kappa\cdot{\cal F}=\{\kappa f:f\in{\cal F}\}$, where

2361: $\kappa$ denotes a generic version of one of the weights $\kappa_i$,

2362: is also $P$-Glivenko-Cantelli, by theorem~3 of \cite{vw00}.

2363: Thus we can apply the

2364: results of theorem~\ref{t1}, with only minor modification, combined

2365: with the simple fact that $\bar{\kappa}\rightarrow\mu_{\kappa}$

2366: almost surely, to yield that $\hat{\psi}_n^{\circ}\rightarrow\psi_0$

2367: outer almost surely. Note that the proof is made somewhat easier than

2368: before since we already know $\hat{\zeta}_n\rightarrow\zeta_0$

2369: almost surely. Furthermore, if a class of measurable functions~${\cal F}$

2370: is $P$-Donsker with $\|P\|_{\cal F}<\infty$, then the multiplier central

2371: limit theorem (theorem~2.9.2 of \cite{vw96})

2372: yields that the class $\kappa\cdot{\cal F}$ is also $P$-Donsker.

2373: Hence we can apply the results of theorem~\ref{t4}, with only minor

2374: modification, to yield that $\sqrt{n}(\hat{\psi}_n^{\circ}-\psi_0)$

2375: is asymptotically linear with influence function

2376: $\tilde{l}^{\circ}(h)=(\kappa/\mu_{\kappa})U_{\zeta_0}^{\tau}

2377: (\sigma_{\theta_0}^{-1}(h))$, $h\in{\cal H}_1$. The factor

2378: $\mu_{\kappa}^{-1}$ occurs because the information operator for

2379: the weighted version of the likelihood is $\mu_{\kappa}\sigma_{\theta_0}$.

2380: We now have that $\sqrt{n}(\hat{\psi}_n^{\circ}-\hat{\psi}_n)

2381: =\sqrt{n}\pp_n(\kappa/\mu_{\kappa}-1)U_{\zeta_0}^{\tau}

2382: (\sigma_{\theta_0}^{-1}(\cdot))+o_P^{{\cal H}_1}(1)$, unconditionally.

2383:

2384: Finally,

2385: the conditional multiplier central limit theorem (theorem~2.9.6

2386: of \cite{vw96})

yields part~(ii) of the corollary. The factor $(\mu_{\kappa}/\sigma_{\kappa})$

2388: arises because $\mbox{var}(\kappa/\mu_{\kappa})=\sigma_{\kappa}^2

2389: /\mu_{\kappa}^2$. Similar arguments establish~(i) by using

2390: parallel Glivenko-Cantelli and Donsker results for the nonparametric

2391: bootstrapped empirical process.$\Box$

2392:
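{\it Remark (illustrative).} The following is a hedged illustration of the scaling in the proof of corollary~\ref{c1} and is not part of the original argument. The centered, normalized weight $\kappa/\mu_{\kappa}-1$ has mean zero and variance $\sigma_{\kappa}^2/\mu_{\kappa}^2$, so multiplying by $\mu_{\kappa}/\sigma_{\kappa}$ standardizes it. Assuming, purely for illustration, gamma multipliers with shape $a>0$ and unit scale (a choice made here, not one prescribed by the text), we would have
\[\mu_{\kappa}=a,\;\;\;\sigma_{\kappa}^2=a,\;\;\;
\mbox{var}(\kappa/\mu_{\kappa}-1)=\frac{1}{a},\;\;\;
\frac{\mu_{\kappa}}{\sigma_{\kappa}}=\sqrt{a},\]
so that unit exponential weights ($a=1$) need no rescaling, while larger $a$ gives less variable weights and a correspondingly larger correction factor.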

2393: {\it Proof of lemma~\ref{s9.l1}.} Let $\mu(x)$ denote the baseline

2394: measure and $\rho_n(x)$, $\rho(x)$ the density function under

2395: $P_n$ and $P$ respectively. In the general situation,

2396: verifying~(\ref{c8.e1}) is equivalent to finding a function $h$

2397: such that:

2398: \begin{eqnarray*}

2399: \;\;\;\;\lefteqn{\int

2400: \left[ \frac{\left(\frac{dP_n(x)}{d \mu(x)}\right)^{1/2 }-

2401: \left(\frac{dP(x)}{d \mu(x)}\right)^{1/2}}{1/\sqrt{n}} -

2402: \frac{1}{2}h(x)\left(\frac{dP(x)}{d \mu(x)}\right)^{1/2}\right]^2d\mu(x)}&&\\

2403: &=& \int \left[ \frac{\rho_n(x)^{1/2}-\rho(x)^{1/2} }{1/\sqrt{n}}

2404: - \frac{1}{2}h(x)\rho(x)^{1/2} \right]^2 d \mu(x)\\

2405: & \rightarrow & \int \left [ \frac{1}{2}

2406: \frac{\dot {\rho}(x)}{(\rho(x))^{1/2}}-

2407: \frac{1}{2}h(x)\frac{\rho(x)}{(\rho(x))^{1/2}} \right]^2 d\mu(x)\\

2408: &=&\int \left [ \frac{1}{2} \frac{\dot{\rho}(x)}

2409: {\rho(x)}(\rho(x))^{1/2}

2410: -\frac{1}{2}h(x)(\rho(x))^{1/2} \right]^2 d\mu(x)\\

2411: &=&0.

2412: \end{eqnarray*}

2413: Hence the given score function satisfies~(\ref{c8.e1})

2414: by the smoothness of the log-likelihood.$\Box$

2415:
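{\it Remark (illustrative).} The limit step in the proof of lemma~\ref{s9.l1} can be sketched as follows, under the added assumptions (ours, for illustration only) that $\rho>0$, that $\rho_n\rightarrow\rho$ and $\sqrt{n}(\rho_n-\rho)\rightarrow\dot{\rho}$ pointwise, and that there is enough domination to pass the limit through the integral:
\begin{eqnarray*}
\sqrt{n}\left(\rho_n^{1/2}-\rho^{1/2}\right)&=&
\frac{\sqrt{n}(\rho_n-\rho)}{\rho_n^{1/2}+\rho^{1/2}}
\;\;\rightarrow\;\;\frac{1}{2}\frac{\dot{\rho}}{\rho^{1/2}}.
\end{eqnarray*}
The limiting integrand is then $\frac{1}{4}\left(\dot{\rho}/\rho-h\right)^2\rho$, which vanishes identically for the choice $h=\dot{\rho}/\rho$, the derivative of the log-density in the direction of the perturbation.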

2416: {\it Proof of lemma~\ref{s9.l2}.} Note that a consequence of the

2417: Donsker theorem for contiguous alternatives

2418: (theorem~3.10.12 of \cite{vw96})

2419: is that for any bounded $P$-Donsker class ${\cal F}$,

2420: $\|\pp_n-P\|_{\cal F}\weakpn 0$. Thus the proof of

2421: lemma~\ref{l2} can be reconstituted to yield

2422: that $\|\hat{A}_0\|_{[0,\tau]}$ is bounded in probability

2423: under $P_n$, since all of the classes of functions involved are bounded

2424: $P$-Donsker classes. We can similarly modify the proof of theorem~\ref{t1}

2425: to yield the desired results since, once again, the only classes of

2426: functions involved are bounded and $P$-Donsker. This is true,

2427: in particular, for the key class given in lemma~\ref{l.t1.1}, for

2428: any $k<\infty$. Thus $\|\hat{\psi}_0-\psi_0^{\ast}\|_{\infty}

2429: \weakpn 0$.$\Box$

2430:

2431: {\it Proof of theorem~\ref{s9.t1}.} The basic idea of the proof

2432: is to use the Donsker theorem for contiguous alternatives in

2433: combination with key arguments in the proof of theorem~\ref{t5}

2434: and the form of the score and information operators under

2435: model C2'. Pursuing this course, we obtain for any $(h_1,h_2)\in

2436: \re^{q+1}$,

2437: \begin{eqnarray*}

2438: (h_1,h_2')\hat{S}_1(\zeta)&=&\sqrt{n}\pp_n(1,1)\left[\left(

2439: \begin{array}{c}U_{\zeta,1}^{\tau}\\ U_{\zeta,2}^{\tau}

2440: \end{array}\right)(\psi_0^{\ast})\left(\begin{array}{c}

2441: h_1\\h_2\end{array}\right)\right.\\

2442: &&\left.-\left(

2443: \begin{array}{c}U_{\zeta_0,3}^{\tau}\\ U_{\zeta_0,4}^{\tau}

2444: \end{array}\right)(\psi_0^{\ast})\left([\sigma_{\ast}^{22}]^{-1}

2445: \sigma_{\ast}^{21}(\zeta)\left(\begin{array}{c}h_1\\h_2

2446: \end{array}\right)\right)\right]+o_{P_n}^{[a,b]}(1)\\

2447: &\equiv&\sqrt{n}\pp_n H_{\ast}(\zeta)+o_{P_n}^{[a,b]}(1),

2448: \end{eqnarray*}

2449: where $o_{P_n}^B(1)$ denotes a quantity going to zero in

2450: probability, under $P_n$, uniformly over the set $B$. Now

2451: the Donsker theorem for contiguous alternatives yields that

2452: the right-hand side converges to a tight, Gaussian process with

2453: covariance $P[H_{\ast}(\zeta_1)H_{\ast}(\zeta_2)]$, for all

2454: $\zeta_1,\zeta_2\in[a,b]$, and mean $P\left[H_{\ast}\left\{

2455: U_{\zeta_0,1}^{\tau}(\psi_0^{\ast})(\alpha_{\ast})+U_{\zeta_0,2}^{\tau}

2456: (\psi_0^{\ast})(\eta_{\ast})\right\}\right]$. Note that we

2457: only need to compute the moments under the null distribution~$P$.

2458: Careful calculations verify that this yields the desired results.$\Box$

2459:

2460: {\it Proof of corollary~\ref{c2}.} The limiting results under

2461: $P_n$ follow from theorem~\ref{s9.t1} and the continuous mapping

2462: theorem, provided we can show that

2463: \begin{eqnarray}

2464: \inf_{\zeta\in[a,b],v\in\re^{q+1}:\|v\|=1}v'V_{\ast}(\zeta)v&>&0.

2465: \label{e1.m9}

2466: \end{eqnarray}

2467: The limiting null distribution

2468: results will similarly follow from the

2469: fact that under the null distribution~$P$, $\nu_{\ast}(\zeta)=0$

2470: for all $\zeta\in[a,b]$. Note that in both the null and alternative

2471: settings, $V_{\ast}(\zeta)$ only depends on the null limiting

2472: distribution. It is sufficient to verify that $\sigma_{\psi_0^{\ast},

2473: \zeta_n}$ is one-to-one for all sequences $\zeta_n\in[a,b]$

2474: and $h_n\in{\cal H}_{\infty}$. Note that we can ignore any

2475: differences between $\zeta_0$ and $\zeta$ in calculating

2476: $\zeta\mapsto\sigma_{\psi_0^{\ast},\zeta}^{22}$ because of

2477: the non-identifiability of $\zeta$ under the null

hypothesis, i.e., $\zeta\mapsto\sigma_{\psi_0^{\ast},\zeta}^{22}$
is constant. Assume now that there exist sequences $\zeta_n\in[a,b]$

2480: and $h_n\in{\cal H}_{\infty}$ such that

2481: $\sigma_{\psi_0^{\ast},\zeta_n}h_n\rightarrow 0$. We will now

2482: show that this forces $h_n\rightarrow 0$. Without loss of

2483: generality, we can assume $\zeta_n\rightarrow\zeta_{\ast}$ and

2484: $h_n\rightarrow h$. Since the map $h\mapsto\sigma_{\psi_0^{\ast},\zeta}h$

2485: is continuous and since $\zeta\mapsto\sigma_{\psi_0^{\ast},\zeta}h$ is

2486: cadlag, we can further assume without loss of generality that

2487: either $\sigma_{\psi_0^{\ast},\zeta_{\ast}}h=0$ or that

2488: $\sigma_{\psi_0^{\ast},\zeta_{\ast}^-}h=0$ (the $\zeta_{\ast}^-$ denotes

2489: that we are converging to $\zeta_{\ast}$ from below). The arguments for either

2490: case are the same, so we will for brevity only give the proof for

2491: the first case.

2492:

2493: By the arguments surrounding expressions~(\ref{c12:e9}), (\ref{c12:e11})

2494: and~(\ref{c12:e12}), combined with the non-identifiability of $\zeta$

2495: under the null model, we obtain that expression~(\ref{c12:e12}) must now

2496: hold for all $t\in(0,\tau]$ but with $\zeta_{\ast}$ replacing $\zeta_0$.

In other words,

2498: $\tilde{Y}(t)(h_1\ind(Y>\zeta_{\ast})+

2499: h_2'Z_2(t)\ind(Y>\zeta_{\ast})+h_3'Z+h_4(t))=0$, almost surely,

2500: for all $t\in(0,\tau]$.

2501: Since var$[Z(t_4)|Y>\zeta_{\ast}]\geq\mbox{var}[Z(t_4)|Y>b]

2502: \times\pr{Y>b}/\pr{Y>\zeta_{\ast}}$

2503: is positive definite by condition~B4, we have $h_3=0$.

2504: We can similarly use~B4 to verify that var$[Z(t_3)|Y\leq\zeta_{\ast}]$

2505: is positive definite and thus $h_2=0$. Now $h_1=0$ and $h_4=0$

2506: easily follow. Hence $h\mapsto\sigma_{\psi_0^{\ast},\zeta}h$ is

2507: uniformly one-to-one in a manner

2508: which yields the conclusion~(\ref{e1.m9}).$\Box$

2509:
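{\it Remark (illustrative).} The variance bound invoked in the proof of corollary~\ref{c2} can be checked as follows. This is only a sketch, in which $W$ is our shorthand for $v'Z(t_4)$ with $v$ an arbitrary fixed vector, $c_{\ast}\equiv\mbox{E}[W|Y>\zeta_{\ast}]$, and we assume $\pr{Y>b}>0$. Since $\zeta_{\ast}\leq b$,
\begin{eqnarray*}
\mbox{var}[W|Y>\zeta_{\ast}]&=&
\frac{\mbox{E}\left[(W-c_{\ast})^2\ind(Y>\zeta_{\ast})\right]}{\pr{Y>\zeta_{\ast}}}\\
&\geq&\frac{\mbox{E}\left[(W-c_{\ast})^2\ind(Y>b)\right]}{\pr{Y>\zeta_{\ast}}}\\
&=&\mbox{E}\left[(W-c_{\ast})^2|Y>b\right]\frac{\pr{Y>b}}{\pr{Y>\zeta_{\ast}}}\\
&\geq&\mbox{var}[W|Y>b]\frac{\pr{Y>b}}{\pr{Y>\zeta_{\ast}}},
\end{eqnarray*}
where the last inequality uses the fact that the conditional mean minimizes the conditional mean squared deviation. Because $v$ is arbitrary, the bound holds in the positive semidefinite ordering.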

2510: {\it Proof of theorem~\ref{s9.t2}.} The results follow from

2511: arguments similar to those used in the proof of theorem~\ref{s9.t1},

2512: but based on the conditional multiplier central limit theorem

2513: for contiguous alternatives, theorem~\ref{s9.t2.t1} below.$\Box$

2514:

2515: \begin{theorem}\label{s9.t2.t1} (Conditional multiplier central

2516: limit theorem for contiguous alternatives) Let ${\cal F}$ be

2517: a $P$-Donsker class of measurable functions, and let $P_n$ satisfy

2518: \[\int\left[\sqrt{n}(dP_n^{1/2}-dP^{1/2})-\frac{1}{2}hdP^{1/2}

\right]^{2}\rightarrow 0\label{s9.t2.t1.e1},\]

2520: as $n\rightarrow\infty$, for some real valued, measurable

2521: function $h$. Also assume

2522: $\lim_{M\rightarrow\infty}$ $\limsup_{n\rightarrow\infty}

2523: P_n(f-Pf)^2\ind\{|f-Pf|>M\}=0$

2524: for all $f\in{\cal F}$, and that the multipliers in

2525: the weighted bootstrap, $\kappa_1,\ldots,\kappa_n$, are i.i.d.

2526: and independent of the data, with mean $0<\mu_{\kappa}<\infty$

2527: and variance $0<\sigma_{\kappa}^2<\infty$, and with

2528: $\int_0^{\infty}\sqrt{P(\kappa_1>u)}du<\infty$.

Then $\sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})(\pp_n^{\circ}-\pp_n)

2530: \weakpnboot\mathbb{G}$ in $\ell^{\infty}({\cal F})$,

2531: where $\mathbb{G}$ is a tight, mean zero Brownian bridge process.

2532: \end{theorem}

2533:

2534: {\it Proof.} The detailed proof can be found in chapter~11

2535: of Kosorok (To appear). We now present a synopsis of the proof.

2536: Let $\tilde{\kappa}_i\equiv\sigma_{\kappa}^{-1}

2537: (\kappa_i-\mu_{\kappa})$, $i=1,\ldots,n$, and note that

2538: \begin{eqnarray}

2539: \label{s9.t2.t1.e3}&&\\

\sqrt{n}(\pp_n^{\circ}-\pp_n)&=&n^{-1/2}\sum_{i=1}^n(\kappa_i/\bar{\kappa}

2541: -1)\Delta_{X_i}\;\;=\;\;

2542: n^{-1/2}\sum_{i=1}^n(\kappa_i/\bar{\kappa}-1)(\Delta_{X_i}-P)

2543: \nonumber\\

2544: &=&\frac{\sigma_{\kappa}}{\mu_{\kappa}}n^{-1/2}\sum_{i=1}^n

2545: \tilde{\kappa}_i(\Delta_{X_i}-P)+

2546: \left(\frac{\sigma_{\kappa}}{\bar{\kappa}}-\frac{\sigma_{\kappa}}

2547: {\mu_{\kappa}}\right)n^{-1/2}\sum_{i=1}^n\tilde{\kappa}_i(\Delta_{X_i}-P)

2548: \nonumber\\

2549: &&+\left(\frac{\mu_{\kappa}}{\bar{\kappa}}-1\right)n^{-1/2}

2550: \sum_{i=1}^n(\Delta_{X_i}-P),\nonumber

2551: \end{eqnarray}

2552: where $\Delta_{X_i}$ is the Dirac measure of the observation $X_i$.

2553: Since ${\cal F}$ is $P$-Donsker, we also have that

2554: $\dot{\cal F}\equiv\{f-Pf:f\in{\cal F}\}$ is $P$-Donsker. Thus

2555: by the unconditional multiplier central limit theorem,

we have that $\tilde{\kappa}\cdot\dot{\cal F}$ is also $P$-Donsker. Now, by

the fact that $\|P(f-Pf)\|_{\cal F}=0$ (trivially) combined with

2558: the central limit theorem under contiguous alternatives, we have that

2559: both $f\mapsto n^{-1/2}\sum_{i=1}^n\tilde{\kappa}_i(\Delta_{X_i}-P)f

2560: \weakpn\mathbb{G}f$ and $f\mapsto n^{-1/2}\sum_{i=1}^n(\Delta_{X_i}

-P)f\weakpn\mathbb{G}f+P[(f-Pf)h]$ in $\ell^{\infty}({\cal F})$.

2562: Thus the last two terms in~(\ref{s9.t2.t1.e3})$\;\weakpn 0$, and hence

2563: $\sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})(\pp_n^{\circ}-\pp_n)

2564: \weakpn\mathbb{G}$ in $\ell^{\infty}({\cal F})$.

2565: This now implies the unconditional asymptotic

2566: tightness and desired asymptotic measurability of $\sqrt{n}

2567: (\mu_{\kappa}/\sigma_{\kappa})(\pp_n^{\circ}-\pp_n)$.

2568: Fairly standard arguments can now be used along with the given pointwise

2569: uniform square integrability condition to verify that

2570: $\sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})(\pp_n^{\circ}-\pp_n)$

2571: applied to any finite dimensional collection $f_1,\ldots,f_m\in{\cal F}$

2572: converges under $P_n$ in distribution, conditional on the data,

2573: to the appropriate limiting Gaussian process. This now implies

2574: $\sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})(\pp_n^{\circ}-\pp_n)\,

2575: \weakpnboot\mathbb{G}$.$\Box$

2576:
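{\it Remark (illustrative).} The three-term decomposition in~(\ref{s9.t2.t1.e3}) rests on an elementary identity for the normalized weights, which we spell out here as a sketch. With $\tilde{\kappa}_i=\sigma_{\kappa}^{-1}(\kappa_i-\mu_{\kappa})$ as defined in the proof,
\begin{eqnarray*}
\frac{\kappa_i}{\bar{\kappa}}-1&=&
\frac{\kappa_i-\mu_{\kappa}}{\bar{\kappa}}+\left(\frac{\mu_{\kappa}}{\bar{\kappa}}-1\right)
\;\;=\;\;\frac{\sigma_{\kappa}}{\bar{\kappa}}\tilde{\kappa}_i
+\left(\frac{\mu_{\kappa}}{\bar{\kappa}}-1\right)\\
&=&\frac{\sigma_{\kappa}}{\mu_{\kappa}}\tilde{\kappa}_i
+\left(\frac{\sigma_{\kappa}}{\bar{\kappa}}-\frac{\sigma_{\kappa}}{\mu_{\kappa}}\right)\tilde{\kappa}_i
+\left(\frac{\mu_{\kappa}}{\bar{\kappa}}-1\right).
\end{eqnarray*}
Centering each $\Delta_{X_i}$ at $P$ costs nothing because $\sum_{i=1}^n(\kappa_i/\bar{\kappa}-1)=0$. Since $\bar{\kappa}\rightarrow\mu_{\kappa}$ almost surely, the coefficients of the second and third terms vanish in the limit, which is why only the first term contributes.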

2577: {\it Proof of corollary~\ref{c3}.} Assume at first

2578: that $\tilde{M}_n$ is a fixed

2579: number $\tilde{M}<\infty$. Theorem~\ref{s9.t2} now yields that the

2580: collection $\{\hat{S}_{1,1}^{\circ}-\hat{S}_1,\ldots,

2581: \hat{S}_{1,\tilde{M}_n}^{\circ}-\hat{S}_1\}$ converges jointly, conditionally

2582: on the data, to $\tilde{M}$ i.i.d. copies of $\mathbb{Z}_{\ast}$.

2583: Thus $\hat{V}_n$ converges weakly to the sample covariance

2584: process (divided by $\tilde{M}_n$ instead of $\tilde{M}_n-1$)

2585: of an i.i.d. sample of $\tilde{M}_n$ copies of

2586: $\mathbb{Z}_{\ast}$. The same result holds true if we allow

2587: $\tilde{M}_n$ to go to~$\infty$ slowly enough. Since the

2588: Gaussian processes involved are tight, $\hat{V}_n$ will thus

2589: be consistent for $\Sigma_{\ast}$, uniformly over $\zeta\in[a,b]$.

2590: Similar arguments yield pointwise consistency of $\hat{\mathbb{F}}$

2591: and $\tilde{\mathbb{F}}$ at continuity points of

2592: $\hat{\mathbb{T}}_{\ast}$ and $\tilde{\mathbb{T}}_{\ast}$.

2593: Since it is not hard to verify that

2594: both $\hat{\mathbb{T}}_{\ast}$ and $\tilde{\mathbb{T}}_{\ast}$

2595: have continuous distributions, the pointwise consistency extends

2596: to the desired uniform consistency.$\Box$

2597:

2598: \section*{Acknowledgments}

2599: The authors thank Editor Morris Eaton, an associate editor, and two

2600: referees for their extremely

2601: careful review and helpful suggestions that led

2602: to an improved paper.

2603:

2604: \begin{thebibliography}{9}

2605:

2606: \bibitem{a01}

2607: {\sc Andrews, D. W. K.} (2001). Testing when a parameter is on the

2608: boundary of the maintained hypothesis. {\it Econometrica} {\bf

69}, 683--734.

2610:

2611: \bibitem{ap94}

{\sc Andrews, D. W. K., and Ploberger, W.} (1994). Optimal

2613: tests when a nuisance parameter is present only under the

2614: alternative. {\em Econometrica} {\bf 62}, 1383--1414.

2615:

2616: \bibitem{bn04}{\sc Bagdonavi\v{c}ius, V., and Nikulin, M.} (2004).

2617: Statistical modeling in survival analysis and its influence on the

2618: duration analysis. {\it Advances in survival analysis}, 411--429,

2619: {\it Handbook of Statistics, 23}. Elsevier, Amsterdam.

2620:

2621: \bibitem{bkrw98}

2622: {\sc Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner,

2623: J. A.} (1998). {\it Efficient and Adaptive Estimation for

2624: Semiparametric Models}. Springer-Verlag, New York.

2625:

2626: \bibitem{bd81}{\sc Bickel, P. J., and Doksum, K. A.} (1981). An analysis of

2627: transformations revisited. {\em Journal of the American

2628: Statistical Association} {\bf 76}, 296--311.

2629:

2630: \bibitem{br97}{\sc Bickel, P. J., and Ritov, Y.} (1997). Local asymptotic

2631: normality of ranks and covariates in transformation models.

2632: {\em Festschrift for Lucien Le Cam: Research papers in probability

2633: and statistics}, 43--54.

2634:

2635: \bibitem{bc64}{\sc Box, G. E. P., and Cox, D. R.} (1964).

2636: An analysis of transformations. (With discussion)

2637: {\em Journal of the Royal Statistical Society}, Series B

2638: {\bf 26}, 211--252.

2639:

2640: \bibitem{bc82}{\sc Box, G. E. P., and Cox, D. R.} (1982).

2641: An analysis of transformations revisited, rebutted.

2642: {\em Journal of the American Statistical Association}

2643: {\bf 77}, 209--210.

2644:

2645: \bibitem{c89}{\sc Chappell, R.} (1989). Fitting bent lines to data, with

2646: applications to allometry. {\it Journal of Theoretical Biology}

{\bf 138}, 235--256.

2648:

2649: \bibitem{cwy95}{\sc Cheng, S. C., Wei, L. J., and Ying, Z.} (1995). Analysis

2650: of transformation models with censored data. {\em Biometrika} {\bf 82},

2651: 835--845.

2652:

2653: \bibitem{cwy97}{\sc Cheng, S. C., Wei, L. J., and Ying, Z.} (1997). Predicting

2654: survival probabilities with semiparametric transformation models. {\it Journal

2655: of the American Statistical Association} {\bf 92}, 227--235.

2656:

2657: \bibitem{dd88}{\sc Dabrowska, D.M. and Doksum, K.A.} (1988). Estimation

2658: and Testing in the Two-sample Generalized Odds-Rate Model. {\it

2659: Journal of the American Statistical Association} {\bf 83}, 1--23.

2660:

2661: \bibitem{d87}{\sc Davies, R. B.} (1987). Hypothesis testing when a nuisance

2662: parameter is present only under the alternative. {\em Biometrika}

2663: {\bf 74}, 33--43.

2664:

2665: \bibitem{fyw98}

2666: {\sc Fine, J. P., Ying, Z., and Wei, L. J.} (1998). On the linear

2667: transformation model for censored data. {\em Biometrika} {\bf 85}, 980--986.

2668:

2669: \bibitem{ih81}{\sc Ibragimov, I. A., and Has'minskii, R. Z.} (1981). {\em

2670: Statistical estimation: Asymptotical theory}. Springer, New York.

2671:

2672: \bibitem{kta}{\sc Kosorok, M. R.} (To appear). {\em Introduction to Empirical

2673: Processes and Semiparametric Inference}. Springer, New York.

2674:

2675: \bibitem{klf04}{\sc Kosorok, M. R., Lee, B. L. and Fine, J. P.} (2004). Robust

2676: Inference for Univariate Proportional Hazards Frailty Regression

Models. {\it The Annals of Statistics} {\bf 32}, 1448--1491.

2678:

2679: \bibitem{lsl90}{\sc Liang, K.-Y., Self, S. G., and Liu, X.} (1990). The Cox

2680: proportional hazards model with change point: An epidemiologic

2681: application. {\em Biometrics} {\bf 46}, 783--793.

2682:

2683: \bibitem{ly93}{\sc Lin, D. Y. and Ying, Z.} (1993). Cox regression with

2684: incomplete covariate measurements. {\it Journal of the American Statistical

2685: Association} {\bf 88}, 1341--1349.

2686:

2687: \bibitem{lb97}{\sc Luo, X. and Boyett, J. M.} (1997). Estimation of a

threshold parameter in Cox regression. {\it Communications in
Statistics--Theory and Methods} {\bf 26}, 2329--2346.

2690:

2691: \bibitem{ltc97}{\sc Luo, X., Turnbull, B.W. and Clark, L.C.} (1997).

Likelihood ratio tests for a changepoint with survival data. {\it
Biometrika} {\bf 84}, 555--565.

2694:

2695: \bibitem{mrv97}

2696: {\sc Murphy, S. A., Rossini, A. J., and van der Vaart, A. W.} (1997).

2697: Maximum likelihood estimation in the

2698: proportional odds model. {\it Journal of the American Statistical

2699: Association} {\bf 92}, 968--976.

2700:

2701: \bibitem{p98}{\sc Parner, E.} (1998). Asymptotic theory for the correlated

2702: gamma-frailty model. {\em Annals of Statistics} {\bf 26}, 183--214.

2703:

\bibitem{p82}{\sc Pettitt, A. N.} (1982). Inference for the linear model using

2705: a likelihood based on ranks. {\em Journal of the Royal Statistical

2706: Society}, Series B {\bf 44}, 234--243.

2707:

\bibitem{p84}{\sc Pettitt, A. N.} (1984). Proportional odds models for survival

2709: data and estimates using ranks. {\em Applied Statistics} {\bf 33}, 169--175.

2710:

2711: \bibitem{pr94}{\sc Politis, D. N., and Romano, J. P.} (1994). Large sample

2712: confidence regions based on subsamples under minimal assumptions.

2713: {\em Annals of Statistics} {\bf 22}, 2031--2050.

2714:

\bibitem{p03}{\sc Pons, O.} (2003). Estimation in a Cox regression model

2716: with a change-point according to a threshold in a covariate. {\it

2717: The Annals of Statistics} {\bf 31}, 442--463.

2718:

2719: \bibitem{stg98}

2720: {\sc Scharfstein, D. O., Tsiatis, A. A., and Gilbert, P. B.} (1998).

2721: Semiparametric efficient estimation in the generalized odds-rate class

2722: of regression models for right-censored time-to-event data.

2723: {\em Lifetime Data Analysis} {\bf 4}, 355--391.

2724:

2725: \bibitem{s98}{\sc Shen, X.} (1998). Proportional odds regression and sieve

2726: maximum likelihood estimation. {\em Biometrika} {\bf 85}, 165--177.

2727:

2728: \bibitem{sv04}{\sc Slud, E. V., and Vonta, F.} (2004). Consistency of the

2729: NPML estimator in the right-censored transformation model.

2730: {\em Scandinavian Journal of Statistics} {\bf 31}, 21--41.

2731:

2732: \bibitem{v98}{\sc van der Vaart, A. W.} (1998). {\em Asymptotic Statistics}.

2733: Cambridge University Press, Cambridge.

2734:

2735: \bibitem{vw96}{\sc van der Vaart, A. W., and Wellner, J. A.} (1996). {\it

2736: Weak Convergence and Empirical Processes: With Applications to

2737: Statistics.} Springer, New York.

2738:

2739: \bibitem{vw00}

2740: {\sc van der Vaart, A. W., and Wellner, J. A.} (2000). Preservation

2741: theorems for Glivenko-Cantelli and Uniform Glivenko-Cantelli classes.

{\em High Dimensional Probability II}, 113--132. Birkh\"auser, Boston.

2743:

2744: \end{thebibliography}

2745:

2746: \end{document}

2747: