1: %First revision of annals paper on change-point transformation models
2: \documentclass[11pt]{article}
3:
4: \RequirePackage[OT1]{fontenc}
5:
6: \RequirePackage[aos,amsthm,amsmath,natbib]{imsarttech}
7: \usepackage{amsfonts,graphicx}
8:
9: \begin{document}
10:
11: \begin{frontmatter}
12:
13: \title{Further details on inference under right censoring
14: for transformation models
15: with a change-point based on a covariate threshold}
16: \runtitle{Change-point transformation models}
17: \author{Michael R. Kosorok\thanksref{t1}}
18: \and
19: \author{Rui Song\thanksref{t1}}
20: \affiliation{University of Wisconsin-Madison}
21: \address{Michael R. Kosorok\\ Departments of Statistics\\
22: and Biostatistics \& Medical Informatics\\
23: 1300 University Avenue\\
24: Madison, WI 53706\\USA\\Email: kosorok@biostat.wisc.edu}
25: \address{Rui Song\\ Department of Statistics\\
26: 1300 University Avenue\\
27: Madison, WI 53706\\USA\\Email: rsong@stat.wisc.edu}
28: \runauthor{M. R. Kosorok and R. Song}
29: \thankstext{t1}{Supported in part by Grant CA075142 from the
30: National Cancer Institute.}
31:
32: \begin{abstract}
33: We consider linear transformation models applied to right censored survival
34: data with a change-point in the regression coefficient based on a covariate
35: threshold. We establish consistency and weak convergence of the
36: nonparametric maximum likelihood estimators. The change-point parameter
37: is shown to be $\,n$-consistent, while the remaining parameters are shown to
38: have the expected root-$n$ consistency. We show that the
39: procedure is adaptive in the sense that the non-threshold parameters are
40: estimable with the same precision as if the true threshold value were known.
41: We also develop Monte-Carlo methods of inference for model parameters
42: and score tests for the existence of a change-point.
43: A key difficulty here is that some of the model
44: parameters are not identifiable under the null hypothesis of no change-point.
45: Simulation studies establish the validity of the proposed
46: score tests for finite sample sizes.
47: \end{abstract}
48:
49: \begin{keyword}[class=AMS]
50: \kwd[Primary ]{62N01}
51: \kwd{62F05}
52: \kwd[; secondary ]{62G20}
53: \kwd{62G10.}
54: \end{keyword}
55:
56: \begin{keyword}
57: \kwd{Change-point models}
58: \kwd{Empirical processes}
59: \kwd{Nonparametric maximum likelihood}
60: \kwd{Proportional hazards model}
61: \kwd{Proportional odds model}
62: \kwd{Right censoring}
63: \kwd{Semiparametric efficiency}
64: \kwd{Transformation models.}
65: \end{keyword}
66:
67: \end{frontmatter}
68:
69: \newtheorem{theorem}{\indent \sc Theorem}
70: \newtheorem{corollary}{\indent \sc Corollary}
71: \newtheorem{lemma}{\indent \sc Lemma}
72: \newtheorem{proposition}{\indent \sc Proposition}
73: \newtheorem{remark}{\indent \sc Remark}
74: \newcommand{\phif}{\textsc{igf}}
75: \newcommand{\sign}{\mbox{sign}}
76: \newcommand{\phgf}{\textsc{gf}}
77: \newcommand{\fix}{$\textsc{gf}_0$}
78: \newcommand{\mb}[1]{\mbox{\bf #1}}
79: \newcommand{\Exp}[1]{\mbox{E}\left[#1\right]}
80: \newcommand{\pr}[1]{\mbox{P}\left[#1\right]}
81: \newcommand{\pp}[0]{\mathbb{P}}
82: \newcommand{\ee}[0]{\mbox{E}}
83: \newcommand{\re}[0]{\mathbb{R}}
84: \newcommand{\argmax}[0]{\mbox{argmax}}
85: \newcommand{\argmin}[0]{\mbox{argmin}}
86: \newcommand{\ind}[0]{\mbox{\Large\bf 1}}
87: \newcommand{\narrow}{\stackrel{n\rightarrow\infty}{\longrightarrow}}
88: \newcommand{\weakpn}{\stackrel{P_n}{\leadsto}}
89: \newcommand{\weakpnboot}{\mbox{\raisebox{-1.5ex}{$\stackrel
90: {\mbox{\scriptsize $P_n$}}{\stackrel{\mbox{\normalsize $\leadsto$}}
91: {\mbox{\normalsize $\circ$}}}$}}\,}
92: \newcommand{\ol}[1]{\overline{#1}}
93: \newcommand{\avgse}[1] { \bar{\hat{\sigma}}_{#1} }
94: \newcommand{\mcse}[1] { \sigma^{*}_{#1} }
95: \newcommand{\po}{\textsc{po}}
96: \newcommand{\ph}{\textsc{ph}}
97:
98: \section{Introduction} The linear transformation model states that
99: a continuous outcome $U$, given a $d$-dimensional covariate vector $Z$,
100: has the form
101: \begin{eqnarray}
102: H(U)= - \beta ^\prime Z + \varepsilon, \label{s1.e1}
103: \end{eqnarray}
104: where $H$ is an increasing, unknown transformation function, $\beta\in\re^d$
105: are the unknown regression parameters of interest, and $\varepsilon$ has a
106: known distribution $F$. This model is readily applied to a failure time $T$
107: by letting $U=\log T$ and $H(u)=\log A(e^u)$, where $A$ is an
108: unspecified integrated baseline hazard. Setting
109: $F(s)=1-\exp(-e^s)$ results in the Cox model, while setting
110: $F(s)=e^s/(1+e^s)$ results in the proportional odds model. More
111: generally, the transformation model for a survival time $T$ conditionally
112: on a time-dependent covariate $\tilde{Z}(t)=\{Z(s),0\leq s\leq t\}$,
113: takes the form
114: \begin{eqnarray}
115: \pr{T>t|\tilde{Z}(t)}=S_Z(t)&\equiv&\Lambda\left(\int_0^te^{\beta'Z(s)}
116: dA(s)\right),\label{new.e1}
117: \end{eqnarray}
118: where $\Lambda$ is a known decreasing function
119: with $\Lambda(0)=1$. The model~(\ref{new.e1})
120: becomes model~(\ref{s1.e1}) when the covariates are time-independent
121: and $F(s)=1-\Lambda(e^s)$.
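
The reduction can be checked directly. The following short derivation is an illustrative sketch only, and assumes that $Z$ is time-independent, that $A$ is continuous and strictly increasing, and that $e^{s-\beta'Z}$ lies in the range of $A$. With $U=\log T$ and $H(u)=\log A(e^u)$, we have $\varepsilon=H(U)+\beta'Z=\log A(T)+\beta'Z$, and hence
\begin{eqnarray*}
\pr{\varepsilon\leq s|Z}&=&\pr{A(T)e^{\beta'Z}\leq e^s|Z}\\
&=&1-\Lambda\left(e^{\beta'Z}e^{s-\beta'Z}\right)\;=\;1-\Lambda(e^s).
\end{eqnarray*}
In particular, $\Lambda(u)=e^{-u}$ gives $F(s)=1-\exp(-e^s)$, and
$\Lambda(u)=(1+u)^{-1}$ gives $F(s)=e^s/(1+e^s)$, in agreement with
the Cox and proportional odds choices of $F$ given above.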
122:
123: In data analysis, the assumption of linearity of the regression
124: effect in~(\ref{new.e1}) is not always satisfied over the whole
125: range of the covariate, and the fit may be improved with a
126: two-phase transformation model having a change-point at an
127: unknown threshold of a one-dimensional covariate $Y$. Let
128: $Z=(Z_1,Z_2)$, where $Z_1$ and $Z_2$ are possibly time-dependent
129: covariates in $\re^p$ and $\re^q$, respectively, where $p+q=d$
130: and $q\geq 1$. The new model is obtained by replacing $\beta'Z(s)$
131: in~(\ref{new.e1}) with
132: \begin{eqnarray}
133: r_{\xi}(s;Z,Y)\equiv\beta'Z(s) + [\alpha + \eta'Z_2(s)]\ind\{Y >
134: \zeta\},\label{new.e2}
135: \end{eqnarray}
136: where $\alpha$ is a scalar, $\eta\in\re^q$, $\ind\{B\}$ is the
137: indicator of $B$, and $\xi$
138: denotes the collected parameters $(\alpha,\beta,\eta,\zeta)$.
139: We also require $Y$ to be time-independent but allow it to possibly
140: be one of the covariates in $Z(t)$. The overall goal of this paper
141: is to develop methods of inference for this model applied to
142: right censored data.
143:
144: We note that for the special case when $\alpha=0$ and
145: $\Lambda(t)=e^{-t}$, the model~(\ref{new.e2})
146: becomes the Cox model considered by \cite{p03} under a slightly
147: different parameterization. Permitting a nonzero $\alpha$
148: allows the possibility of a ``bent-line''
149: covariate effect. Suppose, for example, that
150: $Z_2$ is one-dimensional and time-independent,
151: while $Z_1\in\re^{d-1}$ may be time-dependent.
152: If we set $Y=Z_2$ and $\beta=(\beta_1',\beta_2')'$, where
153: $\beta_1\in\re^{d-1}$ and $\beta_2\in\re$, the model~(\ref{new.e2})
154: becomes $r_{\xi}(s;Z,Y)=\beta_1'Z_1(s)+\beta_2Z_2
155: +(\alpha+\eta Z_2)\ind\{Z_2>\zeta\}$.
156: When $\alpha=-\eta\zeta$, the covariate effect for $Z_2$
157: consists of two connected linear segments. In many biological settings,
158: such a bent-line effect is realistic and can be much easier
159: to interpret than a quadratic or more complex nonlinear effect \cite{c89}.
160: Hence including the intercept term $\alpha$ is useful
161: for applications.
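
As a quick check of the bent-line claim, substituting $\alpha=-\eta\zeta$
into the $Z_2$ part of the display above gives
\begin{eqnarray*}
\beta_2Z_2+(\alpha+\eta Z_2)\ind\{Z_2>\zeta\}
&=&\beta_2Z_2+\eta(Z_2-\zeta)\ind\{Z_2>\zeta\},
\end{eqnarray*}
which is continuous in $Z_2$, with slope $\beta_2$ for $Z_2\leq\zeta$ and
slope $\beta_2+\eta$ for $Z_2>\zeta$. For any other value of $\alpha$,
the effect has a jump of size $\alpha+\eta\zeta$ at $Z_2=\zeta$.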
162:
163: Linear transformation models of the form~(\ref{s1.e1}) have been
164: widely used and studied (see, for example, \cite{bc64,bd81,bc82,p82,dd88,
165: cwy95,cwy97,fyw98,bn04}). Efficient methods of estimation in the
166: uncensored setting were rigorously studied by \cite{br97},
167: among others. The model~(\ref{new.e1}) for right-censored data has
168: also been studied rigorously for a variety of specific choices of
169: $\Lambda$ \cite{p84,mrv97,stg98,s98}; for
170: general but known $\Lambda$ \cite{sv04}; and for certain
171: parameterized families of $\Lambda$ \cite{klf04}.
172:
173: Change-point models have also been studied extensively and have
174: proven to be popular in clinical research. Several researchers have
175: considered a nonregular Cox model involving a two-phase regression on
176: time-dependent covariates, with a change-point at an unknown time
177: \cite{lsl90,lb97,ltc97}. As mentioned above, \cite{p03}
178: considered the Cox model with a
179: change-point at an unknown threshold of a covariate.
180: These authors studied the maximum partial likelihood estimators of
181: the parameters and the estimator of the baseline hazard function.
182: They show that the estimator of the threshold parameter
183: is $n$-consistent, while the regression parameters are
184: $\sqrt{n}$-consistent. This happens because
185: the likelihood function is not differentiable with respect to the
186: threshold parameter, and hence the usual Taylor expansion
187: is not available. In this paper, we focus on
188: the covariate threshold setting. While time threshold models are
189: also interesting, we will not pursue them further in this paper
190: because the underlying techniques for estimation and
191: inference are quite distinct from the covariate threshold setting.
192:
193: The contribution of our paper builds on \cite{p03} in three
194: important ways. Firstly, we extend to general transformation models.
195: This results in a significant increase in complexity over the Cox
196: model since estimation of the baseline hazard can no longer be
197: avoided through the use of the partial-profile likelihood. Secondly,
198: we study nonparametric maximum likelihood inference for all model parameters.
199: As part of this, we show that the estimation procedure is adaptive in the
200: sense that the non-threshold parameters---including the infinite-dimensional
201: parameter~$A$---are estimable with the same precision
202: as if the true threshold parameter were known. Thirdly, we develop hypothesis
203: tests for the existence of a change-point. This is quite challenging
204: since some of the model parameters are no longer identifiable under
205: the null hypothesis of no change-point. \cite{a01} considers
206: similar nonstandard testing problems when the model is fully
207: parametric and establishes asymptotic null and local alternative
208: distributions of a number of likelihood-based test procedures.
209: Unfortunately, Andrews' results are not directly
210: applicable to our setting because of the presence of an infinite
211: dimensional nuisance parameter, the baseline integrated hazard $A$,
212: and new methods are required.
213:
214: The next section, section~2, presents the data and model
215: assumptions. The nonparametric maximum log-likelihood estimation
216: (NPMLE) procedure is
217: presented in section~3. In section~4, we establish the consistency
218: of the estimators. Score and information operators of the regular
219: parameters are given in section~5. Results on the convergence rates
220: of the estimators are established in section~6. Section~7 presents
221: weak convergence results for the estimators, including the
222: asymptotic distribution of the change-point estimator and
223: the asymptotic normality of the other parameters. This section
224: also establishes the adaptive semiparametric efficiency
225: mentioned above. Monte Carlo inference for the parameters is discussed
226: in section~8. Methods for testing the existence of a change-point are
227: then presented in section~9. A brief discussion on implementation
228: and a small simulation study evaluating the moderate
229: sample size performance of the proposed change-point tests are
230: given in section~10. Proofs are given in section~11.
231:
232: \section{The data set-up and model assumptions}
The data $X_i=(V_i,\delta_i,$ $Z_i,Y_i)$, $i=1,\ldots,n$, consist of $n$
i.i.d. realizations of $X = (V,\delta,Z,Y)$, where $V = T \land C$, $\delta
= \ind\{T \le C\}$, and $C$ is a right censoring time. The analysis is
236: restricted to the interval $[0, \tau]$, where $\tau < \infty$.
237: The covariate $Y\in\re$ and $Z \equiv \{Z(t), t \in [0, \tau] \}$ is assumed
238: to be a caglad (left-continuous with right-hand limits) process with
239: $Z(t)=(Z_1'(t),Z_2'(t))'\in\re^p\times\re^q$, for all $t\in [0, \tau]$,
240: where $q\geq 1$ but $p=0$ is allowed.
241:
242: We assume that conditionally on $Z$ and $Y$, the survival function
243: at time $t$ has the form:
244: \begin{eqnarray}
245: S_{Z,Y}(t)\equiv\Lambda \left(\int_0^t e^{r_{\xi}(u;Z,Y)}dA(u)\right),
246: \label{s2.e1}
247: \end{eqnarray}
248: where $\Lambda$ is a known, thrice differentiable
249: decreasing function with $\Lambda(0)=1$,
250: $r_{\xi}(s;Z,Y)$ is as defined in~(\ref{new.e2}), and $A$ is an
251: unknown increasing function restricted to $[0,\tau]$.
252:
253: Let $G\equiv-\log\Lambda$,
254: and define the derivatives
255: $\dot{\Lambda}\equiv\partial\Lambda(t)/(\partial t)$,
256: $\ddot{\Lambda}\equiv\partial\dot{\Lambda}(t)/(\partial t)$,
257: $\dot G \equiv \partial G(t) /(\partial t)$,
258: $\ddot G \equiv \partial \dot G (t) /(\partial t)$, and
$\dddot G \equiv \partial\ddot G(t) /(\partial t)$.
260: We also define the collected parameters
261: $\gamma\equiv(\alpha,\eta,\beta)$, $\psi\equiv(\gamma, A)$, and
262: $\theta \equiv(\psi, \zeta)$. We use $P$ to denote the
263: true probability measure, while the true parameter values are
264: indicated with a subscript 0.
265:
266: We now make the following additional assumptions:
267: \begin{itemize}
268: \item[A1]: $P[C=0]=0$, $P[C \ge \tau | Z,Y] = P[C = \tau | Z,Y] > 0$
269: almost surely, and censoring is independent of $T$ given $(Z,Y)$
270: and uninformative.
271: \item[A2]: The total variation of $Z(\cdot)$ on $[0, \tau]$ is
272: $\le m_0<\infty$ almost surely.
273: \item[B1]: $\zeta_0\in(a,b)$, for some known $-\infty<a<b<\infty$
274: with $P[Y<a]>0$ and $P[Y>b]>0$.
275: \item[B2]: For some neighborhood $\tilde{V}(\zeta_0)$ of $\zeta_0$:
276: \begin{itemize}
277: \item[(i)] the density of $Y$, $\tilde{h}$, exists and
278: is strictly positive, bounded and continuous for all
279: $y\in\tilde{V}(\zeta_0)$; and
280: \item[(ii)] the conditional law of $(C,Z)$ given $Y=y$,
281: ${\cal L}_y$, is left-continuous with right-hand limits
282: over $\tilde{V}(\zeta_0)$.
283: \end{itemize}
284: \item[B3]: For some $t_1,t_2\in(0,\tau]$, both var$[Z(t_1)|Y=\zeta_0]$
285: and var$[Z(t_2)|Y=\zeta_0+]$ are positive definite.
286: \item[B4]: For some $t_3,t_4\in(0,\tau]$, both
287: var$[Z(t_3)|Y<a]$ and var$[Z(t_4)|Y>b]$ are positive definite.
288: \item[C1]: $\alpha_0\in\Upsilon\subset\re$, $\beta_0\in B_1\subset\re^d$,
289: $\eta_0\in B_2\subset\re^q$, where $d\geq q\geq 1$, and $\Upsilon$, $B_1$
290: and $B_2$ are open, convex, bounded and known.
291: \item[C2]: Either $\alpha_0\neq 0$ or $\eta_0\neq 0$.
292: \item[C3]: $A_0\in{\cal A}$, where ${\cal A}$ is the set of all
293: increasing functions $A:[0,\tau]\mapsto[0,\infty)$ with
294: $A(0)=0$ and $A(\tau)<\infty$; and $A_0$ has derivative $a_0$
295: satisfying $0<a_0(t)<\infty$ for all $t\in[0,\tau]$.
296: \item[D1]: $G:[0,\infty)\mapsto[0,\infty)$ is thrice continuously
297: differentiable, with $G(0)=0$, and, for each $u\in[0,\infty)$,
298: $0<\dot{G}(u),\ddot{\Lambda}(u)<\infty$ and
299: $\sup_{s\in[0,u]}|\dddot G(s)|<\infty$.
300: \item[D2]: For some $c_0>0$, both
301: $\sup_{u\geq 0}|u^{c_0}\Lambda(u)|<\infty$ and
302: $\sup_{u\geq 0}|u^{1+c_0}\dot{\Lambda}(u)|<\infty$.
303: \end{itemize}
304:
305: Conditions~A1, A2, C1 and~C3 are commonly used for NPMLE
306: consistency and identifiability in right-censored
307: transformation models, while conditions~B1, B2, B3 and~C2 are
308: needed for change-point identifiability. As pointed out by a
309: referee, the use of a time-dependent covariate will require
310: that $Z_i(V_j)$ be observed for each individual $i$ and for
every $j$ such that $\delta_j=1$ and $V_j\leq V_i$. While this
312: is often assumed in theoretical contexts, it can be unrealistic
313: in practice, where missing values of $Z_i(t)$ are not
314: unusual (see \cite{ly93}). Frequently, data analysts will simply
315: carry the last observation of $Z_i(t)$ forward to avoid the missingness
316: problem. Unfortunately, this simple solution is not necessarily valid.
317: However, addressing this
318: issue thoroughly is beyond the scope of this paper, and we will
319: only mention it again briefly in section~9,
320: where we develop a test of the null hypothesis that there is no
321: change-point ($H_0:\alpha_0=0$ and $\eta_0=0$). Also in section~9,
322: we will relax condition~C2
323: to allow for a sequence of contiguous alternative hypotheses that
324: includes $H_0$. Condition~B2(ii) is also needed to obtain weak convergence
325: for the NPMLE of $\zeta_0$. The continuity requirements
326: at each point $y$ can be restated in the following way:
327: ${\cal L}_{\zeta}$ converges
328: weakly to ${\cal L}_y$, as $\zeta\uparrow y$; and
329: ${\cal L}_{\zeta}$ converges weakly to ${\cal L}_{y+}$,
330: as $\zeta\downarrow y$, for some law ${\cal L}_{y+}$.
331: It would require a fairly pathological relationship among
332: the variables $(C,Z,Y)$ for this not to hold. Condition~B4 will also be
333: needed for the change-point test developed in section~9.
334:
335: Conditions~D1 and~D2
are also needed for asymptotic normality. Condition~D1 is quite similar to
337: conditions~(G.1) through~(G.4) in \cite{sv04} who
338: use the condition for developing asymptotic theory
339: for transformation models without a change-point. Condition~D2
340: is slightly weaker than conditions~D2 and~D3 of \cite{klf04}
341: who use the condition to obtain asymptotic theory
342: for frailty regression models without a change-point.
343: The following are several instances that satisfy conditions~D1 and D2:
344: \begin{enumerate}
345: \item $\Lambda(u)=e^{-u}$ corresponds to the extreme value distribution
346: and results in the Cox model.
347: \item $\Lambda(u)=(1+c u)^{-1/c}$, for any $c\in(0,\infty)$,
348: corresponds to the family of log-Pareto distributions and results
349: in the odds-rate transformation family. Taking the limit as
350: $c\downarrow 0$ yields the Cox model, while $c=1$
351: yields the proportional odds model.
352: \item $\Lambda(u)=\Exp{e^{-Wu}}$, where $W$ is a positive frailty with
353: $\Exp{W^{-c}}<\infty$, for some $c>0$, and
354: $\Exp{W^4}<\infty$, corresponds to the family of frailty transformations.
355: In addition to the odds-rate family, these conditions are
356: satisfied by both the inverse Gaussian and log-normal families
357: (see \cite{klf04}), as well as many other frailty families.
358: \item $\Lambda(u)=[1+2cu+u^2]^{-1}$,
359: where $c\in(1/2,1)$. Because this is the Laplace transform of
360: $t\mapsto e^{-ct}$ $\times\sin\left(t\sqrt{1-c^2}\right)/\sqrt{1-c^2}$,
361: it is not the Laplace transform of a density. Hence this family
362: is not a member of the family of frailty transformations. Note, however, that
363: taking the limit as $c\uparrow 1$ results in the Laplace transform
364: of the frailty density $te^{-t}$.
365: \end{enumerate}
366:
367: Verification of these conditions is routine for examples~1, 2 and~4 above, but
368: verification for example~3 is slightly more involved:
369: \begin{lemma}\label{l.v1}
370: Conditions~D1 and~D2 are satisfied for example~3 above.
371: \end{lemma}
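
To illustrate what the routine verification looks like, the following
sketch works through example~2; the particular choice $c_0=1/c$ is made
here for convenience and is not the only possibility. For
$\Lambda(u)=(1+cu)^{-1/c}$, we have $G(u)=c^{-1}\log(1+cu)$, with $G(0)=0$, and
\begin{eqnarray*}
\dot{G}(u)&=&(1+cu)^{-1},\;\;\ddot{G}(u)\;=\;-c(1+cu)^{-2},\;\;
\dddot{G}(u)\;=\;2c^2(1+cu)^{-3},\\
\dot{\Lambda}(u)&=&-(1+cu)^{-1/c-1},\;\;
\ddot{\Lambda}(u)\;=\;(1+c)(1+cu)^{-1/c-2}.
\end{eqnarray*}
Thus $0<\dot{G}(u)\leq 1$, $0<\ddot{\Lambda}(u)\leq 1+c$ and
$|\dddot{G}(u)|\leq 2c^2$ for all $u\geq 0$, which gives condition~D1.
For condition~D2 with $c_0=1/c$,
\begin{eqnarray*}
u^{c_0}\Lambda(u)&=&\left(\frac{u}{1+cu}\right)^{1/c}\leq c^{-1/c},\;\;
u^{1+c_0}|\dot{\Lambda}(u)|\;=\;\left(\frac{u}{1+cu}\right)^{1+1/c}
\leq c^{-1-1/c}.
\end{eqnarray*}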
372:
373: \section{Nonparametric Maximum log-likelihood estimation} The
374: nonparametric log-likelihood has the form $L_n(\psi,\zeta)\equiv$
375: \begin{eqnarray}
376: &&\mathbb{P}_n \left\{
377: \delta\log(a(V))+l_1^{\psi}(V,\delta,Z)
378: \ind\{Y \le \zeta \} + l_2^{\psi}(V,\delta,Z)
379: \ind\{Y > \zeta \} \right \}, \label{s3.e1}
380: \end{eqnarray}
381: where
382: \begin{eqnarray*}
383: l_1^{\psi}(V,\delta,Z)&\equiv&\int_0^{\tau}\left[\log\dot{G}
384: \left(H^{\psi}_1(s)\right)+\beta' Z(s)\right]
385: dN(s)-G(H^{\psi}_1(V)),\\
386: l_2^{\psi}(V,\delta,Z)
387: &\equiv&\int_0^{\tau}\left[\log\dot{G}\left(H^{\psi}_2(s)
388: \right)+\beta'Z(s)+\alpha+\eta'Z_2(s)\right]dN(s)\\
389: &&-G(H^{\psi}_2(V)),
390: \end{eqnarray*}
391: where $N(t)\equiv\ind\{V\leq t\}\delta$, $\tilde{Y}(s)\equiv\ind\{V\geq s\}$,
392: $a \equiv dA/dt$,
393: $H^{\psi}_1(t)\equiv\int_0^t\tilde{Y}(s)e^{\beta' Z(s)}dA(s)$,
394: $H^{\psi}_2(t)\equiv\int_0^t\tilde{Y}(s)e^{\beta'Z(s)+\alpha + \eta'Z_2(s)}
395: dA(s)$, and $\mathbb{P}_n$ is the
396: empirical probability measure.
397:
398: As discussed by \cite{mrv97}, the
399: maximum likelihood estimator for $a$ does not exist, since any
400: unrestricted maximizer of~(\ref{s3.e1}) puts mass only at observed failure
401: times and is thus not a continuous hazard. We replace $a(u)$
402: in $L_n(\psi,\zeta)$ with $n\Delta A(u)$ as suggested in \cite{p98}
403: who remarked that
404: this form of the empirical log-likelihood function is asymptotically
405: equal to the true log-likelihood function in certain instances.
406: Let $\tilde{L}_n(\psi,\zeta)$ be this modified log-likelihood.
407: Note that the maximum likelihood
408: estimator for $\zeta$ is not unique, since the likelihood is constant
409: in $\zeta$ over the intervals $[Y_{(r)},Y_{(r+1)})$, where $Y_{(1)}
410: <\cdots<Y_{(r)}<\cdots<Y_{(n)}$ are the order statistics of~$Y$. For this
411: reason, we only need to consider $\zeta$ at the values of the
412: $Y$ order statistics.
413:
414: The estimators are obtained in the following way: For fixed
415: $\zeta$, we maximize the fully nonparametric log-likelihood over
416: $\psi$, to obtain the profile log-likelihood
417: $pL_n(\zeta)\equiv\sup_{\psi}\tilde{L}_n(\psi,\zeta)$. We then maximize
418: $pL_n(\zeta)$ over $\zeta$, to obtain $\hat{\zeta}_n$; and then
419: compute $\hat{\psi}_n=\argmax_{\psi}\tilde{L}_n(\psi,\hat{\zeta}_n)$.
420: This yields the NPMLE $\hat{\theta}_n=(\hat{\psi}_n,\hat{\zeta}_n)$
421: for $\theta_0$. Hence we obtain an estimator for $A_0$ but not for $a_0$.
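
Because $\tilde{L}_n(\psi,\zeta)$ depends on $\zeta$ only through the
indicators $\ind\{Y_i\leq\zeta\}$, the maximization over $\zeta$ is a
finite search. As a sketch, under the illustrative assumption that the
search is restricted to the order statistics falling in the interval
$(a,b)$ of condition~B1, the change-point estimator can be written as
\[\hat{\zeta}_n\in\argmax_{\zeta\in\{Y_{(r)}:\,a<Y_{(r)}<b\}}pL_n(\zeta),\]
so that at most $n$ profile maximizations over $\psi$ are required.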
422:
423: \section{Consistency}
424: To study consistency, we first
425: characterize the NPMLE $\hat\theta_n$.
Consider the following one-dimensional submodels for $A$:
427: \begin{eqnarray*}
428: t \mapsto A_t \equiv \int_0 ^{(\cdot)} (1 + tg(s) )dA(s),
429: \end{eqnarray*}
430: where $g$ is an arbitrary non-negative bounded function. A score
431: function for $A$, defined as the derivative of $\tilde{L}_n(\xi, A_t)$
432: with respect to $t$ at $t=0$, is
433: \begin{eqnarray}
&&\mathbb{P}_n \left \{ \delta g(V) - \left[
435: \dot{G}(H^{\theta}(V))-
436: \delta\frac{ \ddot{G}(H^{\theta}(V))}
437: { \dot{G}(H^{\theta}(V))}\right]
438: \int_0^{\tau}\tilde{Y}(s)
439: e^{r_{\xi}(s;Z,Y)}g(s)dA(s) \right\}, ~\label{c4:e1}
440: \end{eqnarray}
441: where $H^{\theta}(t)\equiv\int_0^t\tilde{Y}(s)
442: e^{r_{\xi}(s;Z,Y)}dA(s)$.
443: For any fixed $\xi$, let $\hat A_{\xi}$ denote the maximizer of
444: $A\mapsto\tilde{L}_n(\xi, A)$, and let $\hat \theta_{\xi} \equiv (\xi, \hat
445: A_{\xi})$. Then the score function~(\ref{c4:e1}) is equal to zero
when evaluated at $\hat \theta_{\xi}$. We select $g(s) = \ind
\{s \le u \}$, insert this into~(\ref{c4:e1}), and equate the
resulting expression to zero, which yields $\hat{A}_{\xi}(u)=$
449: \begin{eqnarray}
450: \label{c4:e2}&&\\
451: \int_0^u\left(\pp_n \left[\tilde{Y}(s)e^{r_{\xi}(s;Z,Y)}\left(
452: \dot{G}\left\{H^{\hat{\theta}_{\xi}}(V)\right\}-\delta\frac{
453: \ddot{G}\left\{H^{\hat{\theta}_{\xi}}(V)\right\}}{\dot{G}
454: \left\{H^{\hat{\theta}_{\xi}}(V)\right\}}
455: \right)\right]\right)^{-1}\pp_n\{dN(s)\}&&\nonumber\\
456: \equiv\int_0^u\{\mathbb{P}_nW(s;
457: \hat{\theta}_{\xi})\}^{-1}\pp_n\{dN(s)\}.&&\nonumber
458: \end{eqnarray}
459: Now the profile likelihood has the form
$pL_n(\zeta)=\sup_{\gamma}\tilde{L}_n
461: \left((\gamma,\hat{A}_{(\gamma,\zeta)}),\zeta\right)$.
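
For orientation, consider the Cox case $\Lambda(u)=e^{-u}$ of example~1
in section~2, used here purely as an illustrative special case. Then
$G(u)=u$, $\dot{G}\equiv 1$ and $\ddot{G}\equiv 0$, so that
$W(s;\theta)=\tilde{Y}(s)e^{r_{\xi}(s;Z,Y)}$ does not depend on $A$
and~(\ref{c4:e2}) reduces to
\begin{eqnarray*}
\hat{A}_{\xi}(u)&=&\int_0^u\left\{\pp_n\tilde{Y}(s)
e^{r_{\xi}(s;Z,Y)}\right\}^{-1}\pp_n\{dN(s)\},
\end{eqnarray*}
which is a Breslow-type estimator available in closed form. For general
$G$, the right-hand side of~(\ref{c4:e2}) depends on $\hat{A}_{\xi}$
through $H^{\hat{\theta}_{\xi}}$, so the display is a fixed-point
characterization rather than an explicit formula.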
462:
463: The above characterization facilitates the following consistency
464: results for $\hat{\theta}_n$:
465: \begin{lemma}\label{l1}
466: Under the regularity conditions of section~2,
467: the transformation model with a change-point based on a covariate
468: threshold is identifiable.
469: \end{lemma}
470: \begin{lemma}\label{l2}
471: Under the regularity conditions of section~2,
472: $\hat{A}_n$ is asymptotically bounded, and thus the NPMLE
473: $\hat{\theta}_n$ exists.
474: \end{lemma}
475: Using these results, we can establish the uniform consistency of
476: $\hat \theta_n$:
477: \begin{theorem}\label{t1}
478: Under the regularity conditions of section~2,
479: $\hat {\theta}_n$ converges outer almost surely to $\theta_0$
480: in the uniform norm.
481: \end{theorem}
482:
483: \section{Score and information operators for regular parameters}
484: In this section, we derive the score and information operators
485: for the collected parameters $\psi$. We refer to these parameters
486: as the regular parameters because, as we will see in section~6,
487: these parameters converge at the $\sqrt{n}$ rate. On the other
488: hand, $\hat{\zeta}_n$ converges at the $n$ rate and thus
489: the parameter $\zeta$ is not regular. The score and
490: information operators for $\psi$ are needed for the convergence
491: rate and weak limit results of sections~6 and~7.
492:
493: Let $\mathcal{H}$ denote the space of the elements $h = (h_1, h_2, h_3, h_4)$
494: such that $h_1 \in \mathbb{R}$, $h_2\in\re^q$, $h_3 \in \mathbb{R}^d$,
495: and $h_4 \in D[0,\tau]$, where $D[0,\tau]$ is the space of cadlag
496: functions (right-continuous with left-hand limits) on $[0,\tau]$.
497: We denote by $BV$ the subspace of $D[0,\tau]$ consisting of functions
498: that are of bounded variation over the
499: interval $[0,\tau]$. Define, for future use,
500: the following linear functional for
501: each $\theta=(\psi,\zeta)$ and each $t \in [0,\tau]$:
502: \begin{eqnarray}
503: R^t_{\zeta,\psi}(f) \equiv \int_0^t f(u)\tilde{Y}(u)
504: e^{r_{\xi}(u;Z,Y)}dA(u),
505: \end{eqnarray}
506: where $f$ is an element or vector of elements in $BV$.
507: Also let $\rho_1 (h) \equiv ( | h_1 |^2 +
508: \|h_2 \|^2+ \| h_3\|^2+ \| h_4 \|_v^2)^{1/2}$ and $\mathcal{H}_r
509: \equiv \{h \in \mathcal{H}: \rho_1(h)\leq r \}$, where $\|\cdot\|_v$
510: is the total variation norm on $BV$ and $r\in(0,\infty)$.
511:
512: The parameter
513: $\psi\in\Psi\equiv\Upsilon\times B_2\times B_1\times{\cal A}$
514: can be considered a linear functional on $\mathcal{H}_r$ by defining
515: $\psi(h) \equiv h_1\alpha + h_2' \eta + h_3' \beta + \int_0^{\tau}
516: h_4(u)dA(u)$, $h \in \mathcal{H}_r$.
Viewed this way, $\Psi$ is a subset of $\ell^{\infty}({\cal H}_r)$ with uniform
518: norm $\|\psi\|_{(r)}\equiv\sup_{h\in{\cal H}_r}|\psi(h)|$, where
519: $\ell^{\infty}(B)$ is the space of bounded functionals on~$B$. Note that
520: ${\cal H}_1$ is rich enough to extract all components of $\psi$. This
521: is easy to see for the Euclidean components; and, for the $A$ component,
522: it works by using the elements $\{h:h_1=0,h_2=0,h_3=0,
523: h_4(u)=\ind\{u\leq t\},t\in[0,\tau]\}\subset{\cal H}_1$.
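Concretely, and taking as given that the indicated elements belong to
${\cal H}_1$, the choice $h=(1,0,0,0)$ gives $\psi(h)=\alpha$, while the
choice $h=(0,0,0,\ind\{\cdot\leq t\})$ gives
\[\psi(h)=\int_0^{\tau}\ind\{u\leq t\}dA(u)=A(t),\;\;\;t\in[0,\tau],\]
since $A(0)=0$.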
524:
525: In section~5.1, we derive the score operator; while in section~5.2
526: we derive the information operator and establish its continuous invertibility.
527:
528: \subsection{The score operator} Using the one-dimensional submodel
529: \begin{eqnarray*}
530: t \rightarrow \psi_t \equiv \psi + t(h_1, h_2, h_3,
531: \int_0^{(\cdot)}h_4(u)dA(u)), ~~~ h \in \mathcal{H}_r,
532: \end{eqnarray*}
533: the score operator takes the form
534: \[U^{\tau}_{n\zeta}(\psi)(h) \equiv
535: \left.\frac{\partial}{\partial t}L_n(\psi_t, \zeta) \right|_{t=0}
536: =\mathbb{P}_n U^{\tau}_{\zeta}(\psi)(h),\]
537: where $U_{\zeta}^{\tau}(\psi)(h)\equiv U^{\tau}_{\zeta,1}(\psi)(h_1) +
538: U^{\tau}_{\zeta,2}(\psi)(h_2)+
539: U^{\tau}_{\zeta,3}(\psi)(h_3)+U^{\tau}_{\zeta,4}(\psi)(h_4)$, and
540: \begin{eqnarray*}
541: U^{\tau}_{\zeta,1}(\psi)(h_1)&\equiv&
542: \ind\{Y>\zeta\}\left\{\int_0^{\tau}h_1dN(u)-\hat{\Xi}_{\theta}^{(0)}(\tau)
543: R^{\tau}_{\zeta,\psi}(h_1)\right\},\\
U^{\tau}_{\zeta,2}(\psi)(h_2)&\equiv&\ind\{Y>\zeta\}\left\{
545: \int_0^{\tau}Z_2'(u)h_2dN(u)-\hat{\Xi}_{\theta}^{(0)}(\tau)
546: R^{\tau}_{\zeta,\psi}(Z_2'h_2)\right\},\\
547: U^{\tau}_{\zeta,3}(\psi)(h_3)&\equiv&
548: \int_0^{\tau}Z'(u)h_3dN(u)-\hat{\Xi}_{\theta}^{(0)}(\tau)
549: R^{\tau}_{\zeta,\psi}(Z'h_3),\\
550: U^{\tau}_{\zeta,4}(\psi)(h_4)&\equiv&\int_0^{\tau}h_4(u)dN(u)
551: -\hat{\Xi}_{\theta}^{(0)}(\tau)R^{\tau}_{\zeta,\psi}(h_4),\\
552: \hat{\Xi}_{\theta}^{(0)}(\tau)&\equiv&\ind\{Y\leq\zeta\}
553: \hat{\Xi}_{\psi,1}^{(0)}(\tau)+\ind\{Y>\zeta\}
554: \hat{\Xi}_{\psi,2}^{(0)}(\tau),
555: \end{eqnarray*}
556: and where, for $j=1,2$,
557: \[\hat{\Xi}_{\psi,j}^{(0)}(\tau)\equiv
558: \left[\dot{G}(H^{\psi}_j(V \wedge \tau)) -
559: \delta \frac{ \ddot{G}(H^{\psi}_j(V \wedge \tau))}
560: {\dot{G}(H^{\psi}_j(V \wedge \tau))}\right].\]
561: The dependence in the notation on $\tau$ will prove useful in later
562: developments.
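
As an illustration only, in the Cox case $\Lambda(u)=e^{-u}$ we have
$\dot{G}\equiv 1$ and $\ddot{G}\equiv 0$, so that
$\hat{\Xi}_{\psi,j}^{(0)}(\tau)=1$ for $j=1,2$, and the fourth component
of the score operator becomes
\begin{eqnarray*}
U^{\tau}_{\zeta,4}(\psi)(h_4)&=&\int_0^{\tau}h_4(u)\left\{dN(u)
-\tilde{Y}(u)e^{r_{\xi}(u;Z,Y)}dA(u)\right\},
\end{eqnarray*}
which is the familiar counting-process form of the score for the
baseline hazard. The other three components take the same form, with
$h_4(u)$ replaced by $h_1\ind\{Y>\zeta\}$, $Z_2'(u)h_2\ind\{Y>\zeta\}$
and $Z'(u)h_3$, respectively.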
563:
564: \subsection{The information operator} To obtain the information
565: operator, we can differentiate the expectation of the score operator
566: using the map $t \rightarrow \psi+ t\psi_1$,
567: where $\psi,\psi_1\in\Psi$. The information operator,
568: $\sigma_{\theta}:\mathcal{H}_{\infty} \rightarrow
569: \mathcal{H}_{\infty}$, where $\mathcal{H}_{\infty}\equiv\{h:\mbox{$h\in
570: \mathcal{H}_r$ for some $r<\infty$}\}$, satisfies
571: \begin{eqnarray}
572: \psi_1(\sigma_{\theta}(h))
573: &=&\left. -\frac{\partial}{\partial t}
574: PU^{\tau}_{\zeta}(\psi+t\psi_1)(h) \right|_{t=0},\label{new.j12.e1}
575: \end{eqnarray}
576: for every $h\in{\cal H}_{\infty}$.
577: Taking the G\^{a}teaux derivative in~(\ref{new.j12.e1}), we obtain
578: $\sigma_{\theta}(h)=$
579: \begin{eqnarray}
580: \label{c5.e2}&&\\
581: \left(\begin{array}{cccc}\sigma_{\theta}^{11}&
582: \sigma_{\theta}^{12}& \sigma_{\theta}^{13} & \sigma_{\theta}^{14} \\
583: \sigma_{\theta}^{21}&\sigma_{\theta}^{22}&
584: \sigma_{\theta}^{23}& \sigma_{\theta}^{24} \\
585: \sigma_{\theta}^{31}&\sigma_{\theta}^{32}
586: &\sigma_{\theta}^{33}&\sigma_{\theta}^{34}\\
587: \sigma_{\theta}^{41}&\sigma_{\theta}^{42}&\sigma_{\theta}^{43}
588: &\sigma_{\theta}^{44}
589: \end{array}\right)
590: \left(\begin{array}{c}h_1\\ h_2\\ h_3\\ h_4\end{array}\right)
591: \equiv P\left(\begin{array}{cccc}\hat{\sigma}_{\theta}^{11}&
592: \hat{\sigma}_{\theta}^{12}& \hat{\sigma}_{\theta}^{13}&
593: \hat{\sigma}_{\theta}^{14} \\
594: \hat{\sigma}_{\theta}^{21}&
595: \hat{\sigma}_{\theta}^{22}&\hat{\sigma}_{\theta}^{23}&
596: \hat{\sigma}_{\theta}^{24} \\
597: \hat{\sigma}_{\theta}^{31}&\hat{\sigma}_{\theta}^{32}&
598: \hat{\sigma}_{\theta}^{33}& \hat{\sigma}_{\theta}^{34}\\
599: \hat{\sigma}_{\theta}^{41}&\hat{\sigma}_{\theta}^{42}&
600: \hat{\sigma}_{\theta}^{43}& \hat{\sigma}_{\theta}^{44}
601: \end{array}\right)
602: \left(\begin{array}{c}h_1\\ h_2\\ h_3\\ h_4\end{array}\right)&&
603: \nonumber
604: \end{eqnarray}
605: $\equiv P\hat{\sigma}_{\theta}(h)$, where
606: \begin{eqnarray*}
607: \hat{\sigma}_{\theta}^{11}(h_1)&\equiv&\ind\{Y>\zeta\}\left\{
608: \hat{\Xi}_{\theta}^{(0)}(\tau)+\hat{\Xi}_{\theta}^{(1)}(\tau)
609: H_2^{\psi}(V\wedge\tau)\right\}R_{\zeta,\psi}^{\tau}(h_1),\\
610: \hat{\sigma}_{\theta}^{12}(h_2)&\equiv&\ind\{Y>\zeta\}
611: \left\{\hat{\Xi}_{\theta}^{(0)}(\tau)+\hat{\Xi}_{\theta}^{(1)}(\tau)
612: H_2^{\psi}(V\wedge\tau)\right\}R_{\zeta,\psi}^{\tau}(Z_2'h_2),\\
613: \hat{\sigma}_{\theta}^{13}(h_3)&\equiv&\ind\{Y>\zeta\}
614: \left\{\hat{\Xi}_{\theta}^{(0)}(\tau)+\hat{\Xi}_{\theta}^{(1)}(\tau)
615: H_2^{\psi}(V\wedge\tau)\right\}R_{\zeta,\psi}^{\tau}(Z'h_3),\\
616: \hat{\sigma}_{\theta}^{14}(h_4)&\equiv&\ind\{Y>\zeta\}
617: \left\{\hat{\Xi}_{\theta}^{(0)}(\tau)+\hat{\Xi}_{\theta}^{(1)}(\tau)
618: H_2^{\psi}(V\wedge\tau)\right\}R_{\zeta,\psi}^{\tau}(h_4),\\
619: \hat{\sigma}_{\theta}^{21}(h_1)&\equiv&\ind\{Y>\zeta\}\left\{
620: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2h_1)
621: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2)
R_{\zeta,\psi}^{\tau}(h_1)\right\},\\
623: \hat{\sigma}_{\theta}^{22}(h_2)&\equiv&\ind\{Y>\zeta\}\left\{
624: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2Z_2'h_2)
625: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2)
626: R_{\zeta,\psi}^{\tau}(Z_2'h_2)\right\},\\
627: \hat{\sigma}_{\theta}^{23}(h_3)&\equiv&
628: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2Z'h_3)
629: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2)
630: R_{\zeta,\psi}^{\tau}(Z'h_3),\\
631: \hat{\sigma}_{\theta}^{24}(h_4)&\equiv&
632: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2h_4)
633: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2)
634: R_{\zeta,\psi}^{\tau}(h_4),\\
635: \hat{\sigma}_{\theta}^{31}(h_1)&\equiv&\ind\{Y>\zeta\}\left\{
636: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Zh_1)
637: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z)
638: R_{\zeta,\psi}^{\tau}(h_1)\right\},\\
639: \hat{\sigma}_{\theta}^{32}(h_2)&\equiv&\ind\{Y>\zeta\}\left\{
640: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(ZZ_2'h_2)
641: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z)
642: R_{\zeta,\psi}^{\tau}(Z_2'h_2)\right\},\\
643: \hat{\sigma}_{\theta}^{33}(h_3)&\equiv&
644: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(ZZ'h_3)
645: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z)
646: R_{\zeta,\psi}^{\tau}(Z'h_3),\\
647: \hat{\sigma}_{\theta}^{34}(h_4)&\equiv&
648: \hat{\Xi}_{\theta}^{(0)}(\tau)R_{\zeta,\psi}^{\tau}(Zh_4)
649: +\hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z)
650: R_{\zeta,\psi}^{\tau}(h_4),
651: \end{eqnarray*}
652: \begin{eqnarray*}
653: \hat{\sigma}_{\theta}^{41}(h_1)(u)&\equiv&\ind\{Y>\zeta\}
654: \tilde{Y}(u)e^{r_{\xi}(u;Z,Y)}\left\{
655: \hat{\Xi}_{\theta}^{(0)}(\tau)h_1+
656: \hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(h_1)\right\},\\
657: \hat{\sigma}_{\theta}^{42}(h_2)(u)&\equiv&\ind\{Y>\zeta\}
658: \tilde{Y}(u)e^{r_{\xi}(u;Z,Y)}\left\{
659: \hat{\Xi}_{\theta}^{(0)}(\tau)Z_2'(u)h_2+
660: \hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z_2'h_2)\right\},\\
661: \hat{\sigma}_{\theta}^{43}(h_3)(u)&\equiv&
662: \tilde{Y}(u)e^{r_{\xi}(u;Z,Y)}\left\{
663: \hat{\Xi}_{\theta}^{(0)}(\tau)Z'(u)h_3+
664: \hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(Z'h_3)\right\},\\
665: \hat{\sigma}_{\theta}^{44}(h_4)(u)&\equiv&
666: \tilde{Y}(u)e^{r_{\xi}(u;Z,Y)}\left\{
667: \hat{\Xi}_{\theta}^{(0)}(\tau)h_4(u)+
668: \hat{\Xi}_{\theta}^{(1)}(\tau)R_{\zeta,\psi}^{\tau}(h_4)\right\},
669: \end{eqnarray*}
670: and where
671: \begin{eqnarray*}
672: \hat{\Xi}_{\theta}^{(1)}(\tau)&\equiv&\ddot{G}(H^{\theta}(V\wedge\tau))
673: -\delta\left[\frac{\dddot{G}(H^{\theta}(V\wedge\tau))}
674: {\dot{G}(H^{\theta}(V\wedge\tau))}-\left\{
675: \frac{\ddot{G}(H^{\theta}(V\wedge\tau))}
676: {\dot{G}(H^{\theta}(V\wedge\tau))}\right\}^2\right].
677: \end{eqnarray*}
678: Note that all of the above operators are clearly bounded
679: whenever $\theta$ is bounded.
680:
681: The following lemma strengthens
682: the above G\^{a}teaux derivative to a Fr\'{e}chet derivative. We will
683: need this strong differentiability to obtain weak convergence of
684: our estimators.
685: \begin{lemma}\label{l3}
686: Under the regularity conditions of section~2 and for any $\zeta\in[a,b]$
687: and $\psi_1\in\Psi$, the operator
688: $\psi\mapsto P U_{\zeta}^{\tau}(\psi)$
689: is Fr\'{e}chet differentiable at $\psi_1$,
690: with derivative $-\psi(\sigma_{\psi_1}(h))$, where
691: $h$ ranges over ${\cal H}_r$ and is the index
for $PU_{\zeta}^{\tau}(\psi)(\cdot)$, $\psi$ ranges over the linear span
693: $\mbox{lin}\,\Psi$ of $\Psi$, and $0<r<\infty$.
694: \end{lemma}
695:
696: The following lemma gives
697: us the desired continuous invertibility of both
698: $\sigma_{\theta_0}$ and the operator
699: $\psi\mapsto\psi(\sigma_{\theta_0}(\cdot))$. This last operator will
700: be needed for weak convergence of regular parameters.
701: \begin{lemma}\label{l4}
702: Under the regularity conditions of section~2,
703: the linear operator $\sigma_{\theta_0}: \mathcal{H}_{\infty}
704: \rightarrow \mathcal{H}_{\infty}$ is
705: continuously invertible and onto, with inverse $\sigma_{\theta_0}^{-1}$.
706: Moreover, the linear operator $\psi\mapsto \psi(\sigma_{\theta_0}(\cdot))$,
707: as a map from and to $\mbox{lin}\,\Psi$,
708: is also continuously invertible and onto, with inverse
709: $\psi\mapsto\psi(\sigma_{\theta_0}^{-1}(\cdot))$.
710: \end{lemma}
711:
712: \section{The convergence rates of the estimators}
713:
714: To determine the convergence rates of the estimators, we need to
715: study closely the log-likelihood process $\tilde{L}_n(\theta)$
716: near its maximizer. In the parametric setting, this process
717: can be approximated by its expectation which can be shown to
718: be locally concave. For the Cox model, as in \cite{p03}, this
719: same procedure can be applied to the partial likelihood which
720: shares the local concavity features of a parametric likelihood.
721: Unfortunately, in our present set-up, studying the expectation
722: of $\tilde{L}_n(\theta)$ will lead to problems since $A_0$ has
723: a density and thus $\Delta A_0(t)=0$ for all $t\in[0,\tau]$.
724: Hence $\tilde{L}_n(\theta_0)=-\infty$, and
725: a new approach is needed. The approach we take involves a
726: careful reparameterization of $\hat{A}_n$.
727:
728: From section~4, we know that the maximizer $\hat{A}_n(t)=
729: \int_0^t\left\{\pp_n W(s;\hat{\theta}_n)\right\}^{-1}$
730: $\times d\tilde{G}_n(s)$,
731: where $\tilde{G}_n(t)\equiv\pp_n N(t)$ and $W(\cdot;\cdot)$ is
732: as defined in~(\ref{c4:e2}). It is easy to see that for all $n$
733: large enough and all $\theta$ sufficiently close to $\theta_0$,
734: $t\mapsto\pp_n W(t;\theta)$ is bounded below and above and in
735: total variation, with large probability.
736: Thus, if we use the reparameterization
737: $\Gamma(\cdot)\mapsto A^{(\Gamma)}_n(\cdot)\equiv\int_0^{(\cdot)}
738: \exp\{-\Gamma(s)\}d\tilde{G}_n(s)$, and
739: maximize $\tilde{L}_n(\xi,A^{(\Gamma)}_n)$ over $\xi$ and $\Gamma$, where
740: $\Gamma\in BV$, we will achieve the same NPMLE as before. Note that
the $\Gamma$ component of the maximizer of $\tilde{L}_n(\xi,A^{(\Gamma)}_n)$
742: is therefore just $\hat{\Gamma}_n(\cdot)\equiv-\log\pp_n W(\cdot;
743: \hat{\theta}_n)$.
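
As a quick check of this last claim (a verification from the definitions
above, not an additional result), substituting $\hat{\Gamma}_n$ into the
reparameterization returns the original NPMLE of $A_0$:
\[A^{(\hat{\Gamma}_n)}_n(t)
=\int_0^t\exp\{-\hat{\Gamma}_n(s)\}d\tilde{G}_n(s)
=\int_0^t\left\{\pp_n W(s;\hat{\theta}_n)\right\}^{-1}d\tilde{G}_n(s)
=\hat{A}_n(t).\]
Note also that every $A^{(\Gamma)}_n$ is a step function whose jumps,
$\exp\{-\Gamma(s)\}\Delta\tilde{G}_n(s)$, occur only at the jump points
of $\tilde{G}_n$, which is the same form taken by $\hat{A}_n$.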
744:
745: Define $\Gamma_0(\cdot)\equiv-\log(PW(\cdot;\theta_0))$ and
746: $\theta_n(\zeta,\gamma,\Gamma)\equiv(\zeta,\gamma,A^{(\Gamma)}_n)$,
747: and note that
748: the reparameterized NPMLE $(\hat{\zeta}_n,\hat{\gamma}_n,\hat{\Gamma}_n)$
749: is the maximizer of the process
750: \begin{eqnarray*}
751: \lefteqn{
752: (\zeta,\gamma,\Gamma)\mapsto\tilde{X}_n(\zeta,\gamma,\Gamma)
753: \;\equiv\;\tilde{L}_n(\zeta,\gamma,A^{(\Gamma)}_n)-\tilde{L}_n(\zeta_0,
754: \gamma_0,A^{(\Gamma_0)}_n)}&&\\
755: &&\mbox{\hspace{-0.1in}}=\pp_n\left\{\int_0^{\tau}\left[-\Gamma(t)+\Gamma_0(t)
756: +\log\frac{\dot{G}(H^{\theta_n(\zeta,\gamma,\Gamma)}(t))}
757: {\dot{G}(H^{\theta_n(\zeta_0,\gamma_0,\Gamma_0)}(t))} +
758: (r_{\xi}-r_{\xi_0})(t;Z,Y)\right]\right.\\
759: &&\mbox{\hspace{0.2in}}
760: \left.\rule[-0.3cm]{0cm}{1.0cm}\times dN(t)
761: -(G(H^{\theta_n(\zeta,\gamma,\Gamma)}(V))
762: -G(H^{\theta_n(\zeta_0,\gamma_0,\Gamma_0)}(V)))
763: \right\}.
764: \end{eqnarray*}
765: We will argue shortly that $\tilde{X}_n$ is uniformly consistent for
766: the function
767: \begin{eqnarray*}
768: \lefteqn{(\zeta,\gamma,\Gamma)\mapsto\tilde{X}(\zeta,\gamma,\Gamma)}&&\\
769: &\equiv&P\left\{\int_0^{\tau}\left[-\Gamma(t)+\Gamma_0(t)
770: +\log\frac{\dot{G}(H^{\theta_0(\zeta,\gamma,\Gamma)}(t))}
771: {\dot{G}(H^{\theta_0}(t))} +
772: (r_{\xi}-r_{\xi_0})(t;Z,Y)\right]dN(t)\right.\\
773: &&\left.\rule[-0.3cm]{0cm}{1.0cm}-(G(H^{\theta_0(\zeta,\gamma,\Gamma)}(V))
774: -G(H^{\theta_0}(V)))
775: \right\},
776: \end{eqnarray*}
777: where $\theta_0(\zeta,\gamma,\Gamma)\equiv(\zeta,\gamma,
778: A^{(\Gamma)}_0)$, $A^{(\Gamma)}_0(\cdot)\equiv\int_0^{(\cdot)}
779: \exp\{-\Gamma(s)\}d\tilde{G}_0(s)$, and $\tilde{G}_0(t)\equiv P N(t)$.
780: It will occasionally be useful to use the shorthand
781: $\lambda\equiv(\gamma,\Gamma)$,
782: $\hat{\lambda}_n\equiv(\hat{\gamma}_n,\hat{\Gamma}_n)$ and
783: $\lambda_0\equiv(\gamma_0,\Gamma_0)$.
784:
785: Define the modified parameter space
786: $\Theta^{\ast}\equiv (a,b)\times\Upsilon\times B_2\times B_1
787: \times BV$; and, for each
788: $h=(h_1,h_2,h_3,h_4,h_5)\in\re\times{\cal H}_{\infty}$, define
789: the metric $\rho_2(h)\equiv(|h_1|+|h_2|^2+\|h_3\|^2+\|h_4\|^2
790: +\|h_5\|_{\infty}^2)^{1/2}$, where $\|\cdot\|_{\infty}$ is the uniform
791: norm. Note that $|h_1|$ is deliberately not squared.
792: For each $\epsilon>0$ and $k<\infty$, define $B_{\epsilon}^{\ast k}
793: \equiv\{(\zeta,\lambda)\in\Theta^{\ast}:
794: \rho_2((\zeta,\lambda)-(\zeta_0,\lambda_0))<\epsilon,\|\Gamma\|_v\leq k\}$.
Note that, for some $k_0<\infty$ and
any $\epsilon>0$, $(\hat{\zeta}_n,\hat{\lambda}_n)$
lies in $B_{\epsilon}^{\ast k_0}$ for all
$n$ large enough, by theorem~\ref{t1} above combined with
lemma~\ref{l5} below:
800: \begin{lemma}\label{l5}
801: There exists a $k_0<\infty$ such that
802: $\limsup_{n\rightarrow\infty}\|\hat{\Gamma}_n\|_v\leq k_0$ and
803: $\lim_{n\rightarrow\infty}\|\hat{\Gamma}_n-\Gamma_0\|_{\infty}=0$
804: outer almost surely.
805: \end{lemma}
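
A heuristic reading of the choice of $\rho_2$, offered as an illustration
rather than as part of the formal argument, is as follows. Since
$\rho_2(h)^2\geq|h_1|$ and $\rho_2(h)\geq\|h_5\|_{\infty}$,
\[\rho_2((\zeta,\lambda)-(\zeta_0,\lambda_0))<\epsilon
\;\;\Longrightarrow\;\;
|\zeta-\zeta_0|<\epsilon^2\;\;\mbox{and}\;\;
\|\Gamma-\Gamma_0\|_{\infty}<\epsilon.\]
Thus membership in $B_{\epsilon}^{\ast k}$ constrains $\zeta$ on the scale
$\epsilon^2$ and the remaining components on the scale $\epsilon$, which is
in keeping with the distinct rates, $n$ and $\sqrt{n}$, obtained for these
parameters at the end of this section.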
806:
807: Now we study the local behavior of $\tilde{X}$. First
808: fix $\zeta\in(a,b)$. Since, for any $g\in BV$,
809: \[\left.\frac{\partial A^{(\Gamma+t g)}_0(\cdot)}{\partial t}\right|_{t=0}
810: =-\int_0^{(\cdot)}g(s)dA^{(\Gamma)}_0(s),\]
811: we obtain that the first derivative of $(\gamma,\Gamma)\mapsto
812: \tilde{X}(\zeta,\gamma,\Gamma)$ in the direction $h\in{\cal H}_{\infty}$,
813: is precisely $-PU_{\zeta}^{\tau}(\gamma,A^{(\Gamma)}_0)(h)$. Moreover,
814: by definition of the score and information operators,
815: the second derivative in the same direction is
816: $-\psi^h_{\Gamma}\left(\sigma_{\left(\zeta,\gamma,
817: A^{(\Gamma)}_0\right)}(h)\right)$,
818: where $\psi^h_{\Gamma}\equiv\left(h_1,h_2,h_3,\int_0^{(\cdot)}
819: h_4(s)dA^{(\Gamma)}_0(s)\right)$. At the point $(\zeta,\gamma,\Gamma)=
820: (\zeta_0,\gamma_0,\Gamma_0)$, the first derivative is $0$,
821: while the second derivative is $<0$, by lemma~\ref{l4}.
822: By the smoothness of the score
and information operators ensured by conditions~D1 and~D2,
824: and by the arbitrariness
825: of $h$, we now have that the function
826: $(\gamma,\Gamma)\mapsto\tilde{X}(\zeta,\gamma,\Gamma)$
827: is concave for every $(\zeta,\gamma,\Gamma)\in B_{\epsilon}^{\ast k_0}$, for
828: sufficiently small $\epsilon$.
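
The derivative of $A^{(\Gamma+tg)}_0$ used above can be verified directly
from the definition (a short check, assuming only that $g$ is bounded on
$[0,\tau]$, which holds for $g\in BV$). For $u\in[0,\tau]$,
\[A^{(\Gamma+tg)}_0(u)=\int_0^u\exp\{-\Gamma(s)-tg(s)\}d\tilde{G}_0(s),\]
and differentiating under the integral sign gives
\[\left.\frac{\partial A^{(\Gamma+tg)}_0(u)}{\partial t}\right|_{t=0}
=-\int_0^ug(s)\exp\{-\Gamma(s)\}d\tilde{G}_0(s)
=-\int_0^ug(s)dA^{(\Gamma)}_0(s).\]
In other words, perturbing $\Gamma$ in the direction $g$ perturbs
$A^{(\Gamma)}_0$ in the direction $-\int_0^{(\cdot)}g\,dA^{(\Gamma)}_0$.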
829:
830: Now note that $\tilde{X}(\zeta,\gamma,\Gamma)=P l^{\ast}(\zeta,\gamma,
831: \Gamma)-P l^{\ast}(\zeta_0,\gamma_0,\Gamma_0)$, where
832: $l^{\ast}(\zeta,\gamma,\Gamma)\equiv$
833: \begin{eqnarray}
834: &&\;\;-\int_0^{\tau}\Gamma(t)dN(t)
835: +l_1^{\psi(\gamma,\Gamma)}(V,\delta,Z)\ind\{Y\leq\zeta\}
836: +l_2^{\psi(\gamma,\Gamma)}(V,\delta,Z)
837: \ind\{Y>\zeta\},\label{new.j14.e1}
838: \end{eqnarray}
839: and where $l_j^{\psi}$, $j=1,2$, are as defined in section~3, and
840: $\psi(\gamma,\Gamma)\equiv(\gamma,A^{(\Gamma)}_0)$.
841: By condition~B2, we now have that for small enough $\epsilon>0$,
842: $\zeta\mapsto\tilde{X}(\zeta,\gamma,\Gamma)$
843: is right and left continuously differentiable for all
844: $(\zeta,\gamma,\Gamma)\in B_{\epsilon}^{\ast k_0}$,
845: with left partial derivative
846: \[\dot{X}_{\zeta}^{-}(\gamma,\Gamma)
847: \equiv P\left\{\left.l_1^{\psi(\gamma,\Gamma)}(V,\delta,Z)
848: -l_2^{\psi(\gamma,\Gamma)}(V,\delta,Z)\right|Y=\zeta\right\}\]
849: and right partial derivative
850: \[\dot{X}_{\zeta}^{+}(\gamma,\Gamma)
851: \equiv P\left\{\left.l_1^{\psi(\gamma,\Gamma)}(V,\delta,Z)
852: -l_2^{\psi(\gamma,\Gamma)}(V,\delta,Z)\right|Y=\zeta+\right\}.\]
853:
854: We now have the following lemmas on the local behavior of $\tilde{X}$
855: with respect to $\zeta$:
856: \begin{lemma}\label{l6}
857: Under the conditions of section~2,
858: $\dot{X}_{\zeta_0}^{-}(\gamma_0,\Gamma_0)>0$ and
859: $\dot{X}_{\zeta_0}^{+}(\gamma_0,\Gamma_0)<0$.
860: \end{lemma}
861: \begin{lemma}\label{l7}
There exist $\epsilon_1,k_1>0$ such that
863: $\tilde{X}(\zeta,\gamma,\Gamma)\leq -k_1|\zeta-\zeta_0|$
864: for all $(\zeta,\gamma,\Gamma)\in B_{\epsilon_1}^{\ast k_0}$.
865: \end{lemma}
866:
867: The two previous lemmas can be combined with the next lemma,
868: lemma~\ref{l8}, to yield $\sqrt{n}$ rates for all of the parameters
869: (theorem~\ref{t.l9}):
870: \begin{lemma}\label{l8}
871: There exists an $\epsilon_2>0$ such that $D_n\equiv\sqrt{n}(\tilde{X}_n
872: -\tilde{X})$ converges weakly to a tight mean zero Gaussian process
873: $D_0$, in $\ell^{\infty}(B_{\epsilon_2}^{\ast k_0})$, for which
874: $D_0(\zeta,\gamma,\Gamma)\rightarrow 0$ in probability, as
875: $\rho_2((\zeta,\gamma,\Gamma)-(\zeta_0,\gamma_0,\Gamma_0))
\rightarrow 0$.
877: \end{lemma}
878:
879: \begin{theorem}\label{t.l9}
880: Under the conditions of section~2,
881: $\sqrt{n}|\hat{\zeta}_n-\zeta_0|=O_P(1)$,
882: $\sqrt{n}\|\hat{\psi}_n-\psi_0\|_{\infty}=O_P(1)$, and
883: $\sqrt{n}\|\hat{\Gamma}_n-\Gamma_0\|_{\infty}=O_P(1)$.
884: \end{theorem}
885:
886: To refine the rate for $\hat{\zeta}_n$, we need two more lemmas,
887: lemmas~\ref{l10} and~\ref{l11} below. We will
888: also need to define the process $\zeta\mapsto\tilde{X}_n^{\ast}(\zeta)
889: \equiv$
890: \begin{eqnarray*}
891: &&\pp_n\left\{\int_0^{\tau}\left[
892: \log\frac{\dot{G}(H^{\theta_0(\zeta,\gamma_0,\Gamma_0)}(t))}
893: {\dot{G}(H^{\theta_0}(t))} +
894: (r_{(\zeta,\gamma_0)}-r_{\xi_0})(t;Z,Y)\right]dN(t)\right.\\
895: &&\mbox{\hspace{0.4in}}
896: \left.\rule[-0.3cm]{0cm}{1.0cm}-(G(H^{\theta_0(\zeta,\gamma_0,\Gamma_0)}(V))
897: -G(H^{\theta_0}(V)))\right\}.
898: \end{eqnarray*}
899: \begin{lemma}\label{l10}
900: $0\leq \tilde{X}_n(\hat{\zeta}_n,\hat{\lambda}_n)-
901: \tilde{X}_n^{\ast}(\hat{\zeta}_n)\leq O_P(n^{-1})$.
902: \end{lemma}
903: \begin{lemma}\label{l11}
There exist $\epsilon_3>0$ and $k_2<\infty$ such that,
905: for all $0\leq\epsilon\leq\epsilon_3$ and $n\geq 1$,
906: $\Exp{\sup_{|\zeta-\zeta_0|\leq\epsilon}|\tilde{D}_n(\zeta)|}
907: \leq k_2\sqrt{\epsilon}$,
908: where $\tilde{D}_n(\zeta)\equiv\sqrt{n}(\tilde{X}_n^{\ast}(\zeta)
909: -\tilde{X}(\zeta,\lambda_0))$.
910: \end{lemma}
911:
912: We now have the following theorem about the convergence rate for
913: $\hat{\zeta}_n$:
914: \begin{theorem}\label{t2}
915: Under the conditions of section~2, $n|\hat{\zeta}_n-\zeta_0|=O_P(1)$.
916: \end{theorem}
917:
918: {\it Proof.} The method of proof involves a ``peeling device'' (see,
919: for example, the proof of theorem~5.1 of \cite{ih81},
920: or the proof of theorem~2 of \cite{p03}).
921: Fix $\epsilon>0$. By consistency and lemma~\ref{l5},
922: $P((\hat{\zeta}_n,\hat{\lambda}_n)\in B_{\epsilon_4}^{\ast k_0})
923: \geq 1-\epsilon$ for
924: all $n$ large enough, where $\epsilon_4=
925: \epsilon_1\wedge \epsilon_2\wedge\epsilon_3$.
926: By lemma~\ref{l10}, there exists an $M_1^{\ast}<\infty$ such that
927: $P(\tilde{X}_n(\hat{\zeta}_n,\hat{\lambda}_n)-
928: \tilde{X}_n^{\ast}(\hat{\zeta}_n)>M_1^{\ast}/n)\leq\epsilon$.
929: For integers $k\geq 1$, let $m_k\equiv k^4$. We now have, for any
930: integer $k\geq 1$, that
931: $\limsup_{n\rightarrow\infty}P\left(n|\hat{\zeta}_n-\zeta_0|>m_k\right)$
932: \begin{eqnarray}
933: &\leq&\limsup_{n\rightarrow\infty}P\left(
934: n|\hat{\zeta}_n-\zeta_0|>m_k,\;(\hat{\zeta}_n,\hat{\lambda}_n)
935: \in B_{\epsilon_4}^{\ast k_0},\right.\nonumber\\
936: &&\left.\tilde{X}_n(\hat{\zeta}_n,\hat{\lambda}_n)
937: -\tilde{X}_n^{\ast}(\hat{\zeta}_n)
938: \leq \frac{M_1^{\ast}}{n}\right)+2\epsilon\nonumber\\
939: &\leq&\limsup_{n\rightarrow\infty}P\left(\sup_{\zeta:\,m_k/n<|\zeta-\zeta_0|
940: \leq\epsilon_4}\tilde{X}_n^{\ast}(\zeta)\geq -\frac{M_1^{\ast}}{n}\right)
941: +2\epsilon\nonumber\\
942: &\leq&\limsup_{n\rightarrow\infty}\sum_{j=k}^{k_{\epsilon_4}}
943: P\left(\sup_{\zeta:\,m_j/n<|\zeta-\zeta_0|\leq (m_{j+1}/n)
944: \wedge\epsilon_4}\tilde{D}_n(\zeta)\right.\label{t2.e1}\\
945: &&\left.\mbox{\hspace{0.4in}}\geq\sqrt{n}\left(\frac{k_1m_j}{n}
946: -\frac{M_1^{\ast}}{n}\right)\right)+2\epsilon,\nonumber
947: \end{eqnarray}
948: by lemma~\ref{l7}, where $k_{\epsilon_4}=
949: \min\{k:\,m_{k+1}\geq n\epsilon_4\}$. But, by lemma~\ref{l11},
950: \[\mbox{(\ref{t2.e1})}\leq\limsup_{n\rightarrow\infty}
951: \sum_{j=k}^{k_{\epsilon_4}}\frac{k_2\sqrt{m_{j+1}}}{k_1m_j-M_1^{\ast}}
952: +2\epsilon\leq\sum_{j=k}^{\infty}\frac{k_2(j+1)^2}
953: {k_1j^4-M_1^{\ast}}+2\epsilon.\]
954: We can now choose $k<\infty$ large enough so that this last term
955: $\leq 3\epsilon$. Since $\epsilon>0$ was arbitrary, we now have that
956: $\lim_{m\rightarrow\infty}\limsup_{n\rightarrow\infty}
957: P(n|\hat{\zeta}_n-\zeta_0|>m)=0$, and the desired conclusion follows.$\Box$
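
The last two inequalities in the proof can be spelled out as follows (a
sketch that adds no new assumptions). By the definition of $\tilde{D}_n$,
$\tilde{X}_n^{\ast}(\zeta)=\tilde{X}(\zeta,\lambda_0)
+n^{-1/2}\tilde{D}_n(\zeta)$, so lemma~\ref{l7} yields, for $\zeta$ in the
range considered in~(\ref{t2.e1}) with $|\zeta-\zeta_0|>m_j/n$,
\[\tilde{X}_n^{\ast}(\zeta)\leq-\frac{k_1m_j}{n}
+\frac{\tilde{D}_n(\zeta)}{\sqrt{n}},\]
and hence $\tilde{X}_n^{\ast}(\zeta)\geq-M_1^{\ast}/n$ forces
$\tilde{D}_n(\zeta)\geq\sqrt{n}(k_1m_j/n-M_1^{\ast}/n)$. For $j$ large enough
that $k_1m_j>M_1^{\ast}$, Markov's inequality and lemma~\ref{l11}, applied
with $\epsilon=(m_{j+1}/n)\wedge\epsilon_4$, give
\[P\left(\sup_{|\zeta-\zeta_0|\leq\epsilon}\tilde{D}_n(\zeta)
\geq\frac{k_1m_j-M_1^{\ast}}{\sqrt{n}}\right)
\leq\frac{\sqrt{n}\,k_2\sqrt{m_{j+1}/n}}{k_1m_j-M_1^{\ast}}
=\frac{k_2\sqrt{m_{j+1}}}{k_1m_j-M_1^{\ast}}.\]
With $m_j=j^4$, the right-hand side equals
$k_2(j+1)^2/(k_1j^4-M_1^{\ast})$, which is of order $j^{-2}$ and hence
summable. This is why the tail sum can be made smaller than $\epsilon$ by
taking $k$ large.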
958:
959: \section{Weak convergence of the estimators}
960:
961: \subsection{The asymptotic distribution of the change-point
962: estimator}
963:
964: Denote $\mathbb{U}_{n,M}\equiv\{u=n(\zeta-\zeta_0):\zeta\in[a,b],|u|\leq M\}$
965: and $\zeta_{n,u}\equiv\zeta_0+u/n$.
966: The limiting distribution of $n(\hat{\zeta}_n-\zeta_0)$
967: will be deduced from the behavior of the restriction of
the process $u\mapsto n[\tilde{L}_n(\hat{\psi}_n,\zeta_{n,u})
969: -\tilde{L}_n(\hat{\psi}_n,\zeta_0)]$ to
970: the compact set $\mathbb{U}_{n,M}$, for $M$ sufficiently large.
971: \begin{theorem}\label{t3}
972: The following approximation holds for all $M > 0$, as $n \rightarrow \infty$:
973: \[u\mapsto n[\tilde{L}_n(\hat{\psi}_n,\zeta_{n,u})
974: -\tilde{L}_n (\hat{\psi}_n,\zeta_0)]=
975: Q_n(u)+o_P^{\mathbb{U}_{n,M}}(1),\]
976: where $o_P^B(1)$ denotes a term going to zero in probability uniformly
over the set $B$ and $Q_n(u)=$
978: \begin{eqnarray*}
979: n\mathbb{P}_n
980: \left\{\left(\ind\{\zeta_{n,u}<Y \le \zeta_0\}
981: - \ind\{\zeta_0<Y \le \zeta_{n,u}\}\right)
982: \left[l_2^{\psi_0}(V,\delta,Z)-l_1^{\psi_0}(V,\delta,Z)\right]\right\}.
983: \end{eqnarray*}
984: \end{theorem}
985:
986: Let $Q_n(u)=Q_n^+(u)\ind\{u>0\}-Q_n^-(u)\ind\{u<0\}$.
987: We now study the weak convergence of $Q_n$ as a random variable on the
988: space of cadlag functions $D$ with the Skorohod topology, and on
989: its restriction to the space $D_M$ of cadlag functions on $[-M,
990: M]$, for any $M > 0$, similar to the approach taken in \cite{p03}.
991: In order to describe the asymptotic distribution of $Q_n$,
992: let $\nu^+$ and $\nu^-$ be two independent jump processes on~$\mathbb{R}$
993: such that $\nu^+ (s)$ is a Poisson variable with parameter
994: $s^+\tilde{h}(\zeta_0)$ and $\nu^- (s)$ is a Poisson variable with
995: parameter $(-s)^+\tilde{h}(\zeta_0)$. Here,
996: $u^+$ denotes $u\vee 0$. Let $(\check{V}_k^+)_{k\ge 1}$
997: and $(\check{V}_k^-)_{k \ge 1}$ be independent sequences of i.i.d.
998: random variables with characteristic functions
999: \[\phi^+(t)=P\left[e^{it\check{V}_k^+}\right]
1000: =P\left[\left.e^{it\left\{l_1^{\psi_0}(V,\delta,Z)-l_2^{\psi_0}(V,\delta,Z)
1001: \right\}}\right|Y=\zeta_0^+\right],\]
1002: and
1003: \[\phi^-(t)=P\left[e^{it\check{V}_k^-}\right]
1004: =P\left[\left.e^{it\left\{l_1^{\psi_0}(V,\delta,Z)-l_2^{\psi_0}(V,\delta,Z)
1005: \right\}}\right|Y=\zeta_0\right],\]
1006: respectively, where $(\check{V}_k^+)_{k \ge 1}$ and
1007: $(\check{V}_k^-)_{k \ge 1}$ are independent of $\nu^+$ and $\nu^-$.
1008:
1009: Let $Q(s) = Q^+(s)\ind\{s > 0\} - Q^-(s)\ind\{s < 0\}$ be the
1010: right-continuous jump process defined by
1011: \[Q^+(s)=\sum_{0 \le k \le {\nu}^+(s)}\check{V}_k^+, ~ ~ ~
1012: Q^-(s)=\sum_{0 \le k \le {\nu}^-(s+)}\check{V}_k^-, \]
1013: where $\check{V}_0^+=\check{V}_0^-=0$.
1014: Using a modification of the arguments in \cite{p03}, we obtain:
1015: \begin{theorem}\label{t4}
1016: Under the regularity conditions of section~2,
1017: the process $Q_n$ converges weakly to $Q$ in $D_M$, for every $M > 0$;
$n(\hat{\zeta}_n - \zeta_0) = \argmax_{u}Q_n(u) + o_P(1)$ which
converges weakly to $\hat{v}_Q \equiv \argmin\{|v| : Q(v) =
\max\,Q\}$; and $n(\hat{\zeta}_n-\zeta_0)$ and
1021: $\sqrt{n}\pp_nU_{\zeta_0}^{\tau}(\psi_0)(h)$ are asymptotically
1022: independent for all $h\in{\cal H}_{\infty}$.
1023: \end{theorem}
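
As an informal illustration of why $\hat{v}_Q$ is well behaved (this
assumes, beyond what is stated above, that $\check{V}_1^+$ and
$\check{V}_1^-$ have finite means, that $\tilde{h}(\zeta_0)>0$, and that
$\psi(\gamma_0,\Gamma_0)=\psi_0$), note that $Q^+$ and $Q^-$ are compound
Poisson processes, so that
\[\Exp{Q(s)}=s\,\tilde{h}(\zeta_0)\Exp{\check{V}_1^+}\;\;\mbox{for}\;s>0,
\qquad
\Exp{Q(s)}=-|s|\,\tilde{h}(\zeta_0)\Exp{\check{V}_1^-}\;\;\mbox{for}\;s<0.\]
By the definitions of $\phi^+$ and $\phi^-$, the two means are
$\dot{X}_{\zeta_0}^{+}(\gamma_0,\Gamma_0)$ and
$\dot{X}_{\zeta_0}^{-}(\gamma_0,\Gamma_0)$, which are negative and positive,
respectively, by lemma~\ref{l6}. The mean of $Q$ therefore decreases
linearly in $|s|$ on both sides of zero, which suggests that the maximum
of $Q$ is attained within a bounded (random) distance of the origin.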
1024:
1025: \subsection{Asymptotic normality of the regular parameters} We use
1026: Hoffmann-J{\o}rgensen weak convergence as described in \cite{vw96}.
1027: We have the
1028: following result:
1029: \begin{theorem}\label{t5}
Under the conditions of theorem~\ref{t1}, $\sqrt{n}(\hat \psi_n -
1031: \psi_0)$ is asymptotically linear, with influence function $\tilde
1032: l(h) = U_{\zeta_0}^{\tau}(\psi_0)(\sigma_{\theta_0}^{-1}(h))$, $h
1033: \in{\cal H}_1$, converging weakly in the uniform norm to a tight, mean
1034: zero Gaussian process $\mathbb{Z}$ with covariance
$E[\tilde l(g) \tilde l(h)]$, for all $g, h \in {\cal H}_1$. Thus
1036: $n(\hat{\zeta}_n-\zeta_0)$ and $\sqrt{n}(\hat{\psi}_n-\psi_0)$
1037: are asymptotically independent.
1038: \end{theorem}
1039:
1040: \begin{remark}\label{r1}
1041: Since $\sqrt{n}(\hat \psi_n - \psi_0)$ is asymptotically linear,
1042: with influence function contained in
1043: the closed linear span of the tangent space (since
1044: $\sigma_{\theta_0}$ is continuously invertible), $\hat\psi_n$ is
1045: regular and hence as efficient as if $\zeta_0$ were known, by
1046: Theorem 5.2.3 and Theorem 5.2.1 of \cite{bkrw98}.
1047: \end{remark}
1048:
1049: \section{Inference when $\alpha_0\neq 0$ or $\eta_0\neq 0$}
1050: In this section we develop Monte Carlo methods for inference for the parameter estimators when
1051: it is known that either $\alpha_0\neq 0$ or $\eta_0\neq 0$, i.e., it is known that condition~C2
1052: is satisfied. In section~9,
1053: we develop a hypothesis testing procedure to assess whether
1054: $H_0:\alpha_0=0=\eta_0$ holds (i.e., that~C2 does not hold). When it is known that $H_0$ holds,
1055: the model reduces to the usual transformation model
1056: (see \cite{sv04}),
1057: and thus validity of the bootstrap will follow from arguments
1058: similar to those used in the proof of
1059: corollary~1 of \cite{klf04}.
1060:
1061: \subsection{Inference for the change-point} One possibility for
1062: inference for $\zeta$ is to use the subsampling bootstrap \cite{pr94}
1063: which is guaranteed to work, provided the subsample
1064: sizes $\ell_n$ satisfy $\ell_n\rightarrow\infty$ and $\ell_n/n\rightarrow 0$.
However, this approach is computationally intensive since,
1066: for each subsample, the likelihood must be maximized over the entire
1067: parameter space. To ameliorate the computational strain, we propose as
1068: an alternative the following specialized parametric bootstrap.
1069: Let $\tilde{F}_+$ and $\tilde{F}_-$ be the
distribution functions corresponding to the characteristic functions
1071: $\phi^+$ and $\phi^-$, respectively. We need to make the following
1072: additional assumption:
1073: \begin{enumerate}
1074: \item[B5:] Both $\tilde{F}_+$ and $\tilde{F}_-$ are continuous.
1075: \end{enumerate}
Now let $\tilde{m}_n$ be the smaller of the number of $Y$ observations
in the sample exceeding $\hat{\zeta}_n$ and the number falling below $\hat{\zeta}_n$. Next
choose sequences of possibly data dependent integers $1\leq C_{1,n}<C_{2,n}\leq \tilde{m}_n$
1079: such that $C_{1,n}\rightarrow\infty$,
1080: $C_{2,n}-C_{1,n}\rightarrow\infty$, and
1081: $C_{2,n}/n\rightarrow 0$, in probability, as $n\rightarrow\infty$.
1082: Note that if one
1083: chooses $C_{1,n}$ to be the closest integer to $\tilde{m}_n^{1/4}$ and $C_{2,n}$ to be
1084: the closest integer to $\tilde{m}_n^{3/4}$,
1085: the given requirements will be satisfied since
1086: $\tilde{m}_n\rightarrow\infty$, in probability, by assumption~B1. Let
1087: $X_{(1)},\ldots,X_{(n)}$ be the complete data observations corresponding
1088: to the order statistics $Y_{(1)},\ldots,Y_{(n)}$ of the $Y$ observations.
1089: Also let $\tilde{k}_n\equiv C_{2,n}-C_{1,n}+1$, and define $\tilde{l}_n$
1090: to be the integer satisfying $\hat{\zeta}_n=Y_{(\tilde{l}_n)}$.
1091: The existence of this integer follows from the form of the MLE.
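
The suggested choice of $C_{1,n}$ and $C_{2,n}$ can be checked directly (a
brief verification in which rounding to the nearest integer, which changes
each quantity by at most $1/2$, is ignored). With
$C_{1,n}=\tilde{m}_n^{1/4}$ and $C_{2,n}=\tilde{m}_n^{3/4}$, and using
$\tilde{m}_n\leq n$,
\[C_{2,n}-C_{1,n}=\tilde{m}_n^{1/4}\left(\tilde{m}_n^{1/2}-1\right)
\;\;\mbox{and}\;\;
\frac{C_{2,n}}{n}\leq\frac{n^{3/4}}{n}=n^{-1/4}.\]
Hence $C_{1,n}$ and $C_{2,n}-C_{1,n}$ diverge whenever $\tilde{m}_n$ does,
while $C_{2,n}/n\rightarrow 0$.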
1092:
1093: Now, for $j=1,\ldots,\tilde{k}_n$, and any $\psi\in\Psi$, define
1094: \begin{eqnarray*}
1095: \check{V}_{j,\psi}^+&\equiv& l_1^{\psi}(
1096: V_{(\tilde{l}_n+C_{1,n}+j-1)},\delta_{(\tilde{l}_n+C_{1,n}+j-1)},Z_{(\tilde{l}_n+C_{1,n}+j-1)})\\
1097: &&-l_2^{\psi}(V_{(\tilde{l}_n+C_{1,n}+j-1)},\delta_{(\tilde{l}_n+C_{1,n}+j-1)},
1098: Z_{(\tilde{l}_n+C_{1,n}+j-1)}),\\
1099: \check{V}_{j,\psi}^-&\equiv&l_1^{\psi}(
1100: V_{(\tilde{l}_n-C_{1,n}-j)},\delta_{(\tilde{l}_n-C_{1,n}-j)},Z_{(\tilde{l}_n-C_{1,n}-j)})\\
1101: &&-l_2^{\psi}(V_{(\tilde{l}_n-C_{1,n}-j)},\delta_{(\tilde{l}_n-C_{1,n}-j)},
1102: Z_{(\tilde{l}_n-C_{1,n}-j)}),
1103: \end{eqnarray*}
1104: $Y^+_j\equiv Y_{(\tilde{l}_n+C_{1,n}+j-1)}$, and $Y^-_j\equiv Y_{(\tilde{l}_n-C_{1,n}-j)}$. Also let $\hat{F}_+^n$ be
1105: the data-dependent distribution function
1106: for a random variable drawn with replacement from
1107: $\{\check{V}_{1,\hat{\psi}_n}^+,\ldots,\check{V}_{\tilde{k}_n,
1108: \hat{\psi}_n}^+\}$, and let $\hat{F}_-^n$ be
1109: the data-dependent distribution function
1110: for a random variable drawn with replacement from
1111: $\{\check{V}_{1,\hat{\psi}_n}^-,\ldots,$ $\check{V}_{\tilde{k}_n,
1112: \hat{\psi}_n}^-\}$. By the smoothness
1113: of the terms involved, it is easy to verify
1114: that both $\sup_{1\leq j\leq\tilde{k}_n}$ $\left|\check{V}_{j,\hat{\psi}_n}^+
1115: -\check{V}_{j,\psi_0}^+\right|=o_P(1)$ and
1116: $\sup_{1\leq j\leq\tilde{k}_n}\left|\check{V}_{j,\hat{\psi}_n}^-
1117: -\check{V}_{j,\psi_0}^-\right|=o_P(1)$. Moreover, by assumption~B2(i), the
1118: fact that $n(\hat{\zeta}_n-\zeta_0)=O_P(1)$, and the conditions on $C_{1,n}$
1119: and $C_{2,n}$, we have that both $P(Y^-_{1}<\zeta_0<Y^+_{1})\rightarrow 1$ and
1120: $Y^+_{\tilde{k}_n}-Y^-_{\tilde{k}_n}=o_P(1)$. Thus, by assumption~B2(ii),
1121: the collection $\{\check{V}^+_{1,\psi_0},\ldots,$
1122: $\check{V}^+_{\tilde{k}_n,\psi_0}\}$ converges
1123: in distribution to an i.i.d. sample of random
1124: variables with characteristic function
1125: $\phi^+$, while the collection $\{\check{V}^-_{1,\psi_0},\ldots,
1126: \check{V}^-_{\tilde{k}_n,\psi_0}\}$ is
1127: independent of the first collection and converges
1128: in distribution to an i.i.d. sample of random
1129: variables with characteristic function $\phi^-$. By assumption~B5
1130: and the fact that $\tilde{k}_n\rightarrow\infty$, in probability,
1131: we now have that both
1132: $\sup_{v\in\re}|\hat{F}_+^n(v)-\tilde{F}_+(v)|=o_P(1)$ and
1133: $\sup_{v\in\re}|\hat{F}_-^n(v)-\tilde{F}_-(v)|=o_P(1)$.
1134:
1135: Now let $\hat{h}_n$ be a consistent estimator of $\tilde{h}(\zeta_0)$.
1136: Such an estimator can be obtained from a kernel density estimator of
1137: $\tilde{h}$ based on the $Y$ observations and evaluated at $\hat{\zeta}_n$.
1138: The basic idea of our parametric bootstrap is to create a stochastic
1139: process $\hat{Q}_n$ defined similarly to the process $Q$ described
1140: in section~7.1. To this end,
1141: let $\hat{\nu}^+$ and $\hat{\nu}^-$ be two independent jump processes
1142: defined on the interval $\tilde{B}_n\equiv
1143: [-n(\hat{\zeta}_n-a),n(b-\hat{\zeta}_n)]$
1144: such that $\hat{\nu}^+(s)$ is Poisson with parameter $s^+\hat{h}_n$
1145: and $\hat{\nu}^-(s)$ is Poisson with parameter $(-s)^+\hat{h}_n$.
1146: Also let $(\check{V}_{\ast,k}^+)_{k\geq 1}$ and
1147: $(\check{V}_{\ast,k}^-)_{k\geq 1}$ be two independent sequences of
1148: i.i.d. random variables drawn from $\hat{F}_+^n$ and $\hat{F}_-^n$
1149: and independent of the Poisson processes. Now construct
1150: $u\mapsto\hat{Q}_n(u)\equiv\hat{Q}_n^+(u)\ind\{u>0\}
1151: -\hat{Q}_n^-(u)\ind\{u<0\}$ on the interval $\tilde{B}_n$,
1152: where $\hat{Q}_n^+(u)\equiv
1153: \sum_{0\leq k\leq\hat{\nu}^+(u)}\check{V}_{\ast,k}^+$ and
1154: $\hat{Q}_n^-(u)\equiv\sum_{0\leq k\leq\hat{\nu}^-(u+)}\check{V}_{\ast,k}^-$.
Finally, we compute $\hat{v}_{\ast}\equiv\argmin_{\tilde{B}_n}\left\{|v|:
\hat{Q}_n(v)=\max_{\tilde{B}_n}\hat{Q}_n\right\}$.
1157: The following proposition now follows from
the fact that $P(K\subset\tilde{B}_n)\rightarrow 1$ for all
1159: compact $K\subset\re$:
1160: \begin{proposition}\label{p1}
1161: The conditional distribution of $\hat{v}_{\ast}$ given the data is
1162: asymptotically equal to the distribution of $\hat{v}_Q$ defined
1163: in theorem~\ref{t4}.
1164: \end{proposition}
1165:
Hence for any $0<\pi<1$, we can consistently estimate the
$\pi/2$ and $1-\pi/2$ quantiles of $\hat{v}_Q$ based
on a large number of independent draws from $\hat{v}_{\ast}$,
which estimates we will denote by $\hat{q}_{\pi/2}$ and
$\hat{q}_{1-\pi/2}$, respectively. Since $\hat{v}_Q$ is the weak limit of
$n(\hat{\zeta}_n-\zeta_0)$ by theorem~\ref{t4}, an asymptotically
valid $1-\pi$ confidence interval for $\zeta_0$ is
$[\hat{\zeta}_n-\hat{q}_{1-\pi/2}/n,\hat{\zeta}_n-\hat{q}_{\pi/2}/n]$.
1173:
1174: \subsection{Inference for regular parameters} Because $\hat{\zeta}_n$
1175: is $n$-consistent for $\zeta_0$, $\zeta_0$ can be treated as known
1176: in constructing inference for the regular parameters.
1177: Accordingly, we propose bootstrapping the likelihood and maximizing
1178: over $\psi$ while holding $\zeta$ fixed at $\hat{\zeta}_n$. This will
1179: significantly reduce the computational demands of the bootstrap.
1180: Also, to avoid the occurrence of ties during resampling,
1181: we suggest the following weighted bootstrap alternative
1182: to the usual nonparametric bootstrap. First generate
1183: $n$ i.i.d. positive random variables $\kappa_1,\ldots,\kappa_n$,
1184: with mean $0<\mu_{\kappa}<\infty$, variance
1185: $0<\sigma_{\kappa}^2<\infty$, and with
1186: $\int_0^{\infty}\sqrt{P(\kappa_1>u)}du<\infty$. Divide each weight
1187: by the sample average of the weights $\bar{\kappa}$, to obtain
1188: ``standardized weights'' $\kappa_1^{\circ},\ldots,\kappa_n^{\circ}$
1189: which sum to~$n$. For a real, measurable function $f$, define the
1190: weighted empirical measure $\pp_n^{\circ}f\equiv n^{-1}
1191: \sum_{i=1}^n\kappa_i^{\circ}f(X_i)$. Recall that the nonparametric bootstrap
1192: empirical measure $\pp_n^{\bullet}f\equiv n^{-1}\sum_{i=1}^n
1193: \kappa_i^{\bullet}f(X_i)$ uses multinomial weights
1194: $\kappa_1^{\bullet},\ldots,\kappa_n^{\bullet}$,
1195: where $\Exp{\kappa_i^{\bullet}}=1$, $i=1,\ldots,n$, and
1196: $\sum_{i=1}^n\kappa_i^{\bullet}=n$ almost surely.
1197:
1198: The proposed weighted bootstrap estimate $\hat{\psi}_n^{\circ}$
1199: is obtained by maximizing $\tilde{L}_n^{\circ}(\psi,\hat{\zeta}_n)$ over
1200: $\psi\in\Psi$, where $\tilde{L}_n^{\circ}$ is obtained by replacing
1201: $\pp_n$ with $\pp_n^{\circ}$ in the definition of $\tilde{L}_n$
from section~3. We can similarly define a modified nonparametric
1203: bootstrap $\hat{\psi}_n^{\bullet}$ as the $\argmax$ of
1204: $\psi\mapsto\tilde{L}_n^{\bullet}(\psi,\hat{\zeta}_n)$, where
1205: $\tilde{L}_n^{\bullet}$ is obtained by replacing $\pp_n$ with
1206: $\pp_n^{\bullet}$ in the definition of $\tilde{L}_n$. The following
1207: corollary establishes the validity of both kinds
1208: of bootstraps:
1209: \begin{corollary}\label{c1}
1210: Under the conditions of theorem~\ref{t5}, the conditional
1211: bootstrap of $\hat{\psi}_n$, based on either
1212: $\hat{\psi}_n^{\bullet}$ or $\hat{\psi}_n^{\circ}$,
1213: is asymptotically consistent for
1214: the limiting distribution $\mathbb{Z}$ in the following sense:
1215: Both $\sqrt{n}(\hat{\psi}_n^{\bullet}-\hat{\psi}_n)$ and
1216: $\sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})(\hat{\psi}_n^{\circ}
1217: -\hat{\psi}_n)$ are asymptotically measurable, and both
1218: \begin{enumerate}
1219: \item[(i)] $\sup_{g\in BL_1}\left|E_{\bullet}g\left(
1220: \sqrt{n}(\hat{\psi}_n^{\bullet}-\hat{\psi}_n)\right)
1221: -Eg(\mathbb{Z})\right|\rightarrow 0$ in outer probability and
1222: \item[(ii)] $\sup_{g\in BL_1}\left|E_{\circ}g\left(
1223: \sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})
1224: (\hat{\psi}_n^{\circ}-\hat{\psi}_n)\right)
1225: -Eg(\mathbb{Z})\right|\rightarrow 0$ in outer probability,
1226: \end{enumerate}
1227: where $BL_1$ is the space of functions mapping
1228: $\re^{d+q+1}\times\ell^{\infty}[0,\tau]\mapsto\re$ which are
1229: bounded in absolute value by~1 and have Lipschitz norm $\leq 1$.
1230: Here, $E_{\bullet}$ and $E_{\circ}$ are expectations that are
1231: taken over the multinomial and standardized weights, respectively,
1232: conditional on the data.
1233: \end{corollary}
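
As an informal illustration of how corollary~\ref{c1} might be used
(a sketch only, assuming that $\sqrt{n}(\hat{\psi}_n-\psi_0)$ converges
weakly to $\mathbb{Z}$ under theorem~\ref{t5} and that the relevant
marginal of $\mathbb{Z}$ has a continuous distribution function),
consider a real-valued component $\psi_j$ of the Euclidean part of $\psi$.
Let $\hat{q}_{p}^{\circ}$ denote the $p$th quantile, conditional on the
data, of $(\mu_{\kappa}/\sigma_{\kappa})(\hat{\psi}_{n,j}^{\circ}
-\hat{\psi}_{n,j})$; the symbols $\psi_j$ and $\hat{q}_{p}^{\circ}$
are introduced here for illustration only. Part~(ii) then suggests the
approximate $(1-\pi)$-level interval
\[\left[\hat{\psi}_{n,j}-\hat{q}_{1-\pi/2}^{\circ},\;
\hat{\psi}_{n,j}-\hat{q}_{\pi/2}^{\circ}\right],\]
since the event $\hat{q}_{\pi/2}^{\circ}\leq\hat{\psi}_{n,j}-\psi_{0,j}
\leq\hat{q}_{1-\pi/2}^{\circ}$ has probability approximately $1-\pi$.
In practice, the quantiles would themselves be approximated by
Monte Carlo over repeated draws of the weights.
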
1234:
1235: \begin{remark}\label{r2}
1236: As discussed in remark~15 of \cite{klf04}, the
1237: choice of weights $\kappa_1,\ldots,\kappa_n$ in this kind of
setting does not affect the first order asymptotics. However,
1239: it may have an effect on finite samples. In our experience, we
1240: have found that both exponential and truncated exponential weights
1241: perform quite well.
1242: \end{remark}
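
To make the scaling factor in corollary~\ref{c1} concrete, the following
is a simple illustration, assuming (as the notation suggests) that
$\mu_{\kappa}$ and $\sigma_{\kappa}$ denote the mean and standard
deviation of the unstandardized weight $\kappa_1$. For standard
exponential weights, $\mu_{\kappa}=\sigma_{\kappa}=1$, so that
$\mu_{\kappa}/\sigma_{\kappa}=1$ and no rescaling is needed. For gamma
weights with shape $a>0$ and unit mean (a hypothetical choice used here
only for illustration, and only relevant if such weights satisfy the
conditions of section~8.2),
\[\mu_{\kappa}=1,\;\;\;\;\sigma_{\kappa}=a^{-1/2},\;\;\;\;
\frac{\mu_{\kappa}}{\sigma_{\kappa}}=\sqrt{a},\]
so that the bootstrap differences $\hat{\psi}_n^{\circ}-\hat{\psi}_n$
would need to be multiplied by $\sqrt{a}$ before being used to
approximate the sampling distribution of $\hat{\psi}_n-\psi_0$.
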
1243:
1244: \section{Test for the presence of a change-point}
1245: Constructing a valid
1246: test of the null hypothesis that there is no change-point,
1247: $H_0:\alpha_0=0=\eta_0$, poses an interesting challenge.
1248: Since the location of the change-point is no longer identifiable
1249: under $H_0$, this is an example of the issue studied in
1250: \cite{a01}. The test statistic we propose is
1251: a functional of the $\alpha$ and $\eta$ components of the score process,
1252: $\zeta\mapsto
1253: \hat{S}_{1}(\zeta)\equiv\sqrt{n}\pp_n
1254: (U^{\tau}_{\zeta, 1}(\hat \psi_0),
1255: U^{\tau}_{\zeta,2}(\hat \psi_0)')'$, where $\zeta\in[a,b]$,
1256: $\hat\psi_0 \equiv (0,0, \hat\beta_0,\hat A_0)$,
1257: and where $(\hat{\beta}_0,\hat{A}_0)$
1258: is the restricted MLE of $(\beta_0, A_0)$ under the
1259: assumption that $\alpha=0$ and $\eta=0$. This MLE is relatively easy to
1260: compute since estimation of $\zeta$ is not needed. Specifically,
we have from section~3 that $\hat{\psi}_0$ is the maximizer of
1262: \begin{eqnarray}
1263: \psi&\mapsto&\pp_n\left\{\delta\log(n\Delta A(V))+l_1^{\psi}(V,\delta,Z)
1264: \right\}.\label{s9.e1}
1265: \end{eqnarray}
1266: We also define for future use
1267: $h\mapsto\hat{S}_{2}(h)\equiv\sqrt{n}\pp_n
1268: (U^{\tau}_{\zeta,3}(\hat\psi_0)(h_3),U^{\tau}_{\zeta,4}(\hat\psi_0)(h_4))'$,
1269: where $h\in{\cal H}_1$. The statistic we propose using is $\hat T_n\equiv\sup
1270: _{\zeta \in [a, b]}\left\{\hat{S}_{1}'(\zeta)\hat V_n^{-1}(\zeta)\right.$
1271: $\left.\times\hat{S}_{1} (\zeta)\right\}$, where
1272: $\hat V_n (\zeta)$ is a consistent
1273: estimator of the covariance of $\hat{S}_{1}(\zeta)$.
1274:
There are several reasons for us to consider the sup functional of score
statistics instead of Wald or likelihood ratio statistics. First, the score
statistic is much less computationally intensive, which makes the bootstrap
implementation feasible. Second, we choose the sup functional because of
its guarantee to have some power under local alternatives, as argued in
1280: \cite{d87} and which we prove below. We note, however, that \cite{ap94}
1281: argue that certain weighted averages of score statistics are optimal
1282: tests in some settings. A careful analysis of the relative merits of
1283: the two approaches in our setting is beyond the scope of the current paper
1284: but is an interesting topic for future research. However, as a step in
1285: this direction, we will compare $\hat{T}_n$ with the integrated statistic
1286: $\tilde{T}_n\equiv\int_{[a,b]}\left\{\hat{S}_{1} ' (\zeta)
1287: \hat V_n^{-1}(\zeta) \hat{S}_{1} (\zeta)\right\}d\zeta$.
1288:
1289: In this section, we first discuss a Monte Carlo technique which
1290: enables computation of $\hat{V}_n(\zeta)$, so that
1291: $\hat{T}_n$ and $\tilde{T}_n$ can be calculated in the first place,
1292: as well as computation of critical values for hypothesis testing.
1293: We then discuss the asymptotic properties of the statistics
1294: under a sequence of contiguous alternatives so that power can be
1295: verified. Specifically, we assume that all the conditions of section~2
hold except for C2, which we replace with
1297: \begin{enumerate}
1298: \item[C2':] For each $n\geq 1$, $\alpha_0=\alpha_{\ast}/\sqrt{n}$ and
1299: $\eta_0=\eta_{\ast}/\sqrt{n}$, for some fixed $\alpha_{\ast}\in\re$
1300: and $\eta_{\ast}\in\re^q$. The joint distribution of $(C,Z,Y)$ does
1301: not change with $n$.
1302: \end{enumerate}
1303: Note that when $\alpha_{\ast}\neq 0$ or $\eta_{\ast}\neq 0$,
1304: condition~C2' will cause the distribution of the failure time $T$,
1305: given the covariates $(Z,Y)$, to change with $n$, and the
1306: value of $\zeta_0$ will affect this distribution.
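
To see the effect of condition~C2' concretely, recall the form of the
regression function implicit in~(\ref{c12:e2}), namely
$r_{\xi}(t;Z,Y)=\beta'Z(t)+(\alpha+\eta'Z_2(t))\ind\{Y>\zeta\}$.
Under~C2', this becomes
\[r_{\xi_0}(t;Z,Y)=\beta_0'Z(t)+n^{-1/2}\left(\alpha_{\ast}
+\eta_{\ast}'Z_2(t)\right)\ind\{Y>\zeta_0\},\]
so that the change-point contribution shrinks at the rate $n^{-1/2}$.
As an illustration in the proportional hazards special case (a sketch,
assuming the hazard is proportional to $e^{r_{\xi_0}(t;Z,Y)}$), the
hazard ratio comparing $Y>\zeta_0$ with $Y\leq\zeta_0$ at a fixed
covariate value is
\[\exp\left\{n^{-1/2}\left(\alpha_{\ast}+\eta_{\ast}'Z_2(t)\right)\right\}
=1+n^{-1/2}\left(\alpha_{\ast}+\eta_{\ast}'Z_2(t)\right)+O(n^{-1}).\]
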
1307:
1308: \subsection{Monte Carlo computation and inference}
1309: While the nonparametric bootstrap may be a reasonable approach,
1310: it is unclear how to verify its theoretical properties in this context.
1311: We will use instead the weighted bootstrap, based on the multipliers
1312: $\kappa_1^{\circ},\ldots,\kappa_n^{\circ}$ defined in section~8.2.
1313: Let $\pp_n^{\circ}$ be the corresponding weighted empirical measure,
1314: and define $\hat{\psi}_0^{\circ}$ to be the maximizer of~(\ref{s9.e1})
1315: after replacing $\pp_n$ with $\pp_n^{\circ}$. Also let
1316: $\hat{S}_1^{\circ}(\zeta)\equiv\sqrt{n}\pp_n^{\circ}(U_{\zeta,1}^{\tau}
1317: (\hat{\psi}_0^{\circ}),U_{\zeta,2}^{\tau}(\hat{\psi}_0^{\circ})')'$.
Note that the same sample of weights
$\kappa_1^{\circ},\ldots,\kappa_n^{\circ}$ is used for computing
1320: both $\hat{\psi}_0^{\circ}$ and the process
1321: $\{\hat{S}_1^{\circ}(\zeta),\zeta\in[a,b]\}$, so that the proper dependence
1322: between the score statistic and $\hat{\psi}_0$ will be captured.
1323: The structure of the set-up
1324: only requires considering values of $\zeta$ in the set
1325: $\{Y_{(1)},\ldots,Y_{(n)}\}\cap[a,b]$,
1326: since $\zeta\mapsto\hat{S}_{1}^{\circ}(\zeta)$
1327: does not change over the intervals $[Y_{(j)},Y_{(j+1)})$, $1\leq j\leq n-1$.
1328: Now repeat the bootstrap
1329: procedure a large number of times $\tilde{M}_n$, to obtain
1330: the bootstrapped score processes $\hat{S}_{1,1}^{\circ},\ldots,
1331: \hat{S}_{1,\tilde{M}_n}^{\circ}$. Note that we are allowing
1332: the number of bootstraps to depend on~$n$. Define
1333: $\zeta\mapsto\hat{\mu}_n(\zeta)\equiv\tilde{M}_n^{-1}\sum_{k=1}^{\tilde{M}_n}
1334: \hat{S}_{1,k}^{\circ}(\zeta)$ and let
1335: \[\zeta\mapsto\hat{V}_n(\zeta)
1336: =\tilde{M}_n^{-1}\sum_{k=1}^{\tilde{M}_n}\left\{
1337: \hat{S}_{1,k}^{\circ}(\zeta)
1338: -\hat{\mu}_n(\zeta)\right\}\left\{
1339: \hat{S}_{1,k}^{\circ}(\zeta)
1340: -\hat{\mu}_n(\zeta)\right\}'.\]
1341: Now we can compute the test statistics $\hat{T}_n$ and $\tilde{T}_n$
1342: with this choice for $\hat{V}_n$.
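
The computation of $\hat{T}_n$ and $\tilde{T}_n$ can be sketched as
follows, assuming that both $\hat{S}_1$ and $\hat{V}_n$ share the
step-function property noted above for $\hat{S}_1^{\circ}$; the notation
$Q_n$ and $\zeta_{(l)}$ is introduced here for illustration only.
Let $Q_n(\zeta)\equiv\hat{S}_1'(\zeta)\hat{V}_n^{-1}(\zeta)
\hat{S}_1(\zeta)$, let $\zeta_{(1)}<\cdots<\zeta_{(m)}$ be the distinct
values of $Y_1,\ldots,Y_n$ lying in $(a,b]$, and set $\zeta_{(0)}\equiv a$
and $\zeta_{(m+1)}\equiv b$. Since $Q_n$ is then constant on each
interval $[\zeta_{(l)},\zeta_{(l+1)})$, we have
\[\hat{T}_n=\max_{0\leq l\leq m}Q_n(\zeta_{(l)}),\;\;\;\;
\tilde{T}_n=\sum_{l=0}^{m}Q_n(\zeta_{(l)})
\left(\zeta_{(l+1)}-\zeta_{(l)}\right),\]
so that both statistics require at most $n+1$ evaluations of $Q_n$.
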
1343:
1344: To estimate critical values,
1345: we compute the standardized bootstrap test statistics
1346: $\hat{T}_{n,k}^{\circ}\equiv\sup_{\zeta\in[a,b]}\left\{
1347: \left[\hat{S}_{1,k}^{\circ}(\zeta)-\hat{\mu}_n(\zeta)\right]'
1348: \hat{V}_n^{-1}(\zeta)\left[
1349: \hat{S}_{1,k}^{\circ}(\zeta)-\hat{\mu}_n(\zeta)\right]\right\}$ and
1350: $\tilde{T}_{n,k}^{\circ}\equiv\int_{[a,b]}\left\{
1351: \left[\hat{S}_{1,k}^{\circ}(\zeta)-\hat{\mu}_n(\zeta)\right]'
1352: \hat{V}_n^{-1}(\zeta)\left[
1353: \hat{S}_{1,k}^{\circ}(\zeta)-\hat{\mu}_n(\zeta)\right]\right\}d\zeta$,
1354: for $1\leq k\leq\tilde{M}_n$. For a test of size $\pi$, we compare
1355: the test statistics with the $(1-\pi)$th quantile of the
1356: corresponding $\tilde{M}_n$ standardized bootstrap statistics.
1357: The reason we subtract off the sample mean when computing
1358: the bootstrapped test statistics is to make sure that we
1359: are approximating the null distribution even when the
1360: null hypothesis may not be true. What is a little unusual about this
1361: procedure is that the bootstrap must be performed before the
1362: statistics $\hat{T}_n$ and $\tilde{T}_n$ can be calculated in
the first place. We also reiterate that we are assuming the
covariates $Z_i(\cdot)$ are observed at all time points
$V_j\leq V_i$ for which $\delta_j=1$. As noted in section~2, we are
aware that this is not necessarily valid in practice. As pointed out by
a referee, this is an important issue, and it would be worth investigating
1368: whether the bootstrap weighting scheme could be
1369: modified to perform and account
1370: for imputation of the missing covariate values. Nevertheless, this issue
1371: is beyond the scope of this paper and we do not pursue it further here.
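
The critical value comparison described earlier in this subsection can
be expressed equivalently through a Monte Carlo $p$-value. The following
is a sketch using one common empirical quantile convention and assuming
no ties, with $\hat{p}_n$ a symbol introduced here for illustration. Let
\[\hat{p}_n\equiv\tilde{M}_n^{-1}\sum_{k=1}^{\tilde{M}_n}
\ind\left\{\hat{T}_{n,k}^{\circ}\geq\hat{T}_n\right\},\]
and reject $H_0$ when $\hat{p}_n<\pi$. For example, with
$\tilde{M}_n=250$ and $\pi=0.05$, the critical value is the
$\lceil 0.95\times 250\rceil=238$th smallest bootstrap statistic.
If $\hat{T}_n$ exceeds it, then at most $12$ bootstrap statistics are
at least as large as $\hat{T}_n$ and $\hat{p}_n\leq 12/250=0.048$;
otherwise at least $13$ are, and $\hat{p}_n\geq 13/250=0.052$.
The same construction applies to $\tilde{T}_n$ with
$\tilde{T}_{n,k}^{\circ}$ in place of $\hat{T}_{n,k}^{\circ}$.
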
1372:
1373: \subsection{Asymptotic properties} In this section we establish
1374: the asymptotic validity of the proposed test procedure.
1375: Let $P$ denote the fixed probability distribution under the null
1376: hypothesis $H_0$, and let $P_n$ be the sequence of probability
1377: distributions under the contiguous sequence of
1378: alternatives $H_1^n$ defined in C2'. Note that $P$ and $P_n$
1379: can be equal if $\alpha_{\ast}=0=\eta_{\ast}$.
1380: We need to study the
1381: proposed procedure under general $P_n$ to determine both its size under
1382: the null and its power under the alternative.
1383: We will use the notation $\weakpn$ to denote
1384: weak convergence under $P_n$. We need the following
1385: lemmas and theorem:
1386:
1387: \begin{lemma}\label{s9.l1}
1388: The sequence of probability measures $P_n$ satisfies
1389: \begin{eqnarray}
1390: \label{c8.e1}\\
1391: \int \left[ \sqrt{n}(d P_n ^{1/2} - d P ^{1/2})-
1392: \frac{1}{2}\left(U_{\zeta_0,1}^{\tau}(\psi_0^{\ast})(\alpha_{\ast})
1393: +U_{\zeta_0,2}^{\tau}(\psi_0^{\ast})(\eta_{\ast})\right)dP^{1/2}
1394: \right]^2 \rightarrow 0,\nonumber
1395: \end{eqnarray}
1396: where $\psi_0^{\ast}\equiv(0,0,\beta_0,A_0)$.
1397: \end{lemma}
1398:
1399: \begin{lemma}\label{s9.l2}
1400: $\|\hat{\psi}_0-\psi_0^{\ast}\|_{\infty}
1401: \rightarrow 0$ in probability under $P_n$.
1402: \end{lemma}
1403:
1404: \begin{theorem}\label{s9.t1}
1405: Under the conditions of section~2, with condition C2 replaced by
1406: C2', $\hat{S}_1$ converges
under $P_n$ in distribution in $\ell^{\infty}([a,b]^{q+1})$ to the
1408: $(q+1)$-vector process $\zeta\mapsto\mathbb{Z}_{\ast}(\zeta)
1409: +\nu_{\ast}(\zeta)$,
1410: where $\mathbb{Z}_{\ast}$ is a tight, mean zero Gaussian
1411: $(q+1)$-vector process with
1412: $\mbox{cov}[\mathbb{Z}_{\ast}(\zeta_1),\mathbb{Z}_{\ast}(\zeta_2)]
1413: =\Sigma_{\ast}(\zeta_1,\zeta_2)\equiv
1414: \sigma_{\ast}^{11}(\zeta_1\vee\zeta_2)-\sigma_{\ast}^{12}(\zeta_1)
1415: [\sigma_{\ast}^{22}]^{-1}\sigma_{\ast}^{21}(\zeta_2)$, for all
1416: $\zeta_1,\zeta_2\in[a,b]$, where, for each $\zeta\in[a,b]$,
1417: \begin{eqnarray*}
1418: \nu_{\ast}(\zeta)&\equiv&\left\{\sigma_{\ast}^{11}
1419: (\zeta\vee\zeta_0)
1420: -\sigma_{\ast}^{12}(\zeta)[\sigma_{\ast}^{22}]^{-1}
1421: \sigma_{\ast}^{21}(\zeta_0)\right\}
1422: \left(\begin{array}{c}\alpha_{\ast}\\ \eta_{\ast}\end{array}\right),\\
1423: \sigma_{\ast}^{11}(\zeta)
1424: &\equiv&\left(\begin{array}{cc}\sigma_{\psi_0^{\ast},\zeta}^{11}
1425: &\sigma_{\psi_0^{\ast},\zeta}^{12}\\ \\ \sigma_{\psi_0^{\ast},\zeta}^{21}
1426: &\sigma_{\psi_0^{\ast},\zeta}^{22}\end{array}\right),\;\;\;\;
1427: \sigma_{\ast}^{12}(\zeta)\;\;\equiv\;\;\left(
1428: \begin{array}{cc}\sigma_{\psi_0^{\ast},\zeta}^{13}
1429: &\sigma_{\psi_0^{\ast},\zeta}^{14}\\ \\ \sigma_{\psi_0^{\ast},\zeta}^{23}
1430: &\sigma_{\psi_0^{\ast},\zeta}^{24}\end{array}\right),\\
1431: \sigma_{\ast}^{21}(\zeta)&\equiv&\left(
1432: \begin{array}{cc}\sigma_{\psi_0^{\ast},\zeta}^{31}
1433: &\sigma_{\psi_0^{\ast},\zeta}^{32}\\ \\ \sigma_{\psi_0^{\ast},\zeta}^{41}
1434: &\sigma_{\psi_0^{\ast},\zeta}^{42}\end{array}\right),\;\;\;\;
1435: \sigma_{\ast}^{22}\;\;\equiv\;\;\left(
1436: \begin{array}{cc}\sigma_{\psi_0^{\ast},\zeta_0}^{33}
1437: &\sigma_{\psi_0^{\ast},\zeta_0}^{34}\\ \\ \sigma_{\psi_0^{\ast},\zeta_0}^{43}
1438: &\sigma_{\psi_0^{\ast},\zeta_0}^{44}\end{array}\right),
1439: \end{eqnarray*}
1440: and where $\sigma_{\theta}^{jk}$, for $1\leq j,k\leq 4$, is as defined
1441: in section~5.2.
1442: \end{theorem}
1443:
1444: The following is the main result on the limiting distribution of the
1445: test statistics. For the remainder of this section, we require
1446: condition~B4 to hold. As will be shown in the proof of corollary~\ref{c2},
1447: condition~B4 implies that $V_{\ast}(\zeta)\equiv\Sigma_{\ast}(\zeta,\zeta)$
1448: is positive definite
1449: for all $\zeta\in[a,b]$. Note that we will establish consistency of
1450: $\hat{V}_n$ after we verify the validity of the proposed bootstrap.
1451: \begin{corollary}\label{c2}
1452: Assume~B4 holds and $\hat{V}_n(\zeta)\rightarrow V_{\ast}(\zeta)$
1453: in probability under $P_n$, uniformly over $\zeta\in[a,b]$.
1454: Then $\hat{T}_n\weakpn\sup_{\zeta\in[a,b]}\left\{
1455: \left[\mathbb{Z}_{\ast}(\zeta)+\nu_{\ast}(\zeta)\right]'\right.$
1456: $\left.\times V_{\ast}^{-1}(\zeta)
1457: \left[\mathbb{Z}_{\ast}(\zeta)+\nu_{\ast}(\zeta)\right]\right\}$
1458: and $\tilde{T}_n\weakpn\int_{[a,b]}\left\{
1459: \left[\mathbb{Z}_{\ast}(\zeta)+\nu_{\ast}(\zeta)\right]'
1460: V_{\ast}^{-1}(\zeta)
1461: \left[\mathbb{Z}_{\ast}(\zeta)\right.\right.$\newline
$\left.\left.+\nu_{\ast}(\zeta)\right]\right\}d\zeta$.
1463: Thus the limiting null distributions of $\hat{T}_n$ and
1464: $\tilde{T}_n$ are
1465: $\hat{\mathbb{T}}_{\ast}\equiv\sup_{\zeta\in[a,b]}$ $\left\{
1466: \mathbb{Z}_{\ast}'(\zeta)V_{\ast}^{-1}(\zeta)
1467: \mathbb{Z}_{\ast}(\zeta)\right\}$ and
1468: $\tilde{\mathbb{T}}_{\ast}\equiv\int_{[a,b]}\left\{
1469: \mathbb{Z}_{\ast}'(\zeta)V_{\ast}^{-1}(\zeta)
1470: \mathbb{Z}_{\ast}(\zeta)\right\}d\zeta$, respectively.
1471: \end{corollary}
1472:
1473: \begin{remark}\label{r3}
1474: Note that $\nu_{\ast}(\zeta_0)$ equals the matrix
1475: $\Sigma_{\ast}(\zeta_0,\zeta_0)$ times $(\alpha_{\ast},\eta_{\ast}')'$.
1476: By arguments in the proof of lemma~\ref{l4}, we know that
1477: $\Sigma_{\ast}(\zeta_0,\zeta_0)$ is positive definite. Thus
1478: $\nu_{\ast}(\zeta_0)$ will be strictly nonzero whenever
1479: $(\alpha_{\ast},\eta_{\ast}')'\neq 0$. Thus both $\hat{T}_n$
1480: and $\tilde{T}_n$ will have power to reject~$H_0$ under
1481: strictly non-null contiguous alternatives $H_1^n$.
1482: \end{remark}
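
The power claim in remark~\ref{r3} can be quantified pointwise. The
following is a supplementary calculation, with $c_{\ast}$ and
$\lambda_{\ast}$ introduced here as shorthand. Write
$c_{\ast}\equiv(\alpha_{\ast},\eta_{\ast}')'$. For each fixed
$\zeta\in[a,b]$, $\mathbb{Z}_{\ast}(\zeta)$ is mean zero Gaussian with
covariance $V_{\ast}(\zeta)$, so that
$\left[\mathbb{Z}_{\ast}(\zeta)+\nu_{\ast}(\zeta)\right]'
V_{\ast}^{-1}(\zeta)\left[\mathbb{Z}_{\ast}(\zeta)
+\nu_{\ast}(\zeta)\right]$ has a noncentral chi-square distribution
with $q+1$ degrees of freedom and noncentrality parameter
$\lambda_{\ast}(\zeta)\equiv\nu_{\ast}'(\zeta)V_{\ast}^{-1}(\zeta)
\nu_{\ast}(\zeta)$. At $\zeta=\zeta_0$, using
$\nu_{\ast}(\zeta_0)=V_{\ast}(\zeta_0)c_{\ast}$ and the symmetry
of $V_{\ast}(\zeta_0)$,
\[\lambda_{\ast}(\zeta_0)=c_{\ast}'V_{\ast}(\zeta_0)
V_{\ast}^{-1}(\zeta_0)V_{\ast}(\zeta_0)c_{\ast}
=c_{\ast}'V_{\ast}(\zeta_0)c_{\ast},\]
which is strictly positive whenever $c_{\ast}\neq 0$ because
$V_{\ast}(\zeta_0)$ is positive definite.
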
1483:
1484: The following theorem is the first step in establishing the validity
1485: of the bootstrap. For brevity, we will use the notation
1486: $\weakpnboot$ to denote conditional convergence of the bootstrap,
1487: either weakly in the sense of corollary~\ref{c1} or in probability,
1488: but under $P_n$ rather than $P$.
1489: \begin{theorem}\label{s9.t2}
1490: Under the conditions of theorem~\ref{s9.t1},
1491: $\hat{S}_1^{\circ}-\hat{S}_1\;\weakpnboot\;\mathbb{Z}_{\ast}$
1492: in $\ell^{\infty}([a,b]^{q+1})$.
1493: \end{theorem}
1494:
1495: The following corollary yields the desired consistency of $\hat{V}_n$
1496: and the validity of the proposed bootstrap for obtaining
1497: critical values. Define
1498: $\hat{\mathbb{F}}(u)\equiv\tilde{M}_n^{-1}\sum_{k=1}^{\tilde{M}_n}
1499: \ind\left\{\hat{T}_{n,k}^{\circ}\leq u\right\}$ and
1500: $\tilde{\mathbb{F}}(u)\equiv\tilde{M}_n^{-1}\sum_{k=1}^{\tilde{M}_n}
1501: \ind\left\{\tilde{T}_{n,k}^{\circ}\leq u\right\}$.
1502: \begin{corollary}\label{c3}
1503: There exists a sequence $\tilde{M}_n\rightarrow\infty$, as
1504: $n\rightarrow\infty$, such that
1505: $\hat{V}_n\weakpn\Sigma_{\ast}$, $\hat{V}_n\weakpnboot\Sigma_{\ast}$,
1506: and both $\sup_{u\in\re}\left|
1507: \hat{\mathbb{F}}(u)-P\left\{\hat{\mathbb{T}}_{\ast}\leq u\right\}\right|
1508: \weakpnboot 0$ and $\sup_{u\in\re}$ $\left|
1509: \tilde{\mathbb{F}}(u)-P\left\{\tilde{\mathbb{T}}_{\ast}\leq u\right\}\right|
1510: \weakpnboot 0$.
1511: \end{corollary}
1512:
1513: \section{Implementation and simulation study}
1514: We have implemented the proposed estimation
1515: and inference procedures for both the proportional hazards and proportional
1516: odds models. The maximum likelihood estimates were computed using the
1517: profile likelihood $pL_{n}(\zeta)$ defined in section~4. A line search
over the order statistics of $Y$ was used to maximize over $\zeta$,
while Newton's method was used to maximize over $\psi$.
1520: The stationary point equation~(\ref{c4:e2}) can be
1521: used to profile over $A$ for each value of $\zeta$ and $\gamma$. In
1522: our experience, the computational time of the entire procedure is
1523: reasonable. A thorough simulation study to validate the moderate
1524: sample size performance of this procedure and the proposed bootstrap
1525: procedures of section~8 is underway and will be presented elsewhere.
1526:
1527: Because of the unusual form of
1528: the statistical tests proposed in section~9, we feel it is worthwhile
1529: at this point to present a small simulation study evaluating their
1530: moderate sample size performance. Both the proportional hazards and
1531: proportional odds models were considered. A single time-independent
1532: covariate with
1533: a standard normal distribution was used, so that $d=q=1$, and the
1534: change-point $Y$ also had a standard normal distribution. The
1535: parameter values were set at $\zeta_0=0$, $\alpha_0=0$, $\beta_0=1$,
1536: $\eta_0\in\{0,-0.5,-1,-2,-3\}$, and $A_0(t)=t$. The range of
1537: $\eta_0$ values includes the null hypothesis $H_0$ (when $\eta_0=0$)
1538: and several alternative hypotheses. The censoring time was
1539: exponentially distributed with rate $0.1$ and truncated at 10.
1540: This resulted in a censoring rate of about 25\%. The sample
1541: size for each simulated data set was 300. For each simulated
1542: data set, 250 bootstraps were generated with standard exponential
1543: weights truncated at 5, to compute $\hat{V}_n$ and the critical
1544: values for the two test statistics, $\hat{T}_n$ (the ``sup score
1545: test'') and $\tilde{T}_n$ (the ``mean score test''). The range
1546: for $\zeta$ was restricted to the inner 80\% of the $Y$ values.
1547: Each scenario was replicated 250 times.
1548:
1549: The results of the simulation study are presented in table~\ref{table1} on
1550: page~\pageref{table1}.
1551: The type~I error (the $\eta_0=0$ column) is quite close to
1552: the targeted 0.05 level, and the power increases with the magnitude
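of $\eta_0$. Also, the sup test is more powerful than the mean
test in all but one configuration (the proportional odds model with
$\eta_0=-1$, where the two are essentially equal), and notably so for
$\eta_0\in\{-2,-3\}$.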
1555: We also tried the nonparametric bootstrap and found that
1556: it did not work nearly as well. While it is difficult to make sweeping
generalizations from such a small numerical study, it appears that the
1558: proposed test statistics match the theoretical predictions and have
1559: reasonable power. More simulation studies into the properties of these
1560: statistics would be worthwhile, especially studies of the impact of
1561: time-dependent covariates.
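
As a rough check on the type~I error results, the following
back-of-the-envelope calculation (based only on the binomial variance
formula and the 250 replicates per configuration) may be helpful.
The Monte Carlo standard error of an estimated rejection probability $p$ is
$\sqrt{p(1-p)/250}$. At the nominal level $p=0.05$ this gives
\[\sqrt{\frac{0.05\times 0.95}{250}}\approx 0.0138,\]
so that estimates within roughly $0.05\pm 2(0.0138)$, or about
$(0.022,0.078)$, are consistent with the nominal level. All four
estimated type~I errors in table~\ref{table1} fall within this range.
The worst case over $p$ occurs at $p=0.5$, where the standard error is
$0.5/\sqrt{250}\approx 0.032$.
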
1562:
1563: \begin{table*}
1564: \caption{Results from the simulation study of the
1565: sup and mean score test statistics in
1566: the proportional hazards and proportional odds models.
1567: The sample size is 300, the level of
1568: censoring approximately 25\%, and the nominal type~I error
1569: is 0.05. 250 replicates were generated for each
1570: configuration. The parameters were set at
1571: $\zeta_0=0$, $\alpha_0=0$, $\beta_0=1$, and $A_0(t)=t$, with
1572: the value of $\eta_0$ varying. The worst-case Monte Carlo standard error
for the power estimates is $0.50/\sqrt{250}\approx 0.03$.}
1574: \label{table1}
1575: \begin{tabular}{c|c|c|c|c|c}
1576: \hline\hline
1577: \multicolumn{6}{c}{Proportional hazards model}\\ \hline
1578: \scriptsize{Sup score test statistic}& Null $\eta_0=0$ & $\eta_0=-0.5$
1579: & $\eta_0=-1$ & $\eta_0=-2 $ & $\eta_0=-3$ \\ \hline
1580: \scriptsize{mean}&5.078&5.590 &7.874 &13.524&35.507 \\ \hline
1581: \scriptsize{Standard Deviation}&2.728&2.859 &3.919 &6.992& 11.337 \\\hline
1582: \scriptsize{power}&0.044&0.076 &0.180 &0.536&0.980\\ \hline\hline
1583: \scriptsize{Mean score test statistic}& Null $\eta_0=0$
1584: & $\eta_0=-0.5$ & $\eta_0=-1$& $\eta_0=-2$ & $\eta_0=-3$ \\\hline
1585: \scriptsize{mean}&1.403 &1.694 &2.560 &5.412 & 5.529 \\ \hline
1586: \scriptsize{Standard Deviation} &1.206 &1.104 &1.597 &2.492 & 2.683\\ \hline
1587: \scriptsize{power}&0.040 &0.050 &0.120 &0.236 &0.304 \\ \hline \hline
1588: \multicolumn{6}{c}{Proportional odds model}\\ \hline
1589: \scriptsize{Sup score test statistic}& Null $\eta_0=0$ & $\eta_0=-0.5$
1590: & $\eta_0=-1$ & $\eta_0=-2 $ & $\eta_0=-3$ \\ \hline
\scriptsize{mean}&3.950&4.762 &5.693 & 8.327 &13.956 \\ \hline
\scriptsize{Standard Deviation}&2.390&1.610 & 1.255 &2.901 & 4.244\\ \hline
1594: \scriptsize{power}&0.043&0.068 &0.112 &0.364 &0.660 \\ \hline\hline
1595: \scriptsize{Mean score test statistic}& Null
1596: $\eta_0=0$ & $\eta_0=-0.5$ & $\eta_0=-1$& $\eta_0=-2$
1597: & $\eta_0=-3$ \\ \hline
1598: \scriptsize{mean}&1.177 &1.912 &2.848 &3.265 &4.349 \\ \hline
1599: \scriptsize{Standard Deviation} &0.946 & 1.078 &1.360 &1.498 &1.718 \\ \hline
1600: \scriptsize{power}&0.048 &0.056 &0.116 &0.167 &0.285 \\ \hline\hline
1601: \end{tabular}
1602: \end{table*}
1603:
1604: \section{Proofs}
1605:
1606: {\it Proof of lemma~\ref{l.v1}.} Verification of~D1 is straightforward.
1607: For~D2, we have for all $u\geq 0$,
1608: \[\left|\frac{\ddot{\Lambda}(u)}{\dot{\Lambda}(u)}\right|=
1609: \frac{\Exp{W^2e^{-uW}}}{\Exp{We^{-uW}}}\leq\frac{\Exp{W^2}}{\Exp{W}}<\infty.\]
1610: The second-to-last inequality requires some justification. Note that the
1611: probability measure $Qf(W)\equiv\Exp{f(W)W}/\Exp{W}$ is well-defined
1612: for functions $f$ bounded by $O(W^3)$ by the positivity of $W$ and the
1613: existence of a fourth moment. Now we have
1614: \[\frac{\Exp{W^2e^{-uW}}}{\Exp{We^{-uW}}}=\frac{Q[We^{-uW}]}
1615: {Q[e^{-uW}]}\leq Q[W]=\frac{\Exp{W^2}}{\Exp{W}},\]
1616: since $e^{-uW}$ uniformly down-weights larger values of $W$ and thus forces
1617: the left term of the inequality to be decreasing in~$u$. This proves
1618: the first part.
1619:
1620: For the second part, take $c_0=c$, and note that
1621: \[|u^c\Lambda(u)|=\Exp{u^ce^{-uW}}=\Exp{W^{-c}(uW)^ce^{-uW}}
1622: \leq k\Exp{W^{-c}},\]
1623: where $k=\sup_{x\geq 0}x^ce^{-x}=c^ce^{-c}<\infty$. Similarly,
1624: \[|u^{1+c}\dot{\Lambda}(u)|=\Exp{u^{1+c}We^{-uW}}
1625: =\Exp{W^{-c}(uW)^{1+c}e^{-uW}}\leq k'\Exp{W^{-c}},\]
1626: where $k'=\sup_{x\geq 0}x^{1+c}e^{-x}=(1+c)^{1+c}e^{-1-c}<\infty$.
1627: This concludes the proof.$\Box$
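
As a concrete check of the two bounds just derived, consider the
following example, which assumes (as the displays in the proof indicate)
that $\Lambda(u)=\Exp{e^{-uW}}$ and $G=-\log\Lambda$. Take $W$ to be
standard exponential, which gives $\Lambda(u)=(1+u)^{-1}$ and
$G(u)=\log(1+u)$, the proportional odds model. Then
$\Exp{We^{-uW}}=(1+u)^{-2}$ and $\Exp{W^2e^{-uW}}=2(1+u)^{-3}$, so that
\[\left|\frac{\ddot{\Lambda}(u)}{\dot{\Lambda}(u)}\right|
=\frac{2}{1+u}\leq 2=\frac{\Exp{W^2}}{\Exp{W}},\]
which is decreasing in $u$ with equality at $u=0$. For the second part,
any $0<c<1$ gives $\Exp{W^{-c}}=\Gamma(1-c)<\infty$, and indeed both
$u^c\Lambda(u)=u^c/(1+u)$ and
$|u^{1+c}\dot{\Lambda}(u)|=u^{1+c}/(1+u)^2$ are bounded over $u\geq 0$.
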
1628:
1629: {\it Proof of lemma~\ref{l1}.} Suppose that
1630: \begin{eqnarray}
\;\;\;\;G\left(\int_0^t \tilde{Y}(u)e^{r_{\xi}(u;Z,Y)}dA(u) \right) = G
\left(\int_0^t \tilde{Y}(u)e^{r_{\xi_0}(u;Z,Y)}dA_0(u)
1633: \right)~\label{c12:e1}
1634: \end{eqnarray}
1635: for all $t \in [0, \tau]$ almost surely under $P$. The target is
1636: to show that~(\ref{c12:e1}) implies that $\xi = \xi_0$ and $A =
1637: A_0$ on $[0, \tau]$. By condition~A1, (\ref{c12:e1})~implies
1638: \[\int_0^t e^{r_{\xi}(u;Z,Y)}dA(u)=
\int_0^t e^{r_{\xi_0}(u;Z,Y)}dA_0(u)\]
1640: for all $t \in [0, \tau]$ almost surely.
1641: Taking the Radon-Nikodym derivative of both
1642: sides with respect to $A_0$, and taking logarithms, we obtain
1643: \begin{eqnarray}
1644: &&\beta'Z(t)+(\alpha+\eta'Z_2(t))\ind\{Y>\zeta\}-\beta_0'Z(t)\label{c12:e2}\\
1645: &&\mbox{\hspace{1.5in}}
1646: -(\alpha_0+\eta_0'Z_2(t))\ind\{Y>\zeta_0\} +\log(\tilde{a}(t))=0,\nonumber
1647: \end{eqnarray}
1648: almost surely, where $\tilde{a} \equiv dA/dA_0$.
1649:
1650: Assume that $\zeta>\zeta_0$. Now choose $y<\zeta_0$ such
1651: that $y\in\tilde{V}(\zeta_0)$ and
1652: $\mbox{var}[Z(t_1)|Y=y]$ is positive definite, where $t_1$ is
1653: as defined in~B3. Note that this is possible by assumptions~B2
1654: and~B3. Conditioning the left-hand side of~(\ref{c12:e2})
1655: on $Y=y$ and evaluating at $t=t_1$ yields that $\beta=\beta_0$.
1656: Now choose $\zeta_0<y<\zeta$ such that $y\in\tilde{V}(\zeta_0)$ and
1657: $\mbox{var}[Z(t_2)|Y=y]$ is positive definite. Conditioning
1658: the left-hand side of~(\ref{c12:e2}) on $Y=y$, and evaluating
1659: at $t=t_2$ yields that $\eta_0=0$. Because the density of $Y$ is
1660: positive in $\tilde{V}(\zeta_0)$, we also see that $\alpha_0=0$.
1661: But this is not possible by condition~C2. A similar argument can
1662: be used to show that $\zeta<\zeta_0$ is impossible. Thus $\zeta=\zeta_0$.
1663: Now it is not hard to argue that condition~B3 forces
1664: $\beta=\beta_0$, $\eta=\eta_0$ and $\alpha=\alpha_0$.
1665: Hence $\log(\tilde{a}(t))=0$ for all $t\in[0,\tau]$,
1666: and the proof is complete.$\Box$
1667:
1668: {\it Proof of lemma~\ref{l2}.} Note that for each $n$, maximizing
1669: the log-likelihood
1670: over $A$ is equivalent to maximizing over a fixed number of parameters
1671: since the number of jumps $K\leq n$. Thus maximizing over the
1672: whole parameter $\theta$ involves maximizing an empirical average of
1673: functions that are smooth over $\psi$ and cadlag over $\zeta$.
1674: Note also that
1675: \[\|\hat{A}_n-A_0\|_{[0,\tau]}=\sum_{j=1}^{K}
1676: \left(\left|\hat{A}_n(T_j-)-A_0(T_j)\right|\vee
1677: \left|\hat{A}_n(T_j)-A_0(T_j)\right|\right),\]
1678: where $\|\cdot\|_B$ is the uniform norm over the set $B$,
1679: and thus $\|\hat{A}_n-A_0\|_{[0,\tau]}$ is measurable.
1680: Hence the uniform distance between
1681: $\hat{\theta}_n$ and $\theta_0$ is also measurable.
1682: Thus almost sure convergence of
1683: $\hat{\theta}_n$ is equivalent to outer almost sure convergence.
1684: Now we return to the proof. Assume
1685: \begin{eqnarray}
\limsup_{n \rightarrow \infty} \hat{A}_n(\tau) =
1687: \infty,\label{c12:e4}
1688: \end{eqnarray}
1689: with probability $>0$.
1690: We will show that this leads to a contradiction. It is now possible to
1691: choose a data sequence such that~(\ref{c12:e4}) holds and
1692: $\tilde{G}_n\equiv\pp_n N\rightarrow \tilde{G}_0\equiv P_0 N$
1693: uniformly, since the latter happens with probability~1. Fix one such
1694: sequence $\{n\}$, and define $\theta_n=(\xi_0,A_n)$, where $A_n=\tilde{G}_n$.
1695: Note that the log-likelihood difference,
1696: $\tilde{L}_n(\hat\theta_n)-\tilde{L}_n(\theta_n)$,
1697: should be non-negative for all $n$, since $\hat{\theta}_n$
1698: maximizes the log-likelihood. We are going to show that the
1699: difference is asymptotically negative under the
1700: assumption~(\ref{c12:e4}).
1701:
1702: Now choose a subsequence $\{n_k\}$ such that $\hat{A}_{n_k}(\tau)
1703: \rightarrow \infty$, as $k\rightarrow\infty$. We now have,
1704: for $c_0>0$ from assumption~D2, that
1705: $L_{n_k}(\hat{\theta}_{n_k})-L_{n_k}(\theta_{n_k})$
1706: \begin{eqnarray}
1707: \mbox{\hspace{0.7in}} &\le& O(1)
1708: +\mathbb{P}_{n_k} \delta \left[\log
1709: \left(n_k \Delta \hat{A}_{n_k}(V) \right)+\log\left(
-\dot{\Lambda}(H^{\hat{\theta}_{n_k}}(V))\right)\right]\nonumber\\
&&-\mathbb{P}_{n_k}(1-\delta)G(H^{\hat{\theta}_{n_k}}(V))\nonumber\\
&\leq& O(1)+\mathbb{P}_{n_k} \delta\log
\left(n_k \Delta \hat{A}_{n_k}(V) \right)
-\mathbb{P}_{n_k}(\delta+c_0)\log\hat{A}_{n_k}(V),\label{c12:e5}
1715: \end{eqnarray}
1716: since, for all $u>0$,
1717: $\log\dot{G}(u)=\log[-\dot{\Lambda}(u)]-\log[\Lambda(u)]$;
1718: $\log[-\dot{\Lambda}(u)]=\log[-u^{1+c_0}$
1719: $\dot{\Lambda}(u)]-(1+c_0)\log(u)
1720: \leq O(1)-(1+c_0)\log(u)$ by condition~D2; and since
1721: $\log\Lambda(u)=\log[u^{c_0}\Lambda(u)]-c_0\log(u)\leq O(1)-c_0\log(u)$
1722: also by condition~D2.
1723:
1724: Next we take a partition of $[0, \tau]$, $0=v_0<v_1<\cdots<v_M=\tau$,
1725: for some finite $M$. The right hand side of~(\ref{c12:e5}) is now
1726: dominated by
1727: \begin{eqnarray}\label{c12:e6}
1728: O(1)+\log \hat{A}_{n_k}(\tau) \mathbb{P}_{n_k} \left(\delta
1729: \ind\{V \in [v_{M-1}, \infty] \} - (\delta + c_0)\ind\{V \in
[\tau, \infty] \} \right)&&\\
1731: +\sum_{m=1}^{M-1}\log \hat{A}_{n_k}(v_m) \mathbb{P}_{n_k}
1732: \left(\delta \ind\{V \in [v_{m-1}, v_m] \} - (\delta + c_0)
1733: \ind\{V \in [v_{m}, v_{m+1}] \} \right).&&\nonumber
1734: \end{eqnarray}
1735: For a fixed constant $c > 1$, we can choose this partition such that
1736: \begin{eqnarray*}
1737: P_0N(\tau)\ind\{V \in [v_{M-1}, \infty]\} = P_0[N(\tau) +
1738: c_0/c]\ind\{V \in [\tau, \infty] \},
1739: \end{eqnarray*}
and, for $m = 1,\ldots,M-1$,
1741: \begin{eqnarray*}
1742: P_0N(\tau)\ind\{V \in [v_{m-1}, v_m]\} = P_0[N(\tau) +
1743: c_0/c]\ind\{V \in [v_m, v_{m+1}] \}.
1744: \end{eqnarray*}
1745: Recalling that $\tilde{G}_n\rightarrow \tilde{G}_0$ uniformly, we
1746: obtain that~(\ref{c12:e6}) tends to $-\infty$ as $k \rightarrow \infty$,
1747: which is the intended contradiction. Thus, $\limsup_{n
1748: \rightarrow \infty} \hat{A}_n(\tau) < \infty$ almost surely.$\Box$
1749:
1750: {\it Proof of theorem~\ref{t1}.}
1751: By the opening arguments in the proof of lemma~\ref{l2}, we have that
1752: outer almost sure convergence is equivalent to the usual almost
1753: sure convergence in this instance. Note that $\{\hat{A}_n(\tau)\}$ is bounded
1754: almost surely, $\tilde{G}_n\rightarrow\tilde{G}_0$ almost
1755: surely, and the class
1756: \[{\cal F}_{(k)}\equiv\left\{W(t;\theta):t\in[0,\tau],\xi\in{\cal X},
1757: A\in{\cal A}_{(k)}\right\},\]
1758: where ${\cal A}_{(k)}\equiv\{A\in{\cal A}:A(\tau)\leq k\}$,
1759: is Donsker (and hence also Glivenko-Cantelli) for every $k<\infty$
1760: by lemma~\ref{l.t1.1} below. By similar arguments to those used
1761: in lemma~\ref{l.t1.1}, we have that the class
1762: $\{G(H^{\theta}(V)):\xi\in{\cal X},A\in{\cal A}_{(k)}\}$ is also
1763: Glivenko-Cantelli for all $k<\infty$. We therefore
1764: have the following with probability~1:
1765: $\{\hat{A}_n(\tau)\}$ is bounded asymptotically, $\tilde{G}_n\rightarrow
1766: \tilde{G}_0$ uniformly, $(\mathbb{P}_n-P)W(\cdot;\hat{\theta}_n)
1767: \rightarrow 0$ uniformly, and $(\mathbb{P}_n-P)\left[
1768: G(H^{\hat{\theta}_n}(V))-G(H^{\theta_n}(V))\right]\rightarrow 0$.
1769: Now, fix a sequence $\{n\}$ for which these last four asymptotic events
1770: hold. We can now use the Helly selection theorem to find a
1771: subsequence $\{n_k\}$ and a function $A$ such that
1772: $\hat{A}_{n_k}(t) \rightarrow A(t)$ for all $t \in [0, \tau]$ at
1773: which $A$ is continuous. From~(\ref{c4:e2}), we obtain
1774: \[|\hat{A}_{n_k}(s)-\hat{A}_{n_k}(t)| \le
1775: O(1)\mathbb{P}_{n_k}|N(s)-N(t)|\rightarrow O(1)|\tilde{G}_0(s)
1776: -\tilde{G}_0(t)|,\]
1777: for all $s,t \in [0, \tau]$. Since $\tilde{G}_0$ is continuous by
1778: condition~C3, we know that $A$ must be continuous on all of $[0,\tau]$.
1779: Thus $\hat{A}_{n_k} \rightarrow A$ uniformly.
1780: Without loss of generality, we can also assume that along this
1781: subsequence $\hat{\xi}_{n_k} \rightarrow \xi$ for some $\xi \in
1782: {\cal X}\equiv\Upsilon\times B_1\times B_2\times(a,b)$. Denote
1783: $\theta=(\xi,A)$.
1784:
1785: Consider now $\theta_n \equiv (\xi_0, A_n)$, where
1786: \begin{eqnarray*}
1787: A_n(t) \equiv \int_0^t \frac{d\tilde{G}_n(u)}{PW(u;
1788: \theta_0)}.
1789: \end{eqnarray*}
1790: We can use the same technique as in the derivation
1791: of~(\ref{c4:e2}) to show that $A_0$ satisfies
1792: \begin{eqnarray*}
1793: A_0(t)\equiv\int_0^t \frac{d\tilde{G}_0(u)}{PW(u;\theta_0)},
1794: \end{eqnarray*}
1795: for all $t\in[0,\tau]$. Thus $A_{n_k}\rightarrow A_0$ uniformly,
1796: as $k\rightarrow\infty$. At this point, we have
1797: \begin{eqnarray*}
1798: 0&\leq&\tilde{L}_{n_k}(\hat{\theta}_{n_k})-\tilde{L}_{n_k}(\theta_{n_k})\\
1799: &=&\int_0^{\tau}\log\left[\frac{PW(u;\theta_0)}
1800: {\mathbb{P}_{n_k}W(u;\hat{\theta}_{n_k})}\right]
1801: d\tilde{G}_{n_k}(u)-\mathbb{P}_{n_k}\left[G(H^{\hat{\theta}_{n_k}}(V))-
1802: G(H^{\theta_{n_k}}(V))\right]\\
1803: &\rightarrow&\int_0^{\tau}\log\frac{dA(u)}{dA_0(u)}d\tilde{G}_0(u)
1804: -P\left[G(H^{\theta}(V))-G(H^{\theta_0}(V))\right]\\
1805: &=&\int\log\frac{dP_{\theta}}{dP}dP\\
1806: &\leq&0.
1807: \end{eqnarray*}
1808: But this forces $\theta=\theta_0$ by the identifiability of
1809: the model as given in lemma~\ref{l1}. Thus all convergent
1810: subsequences of $\hat{\theta}_n$,
1811: on a set of probability~1, converge to $\theta_0$. The desired
1812: result now follows.$\Box$
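
{\it Remark.} The final inequality in the preceding display is the usual
information inequality. As a brief sketch, let $dP_{\theta}/dP$ denote the
density of the part of $P_{\theta}$ that is absolutely continuous with
respect to $P$, so that $\int(dP_{\theta}/dP)dP\leq 1$. Jensen's inequality
applied to the concave function $u\mapsto\log u$ then gives
\[\int\log\frac{dP_{\theta}}{dP}\,dP\leq\log\int\frac{dP_{\theta}}{dP}\,dP
\leq\log 1=0,\]
with equality throughout only if $dP_{\theta}/dP=1$ almost surely
under $P$, that is, only if $P_{\theta}=P$.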
1813:
1814: \begin{lemma}\label{l.t1.1}
For every $k<\infty$, the class
1816: ${\cal F}_{(k)}\equiv\left\{W(t;\theta):t\in[0,\tau],\xi\in{\cal X},
1817: \right.$ $\left.A\in{\cal A}_{(k)}\right\}$,
1818: is $P$-Donsker.
1819: \end{lemma}
1820:
1821: {\it Proof.} Routine arguments can be used to establish that
1822: the class ${\cal F}_1\equiv
1823: \{e^{r_{\xi}(t;Z,Y)}:t\in[0,\tau],\xi\in{\cal X}\}$ is
1824: Donsker. Consider the map
1825: \[h\in D[0,\tau]\mapsto
1826: \left\{\int_0^th(s)dA(s):t\in[0,\tau],A\in{\cal A}_{(k)}\right\}
1827: \in\ell^{\infty}([0,\tau]\times{\cal A}_{(k)}),\]
1828: and note that it is uniformly equicontinuous and linear.
1829: Thus the class
1830: \[{\cal F}_2\equiv\left\{\int_0^t e^{r_{\xi}(s;Z,Y)}dA(s):
1831: t\in[0,\tau],\xi\in{\cal X},A\in{\cal A}_{(k)}\right\}\]
1832: is Donsker by the continuous mapping theorem.
1833: Now condition~D1 ensures that both $\dot{G}$ and
1834: $\ddot{G}/\dot{G}$ are Lipschitz on compacts. This fact,
1835: combined with the facts that sums of Donsker classes
1836: are Donsker and products of bounded Donsker classes are
Donsker, yields the desired result.$\Box$
1838:
1839: {\it Proof of lemma~\ref{l3}.} By the smoothness assumed in~D1 of the involved
1840: derivatives, we have for each $\zeta\in[a,b]$ and $\psi^{\ast}\in\Psi$,
1841: \[\lim_{t\downarrow 0}\sup_{h^{\ast}\in\mbox{lin}\,\Psi:
1842: \rho_1(h^{\ast})\leq 1}
1843: \sup_{h\in{\cal H}_r}\left|\int_0^1 h^{\ast}
1844: \left(\sigma_{\psi^{\ast}+st h^{\ast}}(h)-\sigma_{\psi^{\ast}}(h)\right)ds
1845: \right|=0.\]
1846: Thus, $\sup_{h\in{\cal H}_r}\left|PU_{\zeta}^{\tau}(\psi^{\ast}+h^{\ast})(h)
1847: -PU_{\zeta}^{\tau}(\psi^{\ast})(h)+h^{\ast}\left(\sigma_{\psi^{\ast}}(h)
1848: \right)\right|=o(\rho_1(h^{\ast}))$, as $\rho_1(h^{\ast})\rightarrow 0$.$\Box$
1849:
1850: {\it Proof of lemma~\ref{l4}.}
1851: First note that for any $h=(h_1,h_2,h_3,h_4)\in{\cal H}_{\infty}$,
1852: $\sigma_{\theta_0}(h)=\mathbb{A}(h)+\mathbb{B}(h)$, where
1853: $\mathbb{A}(h)=\left(h_1,h_2,h_3,g_0h_4\right)$,
$\mathbb{B}(h)=\sigma_{\theta_0}(h)-\mathbb{A}(h)$,
1855: and $g_0(u)=P\left[\tilde{Y}(u)e^{r_{\xi_0}(u;Z,Y)}
1856: \hat{\Xi}_{\theta_0}^{(0)}(\tau)\right]$. It is not hard to verify that
1857: since $g_0$ is bounded below, $\mathbb{A}$
1858: is one-to-one and onto with continuous
1859: inverse defined by $\mathbb{A}^{-1}(h)=(h_1,h_2,h_3,h_4/g_0)$.
1860: It is also not hard to
1861: verify that the operator $\mathbb{B}$
1862: is compact as an operator on ${\cal H}_r$
for any $0<r<\infty$. Thus the first part of the lemma is proved
1864: by lemma~25.93 of \cite{v98}, if
1865: we can show that $\sigma_{\theta_0}$ is one-to-one. This will then imply
1866: that for each $r>0$, there is an $s>0$ with
1867: $\sigma_{\theta_0}^{-1}({\cal H}_s)\subset{\cal H}_r$. Now we have
1868: \[\inf_{\psi\in\mbox{lin}\,\Psi}
1869: \frac{\|\psi(\sigma_{\theta_0}(\cdot))\|_{(r)}}
1870: {\|\psi\|_{(r)}}\geq \inf_{\psi\in\mbox{lin}\,\Psi}
1871: \frac{\sup_{h\in\sigma_{\theta_0}^{-1}({\cal H}_s)}
1872: |\psi(\sigma_{\theta_0}(h))|}{\|\psi\|_{(r)}}=
1873: \inf_{\psi\in\mbox{lin}\,\Psi}\frac{\|\psi\|_{(s)}}{\|\psi\|_{(r)}}\]
1874: $\geq s/(4r)$, since $\|\psi\|_{(r)}\leq 4(r/s)\|\psi\|_{(s)}$.
Thus $\psi\mapsto\psi(\sigma_{\theta_0}(\cdot))$ is continuously invertible
1876: on its range by proposition~A.1.7 of \cite{bkrw98}. That it is also onto with
1877: inverse $\psi\mapsto\psi(\sigma_{\theta_0}^{-1})$ follows from
1878: $\sigma_{\theta_0}$ being onto. All that remains is verifying that
1879: $\sigma_{\theta_0}$ is one-to-one.
1880:
Let $h \in \mathcal{H}_{\infty}$ be such that $\sigma_{\theta_0}(h)=0$.
1882: For the one-dimensional submodel defined by the map $s \rightarrow\psi_{0s}
1883: \equiv \psi_0 + s(h_1, h_2, h_3, \int_0^{(\cdot)}h_4(u)dA_0(u))$, we have
1884: \begin{eqnarray}
1885: P \{ \frac{ \partial}{\partial s}L_1(\psi_{0s},\zeta_0)|_{s=0}\}^2 = P
1886: \{U^{\tau}_{\zeta_0}(\psi_0)(h)\}^2=0.~\label{c12:e9}
1887: \end{eqnarray}
1888: Define the random set
1889: $\mathcal{S}(n,\tilde{y},t) \equiv \{(N,\tilde{Y}): N(u) = n(u),
1890: \tilde{Y}(u) = \tilde{y}(u), u \in [t, \tau] \}$.
1891: The equality~(\ref{c12:e9}) implies that
$P\{U_{\zeta_0}^{\tau}(\psi_0)(h)|\mathcal{S}(n,\tilde{y},t)\}^2=0$
for all $\mathcal{S}$ such that $P\{\mathcal{S}(n,\tilde{y},t)\} > 0$, which
1894: implies that $U^t_{\zeta_0}(\psi_0)(h)=0$ almost surely for all $t \in [0,
1895: \tau]$. Consider the set on which the observation $(X, \delta, Z,
1896: Y)$ is censored at a time $t \in [0, \tau]$. From (\ref{c12:e9})
1897: and the preceding argument,
1898: \begin{eqnarray}
1899: R_{\zeta_0,\psi_0}^t(h_1\ind(Y > \zeta_0) +
1900: h_2'Z_2(t)\ind(Y > \zeta_0)+h_3'Z(t)+h_4)=0.~\label{c12:e11}
1901: \end{eqnarray}
1902: Taking the Radon-Nikodym derivative of (\ref{c12:e11}) with respect to $A_0$
1903: and dividing throughout by $e^{r_{\xi_0}(t;Z,Y)}$ yields
1904: \begin{eqnarray}
1905: \tilde{Y}(t)(h_1\ind(Y > \zeta_0) +
1906: h_2'Z_2(t)\ind(Y > \zeta_0)+h_3'Z(t)+h_4(t))=0.~\label{c12:e12}
1907: \end{eqnarray}
1908: Arguments quite similar to those used in the proof of lemma~\ref{l1}
1909: can now be used to verify that~(\ref{c12:e12}) forces $h=0$.
1910: Hence $\sigma_{\theta_0}(h)=0$ implies $h=0$, and thus
1911: $\sigma_{\theta_0}$ is one-to-one.$\Box$
1912:
1913: {\it Proof of lemma~\ref{l5}.} For the first part, note that
1914: $t\mapsto\tilde{Y}(t)$ has total variation bounded by~1;
1915: and, by the model assumptions, the total variation of
$t\mapsto e^{r_{\xi}(t;Z,Y)}$ is bounded by a universal constant that does not
1917: depend on $\theta$. Thus there exists a universal constant $k_{\ast}$
1918: such that $\|\mathbb{P}_n W(\cdot;\hat{\theta}_n)\|_v\leq
1919: k_{\ast}\mathbb{P}_n|\hat{\Xi}^{(0)}_{\hat{\theta}_n}|$. By the smoothness of
1920: the functions involved, and the fact that $u\mapsto\log(u)$ is Lipschitz
1921: on compacts bounded above zero, we obtain the first result of the lemma.
1922: The consistency part follows from
1923: lemma~\ref{l.t1.1} combined with theorem~\ref{t1}, the
1924: continuity of $\theta\mapsto PW(\cdot;\theta)$, and reapplication
1925: of the Lipschitz continuity of $u\mapsto\log(u)$.$\Box$
1926:
1927: {\it Proof of lemma~\ref{l6}.} The right-hand derivative of
1928: $P(L_1(\psi,\zeta))$ with respect to $\zeta$ at $\zeta=\zeta_0$ is:
1929: $\left.(\partial^{+}/(\partial\zeta))
1930: P(L_1(\psi, \zeta))\right|_{\zeta=\zeta_0}$
1931: \begin{eqnarray*}
1932: &=&\int\left
1933: \{P[l_1^{\psi}(V,\delta,Z)|Y=y+]-P[l_2^{\psi}(V,\delta,Z)|Y=y+] \right \}
1934: \tilde{\delta}_{\zeta_0}(y)\tilde{h}(y)dy \\
1935: &=&\left(P[l_1^{\psi}(V,\delta,Z) |Y=\zeta_0+]-
1936: P[l_2^{\psi}(V,\delta,Z)|Y=\zeta_0+]\right)\tilde{h}(\zeta_0),
1937: \end{eqnarray*}
1938: where the superscript~$+$ denotes differentiating from the right
1939: and $\tilde{\delta}_{\zeta_0}(y)$ is the Dirac delta function
1940: assigning counting measure~1 to the event $\{y=\zeta_0\}$. Now,
1941: $P[l_1^{\psi}(V,\delta,Z)|Y=\zeta_0+]-
1942: P[l_2^{\psi}(V,\delta,Z)|Y=\zeta_0+]$
1943: \[=\int\left[ l_1^{\psi}(v,d,z)-l_2^{\psi}(v,d,z)\right]
1944: \ell_2(v,d,z)\ell_0^{+}(v,d,z)d\mu(v,d,z)\]
1945: $\equiv\tilde{R}^{+}(\psi)$,
1946: where $\ell_j(v,d,z)\equiv\exp\{l_j^{\psi_0}(v,d,z)\}$, for $j=1,2$;
1947: $\mu(v,d,z)$ is the dominating measure; and $\ell_0^{+}(v,d,z)$
1948: consists of the remaining components of the conditional distribution
1949: of $(V,\delta,Z)$ given $Y=\zeta_0+$. Note that under the model
1950: assumptions, $\ell_0^{+}$ does not depend on the parameters. Thus
1951: \begin{eqnarray*}
1952: \tilde{R}^{+}(\psi_0)&=&\int\left[ l_1^{\psi_0}(v,d,z)
1953: -l_2^{\psi_0}(v,d,z)\right]\ell_2(v,d,z)\ell_0^{+}(v,d,z)d\mu(v,d,z)\\
1954: &=&\int\log\left[\frac{\ell_1\ell_0^{+}}{\ell_2\ell_0^{+}}\right]
1955: \ell_2\ell_0^{+}d\mu
1956: \;\;<\;\;\log\int\left[\frac{\ell_1\ell_0^{+}}
1957: {\ell_2\ell_0^{+}}\right]\ell_2\ell_0^{+}d\mu\\
1958: &=&\log\int\ell_1(v,d,z)\ell_0^{+}(v,d,z)d\mu(v,d,z)\;\;=\;\;0,
1959: \end{eqnarray*}
1960: since the integral of a density is~1. Thus
1961: $\dot{X}_{\zeta_0}^{+}(\gamma_0,\Gamma_0)<0$.
1962:
1963: A similar argument is
1964: used for the left-hand derivative. In this case, the true density
1965: of $(V,\delta,Z)$ given $Y=\zeta_0$ is $\ell_1^{\psi_0}(v,d,z)
1966: \ell_0^{-}(v,d,z)$, where $\ell_0^{-}$ does not involve the parameters.
1967: We now have
1968: \begin{eqnarray*}
\lefteqn{P[l_1^{\psi_0}(V,\delta,Z)|Y=\zeta_0]-
P[l_2^{\psi_0}(V,\delta,Z)|Y=\zeta_0]}\mbox{\hspace{1.0cm}}&&\\
1971: &=&\int\left[ l_1^{\psi_0}(v,d,z)
-l_2^{\psi_0}(v,d,z)\right]\ell_1(v,d,z)\ell_0^{-}(v,d,z)d\mu(v,d,z)\\
1973: &=&-\int\log\left[\frac{\ell_2\ell_0^{-}}{\ell_1\ell_0^{-}}\right]
1974: \ell_1\ell_0^{-}d\mu
1975: \;\;>\;\;-\log\int\left[\frac{\ell_2\ell_0^{-}}{\ell_1\ell_0^{-}}\right]
1976: \ell_1\ell_0^{-}d\mu\\
&=&-\log\int\ell_2(v,d,z)\ell_0^{-}(v,d,z)d\mu(v,d,z)\;\;=\;\;0,
1978: \end{eqnarray*}
1979: and thus we conclude that $\dot{X}_{\zeta_0}^{-}(\gamma_0,\Gamma_0)>0$.$\Box$
1980:
1981: {\it Proof of lemma~\ref{l7}.} This follows from lemma~\ref{l6}, the local
1982: concavity of $\tilde{X}$, and the
1983: smoothness of the derivatives involved.$\Box$
1984:
1985: {\it Proof of lemma~\ref{l8}.}
1986: Note that $\tilde{X}_n(\zeta,\eta,\Gamma)$
1987: \[=\mathbb{P}_n\left[-\int_0^{\tau}\left\{\Gamma(t)-\Gamma_0(t)\right\}dN(t)
1988: +\tilde{W}(\zeta,\eta,A_n^{(\Gamma)})
1989: -\tilde{W}(\zeta_0,\eta_0,A_n^{(\Gamma_0)})\right],\]
1990: where
1991: $\tilde{W}(\zeta,\gamma,A)\equiv l_1^{\psi}(V,\delta,Z)\ind\{Y\leq\zeta\}
1992: +l_2^{\psi}(V,\delta,Z)\ind\{Y>\zeta\}$. The classes
1993: \[\left\{\int_0^{\tau}\left\{\Gamma(t)-\Gamma_0(t)\right\}dN(t):
1994: \|\Gamma-\Gamma_0\|_{\infty}\leq\epsilon,\|\Gamma\|_v\leq k_0\right\},\]
1995: for any $\epsilon>0$, and
1996: $\left\{\tilde{W}(\zeta,\lambda):(\zeta,\lambda)
1997: \in B_{\epsilon_2}^{\ast k_0}\right\}$, for some $\epsilon_2>0$,
1998: can be shown to be Donsker. That this holds for the second class
1999: follows from arguments similar to those used in the proof
2000: of lemma~\ref{l.t1.1}. For the first class, note that
2001: $\int_0^{\tau}\Gamma(t)dN(t)=\delta\Gamma(V)$. Since $\|\Gamma\|_v\leq k_0$,
2002: $\Gamma$ can be written as the
2003: difference between two monotone increasing functions,
2004: each with total variation bounded by $k_0$. By theorem~2.7.5 of \cite{vw96},
2005: the class of all monotone functions with a given compact range is universally
2006: Donsker. Since sums of Donsker classes are Donsker, we have that the class
2007: $\{\Gamma(V):\|\Gamma\|_v\leq k_0\}$ is Donsker. That the first class
2008: is Donsker now follows since products of bounded Donsker classes are Donsker.
2009: Since we also have that
2010: $\sqrt{n}(\tilde{G}_n-\tilde{G}_0)$ converges to a Gaussian process,
2011: we have that
2012: \[\sqrt{n}(\mathbb{P}_n-P)
2013: \left[-\int_0^{\tau}\left\{\Gamma(t)-\Gamma_0(t)\right\}dN(t)
2014: +\tilde{W}(\zeta,\eta,A_n^{(\Gamma)})
2015: -\tilde{W}(\zeta_0,\eta_0,A_n^{(\Gamma_0)})\right]\]
2016: converges weakly in $\ell^{\infty}(B_{\epsilon_2}^{\ast k_0})$
2017: to the tight Gaussian process
2018: \[\mathbb{G}\left[-\int_0^{\tau}\left\{\Gamma(t)-\Gamma_0(t)\right\}dN(t)
2019: +\tilde{W}(\zeta,\eta,A_0^{(\Gamma)})
2020: -\tilde{W}(\zeta_0,\eta_0,A_0^{(\Gamma_0)})\right],\]
2021: where $\mathbb{G}$ is the Brownian bridge measure.
2022:
2023: By the smoothness of the functions and derivatives involved,
2024: we also have
2025: $\sqrt{n}\left\{P\left[-\int_0^{\tau}\left\{\Gamma(t)-\Gamma_0(t)\right\}dN(t)
2026: +\tilde{W}(\zeta,\eta,A_n^{(\Gamma)})
2027: -\tilde{W}(\zeta_0,\eta_0,A_n^{(\Gamma_0)})\right]-\right.$
2028: \newline $\left.\tilde{X}(\zeta,\eta,\Gamma)\right\}\;\;=\;\;
2029: \sqrt{n}P\left[\tilde{W}(\zeta,\eta,A_n^{(\Gamma)})
2030: -\tilde{W}(\zeta_0,\eta_0,A_n^{(\Gamma_0)})
2031: -\tilde{W}(\zeta,\eta,A_0^{(\Gamma)})\right.$ \newline
$\left.+\tilde{W}(\zeta_0,\eta_0,A_0^{(\Gamma_0)})\right]\;\;=\;\;$
2033: $-\sqrt{n}\int_0^{\tau}\left\{P[W(t;\theta_0(\zeta,\lambda))]e^{-\Gamma(t)}
2034: -P[W(t;\theta_0)]e^{-\Gamma_0(t)}\right\}$ $\times
2035: \left[d\tilde{G}_n(t)-d\tilde{G}_0(t)\right]+\epsilon_n(\zeta,\lambda)
2036: \equiv-\int_0^{\tau}\tilde{C}(t;\zeta,\lambda)d{\cal Z}_n(t)+
2037: \epsilon_n(\zeta,\lambda)$,
2038: where $\|\epsilon_n\|_{\infty}$ $=o_P(1)$. The fact that the class
2039: of functions $\{\tilde{C}(\cdot;\zeta,\lambda):(\zeta,\lambda)\in
2040: B_{\epsilon_2}^{\ast k_0}\}$ has uniformly bounded total variation yields
2041: asymptotic linearity and normality of $\left\{
2042: \int_0^{\tau}\tilde{C}(t;\zeta,\lambda)d{\cal Z}_n(t):(\zeta,\lambda)
2043: \in B_{\epsilon_2}^{\ast k_0}\right\}$,
2044: and the desired result follows.$\Box$
2045:
2046: {\it Proof of theorem~\ref{t.l9}.} By lemma~\ref{l8},
2047: \[-\tilde{X}(\hat{\zeta}_n,\hat{\gamma}_n,\hat{\Gamma}_n)
2048: =(\tilde{X}_n-\tilde{X})(\hat{\zeta}_n,\hat{\gamma}_n,\hat{\Gamma}_n)
2049: -\tilde{X}_n(\hat{\zeta}_n,\hat{\gamma}_n,\hat{\Gamma}_n)\leq O_P(n^{-1/2}).\]
2050: Combining this with lemma~\ref{l7}, we obtain
2051: $\sqrt{n}|\hat{\zeta}_n-\zeta_0|$
2052: \begin{eqnarray*}
2053: &=&\sqrt{n}|\hat{\zeta}_n-\zeta_0|\ind\{(\hat{\zeta}_n,\hat{\gamma}_n,
2054: \hat{\Gamma}_n)\in B_{\epsilon_1}^{\ast k_0}\}+
2055: \sqrt{n}|\hat{\zeta}_n-\zeta_0|\ind\{(\hat{\zeta}_n,\hat{\gamma}_n,
2056: \hat{\Gamma}_n)\not\in B_{\epsilon_1}^{\ast k_0}\}\\
2057: &\leq&-\sqrt{n}k_1^{-1}\tilde{X}(\hat{\zeta}_n,\hat{\gamma}_n,
2058: \hat{\Gamma}_n)+o_P(1)\\
2059: &\leq& O_P(1).
2060: \end{eqnarray*}
Thus the first part of the theorem is proved.
2062:
2063: For the second part, denote $U_{0\zeta}^{\tau}(\psi)\equiv
2064: P U_{\zeta}^{\tau}(\psi)$. By arguments similar to those used
2065: in the proof of lemma~\ref{l.t1.1}, we can verify that for some $e_1>0$,
2066: ${\cal F}\equiv
2067: \{U_{\zeta}^{\tau}(\psi)(h):\|\theta-\theta_0\|_{\infty}\leq e_1,
2068: h\in{\cal H}_1\}$ is Donsker. Moreover, the continuity of the
2069: functions involved also yields that, as
2070: $\|\theta-\theta_0\|_{\infty}\rightarrow 0$,
2071: $\sup_{h\in{\cal H}_1}
2072: P\left(U_{\zeta}^{\tau}(\psi)(h)-U_{\zeta_0}^{\tau}(\psi_0)(h)\right)^2
2073: \rightarrow 0$. Thus
2074: \begin{eqnarray}
2075: \sqrt{n}\left(U_{n\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)
2076: -U_{0\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)-U_{n\zeta_0}^{\tau}(\psi_0)
2077: +U_{0\zeta_0}^{\tau}(\psi_0)\right)&=&o_P^{{\cal H}_1}(1).\label{l9.e1}
2078: \end{eqnarray}
2079: Note also that $\sqrt{n}|\hat{\zeta}_n-\zeta_0|=O_P(1)$ implies that
$\sqrt{n}\left(U_{0\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)
-U_{0\zeta_0}^{\tau}(\hat{\psi}_n)\right)=O_P^{{\cal H}_1}(1)$.
Thus, since
$U_{n\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)=0$, (\ref{l9.e1})~implies
$\sqrt{n}U_{0\zeta_0}^{\tau}(\hat{\psi}_n)=$
\[\sqrt{n}U_{0\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)+O_P^{{\cal H}_1}(1)
=-\sqrt{n}\left(U_{n\zeta_0}^{\tau}(\psi_0)-U_{0\zeta_0}^{\tau}(\psi_0)\right)
+O_P^{{\cal H}_1}(1)=O_P^{{\cal H}_1}(1),\]
2088: where $O_P^{B}(1)$ denotes a term bounded in probability
2089: uniformly over the set $B$. By lemma~\ref{l4}, we know that there
2090: exists a constant $e_2>0$ such that
2091: \[\|U_{0\zeta_0}^{\tau}(\psi)-
2092: U_{0\zeta_0}^{\tau}(\psi_0)\|_{{\cal H}_1}\geq
2093: e_2\|\psi-\psi_0\|_{\infty}
2094: +o(\|\psi-\psi_0\|_{\infty}),\]
2095: as $\|\psi-\psi_0\|_{\infty}\rightarrow 0$.
2096: Hence $\sqrt{n}\|\hat{\psi}_n-\psi_0\|_{\infty}(e_2-o_P(1))\leq O_P(1)$,
and we obtain the second conclusion of the theorem.
2098:
2099: For the third part, we have
2100: \[\sqrt{n}\sup_{t\in[0,\tau]}\left|
2101: \pp_n W(t;\hat{\theta}_n)-PW(t;\hat{\theta}_n)
2102: \right|=\sqrt{n}\sup_{t\in[0,\tau]}|(\pp_n-P)W(t;\theta_0)|+o_P(1)\]
2103: $=O_P(1)$ and $\sqrt{n}\sup_{t\in[0,\tau]}|PW(t;\hat{\theta}_n)-
PW(t;\theta_0)|=O_P(1)$ by the first two parts of this theorem.
2105: Hence $\sqrt{n}\sup_{t\in[0,\tau]}
2106: \left|\pp_n W(t;\hat{\theta}_n)-PW(t;\theta_0)\right|=O_P(1)$.
2107: The result now follows by the Lipschitz continuity of $\log(u)$
2108: over strictly positive compact intervals.$\Box$
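
{\it Remark.} The Lipschitz property invoked in the last step can be made
explicit. As a sketch, let $c>0$ denote a lower bound (a constant
introduced here only for illustration) that holds for both
$\mathbb{P}_nW(t;\hat{\theta}_n)$ and $PW(t;\theta_0)$, uniformly over
$t\in[0,\tau]$, on an event whose probability tends to one. By the mean
value theorem, $|\log u-\log v|\leq c^{-1}|u-v|$ for all $u,v\geq c$, and
hence, on that event,
\[\sqrt{n}\sup_{t\in[0,\tau]}\left|\log\mathbb{P}_nW(t;\hat{\theta}_n)
-\log PW(t;\theta_0)\right|\leq c^{-1}\sqrt{n}\sup_{t\in[0,\tau]}
\left|\mathbb{P}_nW(t;\hat{\theta}_n)-PW(t;\theta_0)\right|=O_P(1).\]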
2109:
2110: {\it Proof of lemma~\ref{l10}.}
2111: The first inequality follows from the definitions.
For the second inequality, we use a Taylor expansion around
2113: $(\hat{\zeta}_n,\hat{\gamma}_n,\hat{\Gamma}_n)$ to obtain
2114: $\tilde{X}_n(\hat{\zeta}_n,\hat{\lambda}_n)-\tilde{X}_n(\hat{\zeta}_n,
2115: \lambda_0)=$
2116: \[-\pp_n U_{\hat{\zeta}_n}^{\tau}(\hat{\gamma}_n,
2117: A_n^{(\hat{\Gamma}_n)})(\lambda_0-\hat{\lambda}_n)
2118: -\psi_{n,t}^{(\lambda_0-\hat{\lambda}_n)}\left(\pp_n
2119: \hat{\sigma}_{
2120: \left(\hat{\zeta}_n,\hat{\gamma}_{n,t},A_n^{(\hat{\Gamma}_{n,t})}\right)}
2121: \right)(\lambda_0-\hat{\lambda}_n),\]
2122: for some $t\in[0,1]$, where $\hat{\lambda}_{n,t}\equiv
2123: (\hat{\gamma}_{n,t},\hat{\Gamma}_{n,t})$; $\hat{\gamma}_{n,t}\equiv
2124: t\hat{\gamma}_n+(1-t)\gamma_0$; $\hat{\Gamma}_{n,t}\equiv
2125: t\hat{\Gamma}_n+(1-t)\Gamma_0$; and,
2126: for any $h\in{\cal H}_{\infty}$,
2127: $\psi_{n,t}^{(h)}\equiv\left(h_1,h_2,h_3,\int_0^{(\cdot)}h_4(s)
2128: dA_n^{(\hat{\Gamma}_{n,t})}(s)\right)$.
2129: The score term is zero by definition of the NPMLE, and the second term
2130: has absolute value bounded by $\hat{K}_n
2131: \|\hat{\lambda}_n-\lambda_0\|_{\infty}^2$,
2132: where $\hat{K}_n$ is bounded in probability
2133: by the uniform consistency of $\hat{\lambda}_n$ and by
2134: the form of the information terms listed in section~5.2.
2135:
2136: Now, letting $\psi_n(\gamma,\Gamma)\equiv(\gamma,A_n^{(\Gamma)})$, we have
2137: $\tilde{X}_n(\hat{\zeta}_n,\lambda_0)-\tilde{X}_n^{\ast}(\hat{\zeta}_n)$
2138: \begin{eqnarray}
2139: &&\label{l10.e1}\\
2140: &&=\pp_n\left\{\left(\ind\{Y\leq\hat{\zeta}_n\}-\ind\{Y\leq\zeta_0\}\right)
2141: \right.\nonumber\\
2142: &&\left.\mbox{\hspace{0.1in}}
2143: \times\left[l_1^{\psi_n(\gamma_0,\Gamma_0)}(V,\delta,Z)
2144: -l_2^{\psi_n(\gamma_0,\Gamma_0)}(V,\delta,Z)
2145: -l_1^{\psi_0}(V,\delta,Z)+l_2^{\psi_0}(V,\delta,Z)\right]\right\}\nonumber
2146: \end{eqnarray}
2147: $=\int_0^{\tau}\pp_n\left\{
2148: \left(\ind\{Y\leq\hat{\zeta}_n\}-\ind\{Y\leq\zeta_0\}\right)
2149: \tilde{Y}(s)\tilde{K}_n(s)\right\}e^{-\Gamma_0(s)}
2150: \left[d\tilde{G}_n(s)-d\tilde{G}_0(s)\right]$,
2151: where
2152: \begin{eqnarray*}
2153: \tilde{K}_n(s)&=&\left[\dot{G}(H_1^{\psi_{n,t}}(V))
2154: -\delta\frac{\ddot{G}(H_1^{\psi_{n,t}}(V))}{\dot{G}(H_1^{\psi_{n,t}}(V))}
2155: \right]e^{\beta_0'Z(s)}\\
2156: &&-\left[\dot{G}(H_2^{\psi_{n,t}}(V))
2157: -\delta\frac{\ddot{G}(H_2^{\psi_{n,t}}(V))}{\dot{G}(H_2^{\psi_{n,t}}(V))}
2158: \right]e^{\beta_0'Z(s)+\alpha_0+\eta_0'Z_2(s)}
2159: \end{eqnarray*}
2160: and $\psi_{n,t}\equiv\left(\gamma,\int_0^{(\cdot)}\Gamma_0(u)\left[
2161: td\tilde{G}_n(u)+(1-t)d\tilde{G}_0(u)\right]\right)$, for
2162: some $t\in[0,1]$, by the mean value theorem.
2163: By the conditions given in section~2, we have that there is a
2164: constant $k^{\ast}<\infty$ such that
2165: $\|\tilde{K}_n(s)\Gamma_0(s)\|_{v}\leq k^{\ast}$
2166: with probability~1 for all $n\geq 1$. Thus the absolute value
2167: of~(\ref{l10.e1}) is bounded above by
2168: $k^{\ast}\|\tilde{G}_n-\tilde{G}_0\|_{\infty}\times\pp_n
2169: \left|\ind\{Y\leq\hat{\zeta}_n\}-\ind\{Y\leq\zeta_0\}\right|
2170: =O_P(n^{-1})$.
2171: This last statement follows because $\|\tilde{G}_n-\tilde{G}_0\|_{\infty}
2172: =O_P(n^{-1/2})$,
2173: $(\pp_n-P)\left|\ind\{Y\leq\hat{\zeta}_n\}-\ind\{Y\leq\zeta_0\}\right|
2174: =o_P(n^{-1/2})$, and $P\left|
2175: \ind\{Y\leq\hat{\zeta}_n\}-\ind\{Y\leq\zeta_0\}\right|=O_P(n^{-1/2})$
2176: by theorem~\ref{t.l9}. Now the desired result follows.$\Box$
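
{\it Remark.} The rate bookkeeping in the last step can be written out
as follows; this is only a restatement of the three rates quoted above.
Writing $\Delta_n\equiv\left|\ind\{Y\leq\hat{\zeta}_n\}
-\ind\{Y\leq\zeta_0\}\right|$ (a shorthand introduced here for
convenience), we have
\[\mathbb{P}_n\Delta_n=(\mathbb{P}_n-P)\Delta_n+P\Delta_n
=o_P(n^{-1/2})+O_P(n^{-1/2})=O_P(n^{-1/2}),\]
and hence
\[k^{\ast}\|\tilde{G}_n-\tilde{G}_0\|_{\infty}\,\mathbb{P}_n\Delta_n
=O_P(n^{-1/2})\times O_P(n^{-1/2})=O_P(n^{-1}).\]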
2177:
2178: {\it Proof of lemma~\ref{l11}.} Note first that
2179: \[\tilde{D}_n(\zeta)=\sqrt{n}(\mathbb{P}_n-P)\left\{\left[
2180: \ind\{Y\leq \zeta\}-\ind\{Y\leq\zeta_0\}\right]\times
2181: \left[l_1^{\psi_0}-l_2^{\psi_0}\right](V,\delta,Z)\right\}.\]
2182: Denote $\tilde{H}\equiv[l_1^{\psi_0}-l_2^{\psi_0}](V,\delta,Z)$,
2183: and note that $|\tilde{H}|\leq c_{\ast}$ almost surely
2184: for a fixed constant $c_{\ast}<\infty$. Thus
2185: $F_{\epsilon}\equiv\ind\{\zeta_0-\epsilon\leq Y
2186: \leq\zeta_0+\epsilon\}c_{\ast}$
2187: serves as an envelope for
2188: the class of functions
2189: \[{\cal F}_{\epsilon}\equiv\{\left[\ind\{Y\leq\zeta\}
2190: -\ind\{Y\leq\zeta_0\}\right]\tilde{H}:|\zeta-\zeta_0|\leq\epsilon\},\]
2191: for each $\epsilon>0$.
2192: Note that by the assumptions on the density $\tilde{h}$ in a neighborhood
of $\zeta_0$, we have for some $\epsilon_3>0$ that there exist
2194: $0<k_{\ast},k_{\ast\ast}<\infty$ such that $k_{\ast}\epsilon\leq
2195: \tilde{p}(\epsilon)\equiv
2196: P[\zeta_0-\epsilon\leq Y\leq\zeta_0+\epsilon]\leq k_{\ast\ast}\epsilon$
2197: for all $0\leq\epsilon\leq\epsilon_3$.
Thus the bracketing number satisfies
\[N_{[]}(u\|F_{\epsilon}\|_{P,2},{\cal F}_{\epsilon},L_2(P))\leq
O\left(\frac{\epsilon}{u^2\tilde{p}(\epsilon)}\right)\leq
O\left(\frac{1}{k_{\ast}u^2}\right),\]
2202: for all $u>0$ and $0\leq\epsilon\leq\epsilon_3$;
2203: and thus, by theorem~2.14.2 of \cite{vw96},
2204: there exists a $c_{\ast\ast}<\infty$
2205: such that
2206: \[E\left[\sup_{|\zeta-\zeta_0|\leq\epsilon}
|\tilde{D}_n(\zeta)|\right]\leq c_{\ast\ast}\|F_{\epsilon}\|_{P,2}
2208: \leq c_{\ast\ast}c_{\ast}\sqrt{k_{\ast\ast}\epsilon},\]
2209: for all $0\leq\epsilon\leq\epsilon_3$.
2210: The result now follows for $k_2=c_{\ast\ast}c_{\ast}
2211: \sqrt{k_{\ast\ast}}$.$\Box$
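
{\it Remark.} One way to see the order of the bracketing number, given
here as a sketch under the assumption that $Y$ has a continuous
distribution on $[\zeta_0-\epsilon_3,\zeta_0+\epsilon_3]$, is the
following. Fix $0<u\leq 1$ and $0<\epsilon\leq\epsilon_3$, and choose
points $\zeta_0-\epsilon=t_0<t_1<\cdots<t_m=\zeta_0+\epsilon$, with
$m\leq 1+u^{-2}$, such that
$P[t_{i-1}<Y\leq t_i]\leq u^2\tilde{p}(\epsilon)$ for $i=1,\ldots,m$.
Writing $f_{\zeta}\equiv\left[\ind\{Y\leq\zeta\}-\ind\{Y\leq\zeta_0\}\right]
\tilde{H}$ (a shorthand introduced here), we have for every
$\zeta\in[t_{i-1},t_i]$ that
\[|f_{\zeta}-f_{t_{i-1}}|\leq c_{\ast}\ind\{t_{i-1}<Y\leq t_i\}.\]
Thus $f_{\zeta}$ lies in the bracket with endpoints
$f_{t_{i-1}}\pm c_{\ast}\ind\{t_{i-1}<Y\leq t_i\}$, whose $L_2(P)$ size
is at most
\[2c_{\ast}u\sqrt{\tilde{p}(\epsilon)}=2u\|F_{\epsilon}\|_{P,2}.\]
Consequently $N_{[]}(2u\|F_{\epsilon}\|_{P,2},{\cal F}_{\epsilon},L_2(P))
\leq 1+u^{-2}$, which is of order $u^{-2}$ uniformly in $\epsilon$, so the
associated bracketing integral is bounded uniformly over
$0<\epsilon\leq\epsilon_3$.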
2212:
2213: {\it Proof of theorem~\ref{t3}.} We can deduce from section~3 that
2214: \begin{eqnarray*}
2215: \lefteqn{\tilde{L}_n(\hat{\psi}_n,\zeta_{n,u})
2216: -\tilde{L}_n(\hat{\psi}_n,\zeta_0)}
2217: &&\\
2218: &\mbox{\hspace{0.5cm}}=&
2219: \mathbb{P}_n\left\{\left(\ind\{\zeta_{n,u}<Y\leq\zeta_0\}
2220: -\ind\{\zeta_0<Y\leq\zeta_{n,u}\}\right)
2221: \left[l_2^{\hat{\psi}_n}-l_1^{\hat{\psi}_n}\right](V,\delta,Z)\right\}\\
2222: &\mbox{\hspace{0.5cm}}=&n^{-1}Q_n(u)+\hat{E}_n(u),\;\;\;\;\mbox{where}
2223: \end{eqnarray*}
2224: $\hat{E}_n(u)\equiv\mathbb{P}_n\left\{\left(\ind\{Y\leq\zeta_0\}
2225: -\ind\{Y\leq\zeta_{n,u}\}\right)\left[l_2^{\hat{\psi}_n}-l_2^{\psi_0}
2226: -l_1^{\hat{\psi}_n}+l_1^{\psi_0}\right](V,\delta,Z)\right\}$.
2227: By arguments similar to those used in the proof of lemma~\ref{l10},
2228: we can obtain constants $0<F_1,F_2<\infty$ such that
2229: $\left|l_j^{\hat{\psi}_n}(V,\delta,Z)-l_j^{\psi_0}(V,\delta,Z)\right|
2230: \leq F_j\|\hat{\psi}_n-\psi_0\|_{\infty}$ almost surely, for $j=1,2$. Hence
2231: \[|\hat{E}_n(u)|\leq\pp_n\left|\ind\{Y\leq\zeta_0\}
2232: -\ind\{Y\leq\zeta_{n,u}\}\right|O_P(n^{-1/2}).\]
2233: By arguments given in the proof of lemma~\ref{l11}, we know that
2234: \[(\pp_n-P)\left|\ind\{Y\leq\zeta_0\}
2235: -\ind\{Y\leq\zeta_{n,u}\}\right|=O_P^{\mathbb{U}_{n,M}}(n^{-1}).\]
2236: Since also $\sup_{u\in\mathbb{U}_{n,M}}P\left|\ind\{Y\leq\zeta_0\}
2237: -\ind\{Y\leq\zeta_{n,u}\}\right|=O(n^{-1})$ by condition B2(i),
2238: we now have that $\hat{E}_n=O_P^{\mathbb{U}_{n,M}}(n^{-3/2})$.
2239: The desired result now follows.$\Box$
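
{\it Remark.} The last step combines the displayed rates as follows; this
is a sketch that only restates the bounds above. Uniformly over
$u\in\mathbb{U}_{n,M}$,
\[\mathbb{P}_n\left|\ind\{Y\leq\zeta_0\}-\ind\{Y\leq\zeta_{n,u}\}\right|
=O_P(n^{-1})+O(n^{-1})=O_P(n^{-1}),\]
so that $|\hat{E}_n(u)|\leq O_P(n^{-1})\times O_P(n^{-1/2})=O_P(n^{-3/2})$.
Multiplying the decomposition at the start of the proof by $n$ then gives
\[n\left[\tilde{L}_n(\hat{\psi}_n,\zeta_{n,u})
-\tilde{L}_n(\hat{\psi}_n,\zeta_0)\right]=Q_n(u)+O_P(n^{-1/2}),\]
uniformly over $u\in\mathbb{U}_{n,M}$.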
2240:
2241: {\it Proof of theorem~\ref{t4}.} Fix $h\in{\cal H}_{\infty}$.
2242: We first establish that
2243: $\left(Q_n^+,{\cal Z}^n(h)\equiv\right.$
2244: $\left.\sqrt{n}\pp_n U_{\zeta_0}^{\tau}(\psi_0)(h)\right)$
2245: converges weakly to $(Q^+,{\cal Z}(h))$, on $D_M\times\re$,
2246: where $Q^+$ and ${\cal Z}(h)$ are independent, for each
2247: fixed $M<\infty$, and ${\cal Z}(h)$ is mean zero Gaussian with
2248: variance $\tilde{\sigma}_h^2\equiv\mbox{var}[U_{\zeta_0}^{\tau}(\psi_0)(h)]$.
2249: Accordingly, fix $M$, and
2250: let $0=u_0<u_1<u_2<\cdots<u_J\leq M$ be a finite collection of
2251: points and $q_1,\ldots,q_J,\tilde{q}$ be arbitrary real numbers. Our plan
2252: is to first show that the characteristic function of
2253: $(Q_n^+(u_1),\ldots,Q_n^+(u_J),{\cal Z}^n(h))$ converges to that of
2254: $(Q^+(u_1),\ldots,Q^+(u_J))$ times that of ${\cal Z}(h)$.
2255: Since the choice of points
2256: $u_1,\ldots,u_J$ is arbitrary, this will imply convergence
2257: of all finite-dimensional distributions. We will then show
2258: that $Q_n^+$ is asymptotically tight, and this will imply
2259: the desired weak convergence.
2260:
2261: Let $y\mapsto I_{nj}(y)\equiv\ind\{\zeta_0+u_{j-1}/n<y\leq\zeta_0+u_j/n\}$,
2262: $j=1,\ldots,J$; and
$F_i\equiv[l_1^{\psi_0}-l_2^{\psi_0}](V_i,\delta_i,Z_i)$ and
2264: ${\cal Z}_i\equiv U_{\zeta_0}^{\tau}(\psi_0)(h)(X_i)$,
2265: $i=1,\ldots,n$. In other words, ${\cal Z}_i$ is the score contribution
2266: from the $i$th observation. Thus
2267: \begin{eqnarray}
2268: \lefteqn{P\exp\left[i\left\{\sum_{j=1}^Jq_j[Q_n^+(u_j)-Q_n^+(u_{j-1})]
2269: +\tilde{q}{\cal Z}^n(h)\right\}\right]}\mbox{\hspace{1.0in}}&&\label{t4.e1}\\
2270: &=&\prod_{k=1}^n P\left[\exp\left\{\sum_{j=1}^J i q_jI_{nj}(Y_k)F_k\right\}
2271: e^{i\tilde{q}{\cal Z}_k/\sqrt{n}}\right].\nonumber
2272: \end{eqnarray}
2273: However, using the facts that
$e^{\sum_j w_j}-1=\sum_j(e^{w_j}-1)$ when
at most one of the $w_j$'s differs from zero and
$e^{uv}-1=u(e^{v}-1)$ when $u\in\{0,1\}$, we have
2277: $\exp\left\{\sum_{j=1}^J iq_jI_{nj}(Y_k)F_k\right\}
2278: =1+\sum_{j=1}^J\left(e^{iq_j I_{nj}(Y_k)F_k}-1\right)
2279: =1+\sum_{j=1}^JI_{nj}(Y_k)\left(e^{iq_jF_k}-1\right)$.
2280: Combining this with condition~B2 and the boundedness of
2281: $F_k$ and ${\cal Z}_k$, we obtain
2282: $P\left[\exp\left\{\sum_{j=1}^J i q_jI_{nj}(Y_k)F_k\right\}
2283: e^{i\tilde{q}{\cal Z}_k/\sqrt{n}}\right]$
2284: \begin{eqnarray*}
2285: &=&Pe^{i\tilde{q}{\cal Z}_k/\sqrt{n}}+
2286: \sum_{j=1}^J\frac{(u_j-u_{j-1})\tilde{h}(\zeta_0)}{n}
2287: P\left[\left.\left(e^{iq_jF_k}-1\right)e^{i\tilde{q}{\cal Z}_k/\sqrt{n}}
2288: \right|Y=\zeta_0+\right]\\
2289: &&+o(n^{-1})\\
2290: &=&1+n^{-1}\left[-\frac{\tilde{q}^2\tilde{\sigma}_h^2}{2}
2291: +\tilde{h}(\zeta_0)\sum_{j=1}^J(u_j-u_{j-1})\{\phi^+(q_j)-1\}\right]
2292: +o(n^{-1}),
2293: \end{eqnarray*}
2294: where $o(1)$ denotes a quantity going to zero uniformly
over $k=1,\ldots,n$. Thus the right-hand side of~(\ref{t4.e1}) converges to
2296: \[\exp\left[\frac{-\tilde{q}^2\tilde{\sigma}_h^2}{2}+
2297: \tilde{h}(\zeta_0)\sum_{j=1}^J(u_j-u_{j-1})\{\phi^+(q_j)-1\}\right],\]
2298: which is precisely
\[P\exp\left[i\tilde{q}{\cal Z}(h)+i\sum_{j=1}^Jq_j\left\{
Q^+(u_j)-Q^+(u_{j-1})\right\}\right].\]
2301: Thus the finite dimensional distributions converge as desired.
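
To spell out the limit just used, two elementary facts suffice; they are
recorded here only as a sketch. First, since ${\cal Z}_k$ is bounded with
mean zero and variance $\tilde{\sigma}_h^2$, a second-order expansion of
the exponential gives
\[Pe^{i\tilde{q}{\cal Z}_k/\sqrt{n}}
=1-\frac{\tilde{q}^2\tilde{\sigma}_h^2}{2n}+o(n^{-1}).\]
Second, for any complex sequence $a_n\rightarrow a$,
\[\left(1+\frac{a_n}{n}\right)^n\rightarrow e^{a}.\]
Because the $n$ factors in the product in~(\ref{t4.e1}) are identical,
the second fact applies with
$a=-\tilde{q}^2\tilde{\sigma}_h^2/2
+\tilde{h}(\zeta_0)\sum_{j=1}^J(u_j-u_{j-1})\{\phi^+(q_j)-1\}$.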
2302:
2303: We next need to verify that $Q_n^+$ is asymptotically tight
2304: on~$[0,M]$. Since there exists
2305: a constant $c_{\ast}<\infty$ such that $\max_{1\leq i\leq n}
2306: |F_i|\leq c_{\ast}<\infty$ almost surely, we have that
2307: $|Q_n^+(u_2)-Q_n^+(u_1)|\leq c_{\ast}n\pp_n
2308: \ind\{\zeta_0+u_1/n<Y\leq\zeta_0+u_2/n\}$,
2309: for all $0\leq u_1<u_2\leq M$. Thus we are done if we can show
2310: that $u\mapsto\tilde{R}_n(u)\equiv n\pp_n\ind\{\zeta_0<Y\leq\zeta_0+u/n\}$ is
2311: tight on $[0,M]$. To this end, fix $0\leq u_1<u_2\leq M$. Now,
2312: the expectation of $|\tilde{R}_n(u_2)-\tilde{R}_n(u_1)|$ is
2313: $nP\{\zeta_0+u_1/n<Y\leq\zeta_0+u_2/n\}\rightarrow
2314: |u_2-u_1|\tilde{h}(\zeta_0)$, as $n\rightarrow\infty$.
2315: This implies the desired tightness since
2316: $u\mapsto\tilde{R}_n(u)$ is monotone. We have now established that
2317: $\left(Q_n^+,{\cal Z}^n(h)\right)$
2318: converges weakly to $(Q^+,{\cal Z}(h))$, on $D_M\times\re$,
2319: where $Q^+$ and ${\cal Z}(h)$ are independent, for each
2320: fixed $M<\infty$. Similar arguments also yield the weak convergence
2321: of $\left(Q_n^-,{\cal Z}^n(h)\right)$ to
2322: $(Q^-,{\cal Z}(h))$, on $D_M\times\re$,
2323: where $Q^-$ and ${\cal Z}(h)$ are again independent, for each
2324: fixed $M<\infty$. Thus also $\left(Q_n,{\cal Z}^n(h)\right)$ converges
2325: weakly to $(Q,{\cal Z}(h))$, on $D_M\times\re$,
2326: where $Q$ and ${\cal Z}(h)$ are independent, for each
2327: fixed $M<\infty$. Since $n(\hat{\zeta}_n-\zeta_0)=O_P(1)$, the
2328: argmax continuous mapping theorem (theorem~3.2.2 of \cite{vw96}) now yields
2329: that $\left(n(\hat{\zeta}_n-\zeta_0),{\cal Z}^n(h)\right)$
2330: converges weakly to $\left(\argmax\,Q,{\cal Z}(h)\right)$, with
2331: the desired asymptotic independence. The remaining
2332: results follow.$\Box$
2333:
2334: {\it Proof of theorem~\ref{t5}.} We have
2335: \begin{eqnarray*}
2336: 0&=&\sqrt{n}\pp_n U_{\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)\\
2337: &=&\sqrt{n}\pp_n U_{\zeta_0}^{\tau}(\hat{\psi}_n)+
2338: \sqrt{n}(\pp_n-P)\left(U_{\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)
2339: -U_{\zeta_0}^{\tau}(\hat{\psi}_n)\right)\\
2340: &&+\sqrt{n}
2341: P\left(U_{\hat{\zeta}_n}^{\tau}(\hat{\psi}_n)
2342: -U_{\zeta_0}^{\tau}(\hat{\psi}_n)\right)\\
2343: &\equiv&\sqrt{n}\pp_n U_{\zeta_0}^{\tau}(\hat{\psi}_n)+B_{1,n}+B_{2,n},
2344: \end{eqnarray*}
2345: where the index set for the score terms is ${\cal H}_1$.
2346: By arguments similar to those used in the proof of theorem~\ref{t.l9},
2347: combined with the fact that $n(\hat{\zeta}_n-\zeta_0)=O_P(1)$, we have
2348: that both $B_{1,n}=o_P^{{\cal H}_1}(1)$ and $B_{2,n}=o_P^{{\cal H}_1}(1)$.
Thus $\sqrt{n}\pp_n U_{\zeta_0}^{\tau}(\hat{\psi}_n)=o_P^{{\cal H}_1}(1)$.
2350: We also have that
2351: \[\sqrt{n}(\pp_n-P)U_{\zeta_0}^{\tau}(\hat{\psi}_n)-
2352: \sqrt{n}(\pp_n-P)U_{\zeta_0}^{\tau}(\psi_0)=o_P^{{\cal H}_1}(1).\]
2353: Combining this with lemma~\ref{l4}, the Z-estimator master theorem
2354: (theorem~3.3.1 of \cite{vw96}) now yields the desired results.$\Box$
2355:
2356: {\it Proof of corollary~\ref{c1}.} We first derive the unconditional
2357: limiting distribution of $\sqrt{n}(\hat{\psi}_n^{\circ}-\psi_0)$.
2358: If a class of measurable functions
2359: ${\cal F}$ is $P$-Glivenko-Cantelli with $\|P\|_{\cal F}<\infty$, then
2360: the class $\kappa\cdot{\cal F}=\{\kappa f:f\in{\cal F}\}$, where
2361: $\kappa$ denotes a generic version of one of the weights $\kappa_i$,
2362: is also $P$-Glivenko-Cantelli, by theorem~3 of \cite{vw00}.
2363: Thus we can apply the
2364: results of theorem~\ref{t1}, with only minor modification, combined
2365: with the simple fact that $\bar{\kappa}\rightarrow\mu_{\kappa}$
2366: almost surely, to yield that $\hat{\psi}_n^{\circ}\rightarrow\psi_0$
2367: outer almost surely. Note that the proof is made somewhat easier than
2368: before since we already know $\hat{\zeta}_n\rightarrow\zeta_0$
2369: almost surely. Furthermore, if a class of measurable functions~${\cal F}$
2370: is $P$-Donsker with $\|P\|_{\cal F}<\infty$, then the multiplier central
2371: limit theorem (theorem~2.9.2 of \cite{vw96})
2372: yields that the class $\kappa\cdot{\cal F}$ is also $P$-Donsker.
2373: Hence we can apply the results of theorem~\ref{t4}, with only minor
2374: modification, to yield that $\sqrt{n}(\hat{\psi}_n^{\circ}-\psi_0)$
2375: is asymptotically linear with influence function
2376: $\tilde{l}^{\circ}(h)=(\kappa/\mu_{\kappa})U_{\zeta_0}^{\tau}
2377: (\sigma_{\theta_0}^{-1}(h))$, $h\in{\cal H}_1$. The factor
2378: $\mu_{\kappa}^{-1}$ occurs because the information operator for
2379: the weighted version of the likelihood is $\mu_{\kappa}\sigma_{\theta_0}$.
2380: We now have that $\sqrt{n}(\hat{\psi}_n^{\circ}-\hat{\psi}_n)
2381: =\sqrt{n}\pp_n(\kappa/\mu_{\kappa}-1)U_{\zeta_0}^{\tau}
2382: (\sigma_{\theta_0}^{-1}(\cdot))+o_P^{{\cal H}_1}(1)$, unconditionally.
2383:
2384: Finally,
2385: the conditional multiplier central limit theorem (theorem~2.9.6
2386: of \cite{vw96})
yields part~(ii) of the corollary. The factor $(\mu_{\kappa}/\sigma_{\kappa})$
2388: arises because $\mbox{var}(\kappa/\mu_{\kappa})=\sigma_{\kappa}^2
2389: /\mu_{\kappa}^2$. Similar arguments establish~(i) by using
2390: parallel Glivenko-Cantelli and Donsker results for the nonparametric
2391: bootstrapped empirical process.$\Box$
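
{\it Remark.} The normalization can be checked directly; the following
is a short sketch using only that $\kappa$ has mean $\mu_{\kappa}$ and
variance $\sigma_{\kappa}^2$. We have
\[E\left[\frac{\kappa}{\mu_{\kappa}}-1\right]=0
\;\;\;\;\mbox{and}\;\;\;\;
\mbox{var}\left(\frac{\kappa}{\mu_{\kappa}}-1\right)
=\frac{\sigma_{\kappa}^2}{\mu_{\kappa}^2},\]
so that $(\mu_{\kappa}/\sigma_{\kappa})(\kappa/\mu_{\kappa}-1)$ has mean
zero and unit variance, which is the standardization of the multipliers
under which the conditional multiplier central limit theorem is usually
stated.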
2392:
2393: {\it Proof of lemma~\ref{s9.l1}.} Let $\mu(x)$ denote the baseline
measure and $\rho_n(x)$, $\rho(x)$ the density functions under
$P_n$ and $P$, respectively. In the general situation,
2396: verifying~(\ref{c8.e1}) is equivalent to finding a function $h$
2397: such that:
2398: \begin{eqnarray*}
2399: \;\;\;\;\lefteqn{\int
2400: \left[ \frac{\left(\frac{dP_n(x)}{d \mu(x)}\right)^{1/2 }-
2401: \left(\frac{dP(x)}{d \mu(x)}\right)^{1/2}}{1/\sqrt{n}} -
2402: \frac{1}{2}h(x)\left(\frac{dP(x)}{d \mu(x)}\right)^{1/2}\right]^2d\mu(x)}&&\\
2403: &=& \int \left[ \frac{\rho_n(x)^{1/2}-\rho(x)^{1/2} }{1/\sqrt{n}}
2404: - \frac{1}{2}h(x)\rho(x)^{1/2} \right]^2 d \mu(x)\\
2405: & \rightarrow & \int \left [ \frac{1}{2}
2406: \frac{\dot {\rho}(x)}{(\rho(x))^{1/2}}-
2407: \frac{1}{2}h(x)\frac{\rho(x)}{(\rho(x))^{1/2}} \right]^2 d\mu(x)\\
2408: &=&\int \left [ \frac{1}{2} \frac{\dot{\rho}(x)}
2409: {\rho(x)}(\rho(x))^{1/2}
2410: -\frac{1}{2}h(x)(\rho(x))^{1/2} \right]^2 d\mu(x)\\
2411: &=&0.
2412: \end{eqnarray*}
2413: Hence the given score function satisfies~(\ref{c8.e1})
2414: by the smoothness of the log-likelihood.$\Box$
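
{\it Remark.} The limit in the preceding display can be made explicit
under the following illustrative assumption, which is not part of the
statement of the lemma: suppose $\rho_n=\rho_{t_n}$ with $t_n=n^{-1/2}$,
for a path $t\mapsto\rho_t$ that is differentiable at $t=0$ with
$\rho_0=\rho$ and derivative $\dot{\rho}$. Then, pointwise wherever
$\rho(x)>0$,
\[\frac{\rho_n(x)^{1/2}-\rho(x)^{1/2}}{1/\sqrt{n}}
=\frac{\rho_{t_n}(x)^{1/2}-\rho_0(x)^{1/2}}{t_n}
\rightarrow\left.\frac{\partial}{\partial t}\rho_t(x)^{1/2}\right|_{t=0}
=\frac{1}{2}\frac{\dot{\rho}(x)}{\rho(x)^{1/2}},\]
so the limiting integrand vanishes exactly when
$h(x)=\dot{\rho}(x)/\rho(x)
=(\partial/\partial t)\log\rho_t(x)|_{t=0}$, that is, when $h$ is the
score function. Passing this pointwise limit through the integral requires
a domination or uniform integrability argument, which is the role played by
the smoothness of the log-likelihood.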
2415:
2416: {\it Proof of lemma~\ref{s9.l2}.} Note that a consequence of the
2417: Donsker theorem for contiguous alternatives
2418: (theorem~3.10.12 of \cite{vw96})
2419: is that for any bounded $P$-Donsker class ${\cal F}$,
2420: $\|\pp_n-P\|_{\cal F}\weakpn 0$. Thus the proof of
2421: lemma~\ref{l2} can be reconstituted to yield
2422: that $\|\hat{A}_0\|_{[0,\tau]}$ is bounded in probability
2423: under $P_n$, since all of the classes of functions involved are bounded
2424: $P$-Donsker classes. We can similarly modify the proof of theorem~\ref{t1}
2425: to yield the desired results since, once again, the only classes of
2426: functions involved are bounded and $P$-Donsker. This is true,
2427: in particular, for the key class given in lemma~\ref{l.t1.1}, for
2428: any $k<\infty$. Thus $\|\hat{\psi}_0-\psi_0^{\ast}\|_{\infty}
2429: \weakpn 0$.$\Box$
2430:
2431: {\it Proof of theorem~\ref{s9.t1}.} The basic idea of the proof
2432: is to use the Donsker theorem for contiguous alternatives in
2433: combination with key arguments in the proof of theorem~\ref{t5}
2434: and the form of the score and information operators under
2435: model C2'. Pursuing this course, we obtain for any $(h_1,h_2)\in
2436: \re^{q+1}$,
2437: \begin{eqnarray*}
2438: (h_1,h_2')\hat{S}_1(\zeta)&=&\sqrt{n}\pp_n(1,1)\left[\left(
2439: \begin{array}{c}U_{\zeta,1}^{\tau}\\ U_{\zeta,2}^{\tau}
2440: \end{array}\right)(\psi_0^{\ast})\left(\begin{array}{c}
2441: h_1\\h_2\end{array}\right)\right.\\
2442: &&\left.-\left(
2443: \begin{array}{c}U_{\zeta_0,3}^{\tau}\\ U_{\zeta_0,4}^{\tau}
2444: \end{array}\right)(\psi_0^{\ast})\left([\sigma_{\ast}^{22}]^{-1}
2445: \sigma_{\ast}^{21}(\zeta)\left(\begin{array}{c}h_1\\h_2
2446: \end{array}\right)\right)\right]+o_{P_n}^{[a,b]}(1)\\
2447: &\equiv&\sqrt{n}\pp_n H_{\ast}(\zeta)+o_{P_n}^{[a,b]}(1),
2448: \end{eqnarray*}
2449: where $o_{P_n}^B(1)$ denotes a quantity going to zero in
2450: probability, under $P_n$, uniformly over the set $B$. Now
2451: the Donsker theorem for contiguous alternatives yields that
2452: the right-hand side converges to a tight, Gaussian process with
2453: covariance $P[H_{\ast}(\zeta_1)H_{\ast}(\zeta_2)]$, for all
2454: $\zeta_1,\zeta_2\in[a,b]$, and mean $P\left[H_{\ast}\left\{
2455: U_{\zeta_0,1}^{\tau}(\psi_0^{\ast})(\alpha_{\ast})+U_{\zeta_0,2}^{\tau}
2456: (\psi_0^{\ast})(\eta_{\ast})\right\}\right]$. Note that we
2457: only need to compute the moments under the null distribution~$P$.
2458: Careful calculations verify that this yields the desired results.$\Box$
2459:
2460: {\it Proof of corollary~\ref{c2}.} The limiting results under
2461: $P_n$ follow from theorem~\ref{s9.t1} and the continuous mapping
2462: theorem, provided we can show that
2463: \begin{eqnarray}
2464: \inf_{\zeta\in[a,b],v\in\re^{q+1}:\|v\|=1}v'V_{\ast}(\zeta)v&>&0.
2465: \label{e1.m9}
2466: \end{eqnarray}
2467: The limiting null distribution
2468: results will similarly follow from the
2469: fact that under the null distribution~$P$, $\nu_{\ast}(\zeta)=0$
2470: for all $\zeta\in[a,b]$. Note that in both the null and alternative
2471: settings, $V_{\ast}(\zeta)$ only depends on the null limiting
2472: distribution. It is sufficient to verify that $\sigma_{\psi_0^{\ast},
2473: \zeta_n}$ is one-to-one for all sequences $\zeta_n\in[a,b]$
2474: and $h_n\in{\cal H}_{\infty}$. Note that we can ignore any
2475: differences between $\zeta_0$ and $\zeta$ in calculating
2476: $\zeta\mapsto\sigma_{\psi_0^{\ast},\zeta}^{22}$ because of
2477: the non-identifiability of $\zeta$ under the null
hypothesis, i.e., $\zeta\mapsto\sigma_{\psi_0^{\ast},\zeta}^{22}$
is constant. Assume now that there exist sequences $\zeta_n\in[a,b]$
2480: and $h_n\in{\cal H}_{\infty}$ such that
2481: $\sigma_{\psi_0^{\ast},\zeta_n}h_n\rightarrow 0$. We will now
2482: show that this forces $h_n\rightarrow 0$. Without loss of
2483: generality, we can assume $\zeta_n\rightarrow\zeta_{\ast}$ and
2484: $h_n\rightarrow h$. Since the map $h\mapsto\sigma_{\psi_0^{\ast},\zeta}h$
2485: is continuous and since $\zeta\mapsto\sigma_{\psi_0^{\ast},\zeta}h$ is
2486: cadlag, we can further assume without loss of generality that
2487: either $\sigma_{\psi_0^{\ast},\zeta_{\ast}}h=0$ or that
2488: $\sigma_{\psi_0^{\ast},\zeta_{\ast}^-}h=0$ (the $\zeta_{\ast}^-$ denotes
2489: that we are converging to $\zeta_{\ast}$ from below). The arguments for either
2490: case are the same, so we will for brevity only give the proof for
2491: the first case.
2492:
2493: By the arguments surrounding expressions~(\ref{c12:e9}), (\ref{c12:e11})
2494: and~(\ref{c12:e12}), combined with the non-identifiability of $\zeta$
2495: under the null model, we obtain that expression~(\ref{c12:e12}) must now
2496: hold for all $t\in(0,\tau]$ but with $\zeta_{\ast}$ replacing $\zeta_0$.
In other words,
2498: $\tilde{Y}(t)(h_1\ind(Y>\zeta_{\ast})+
2499: h_2'Z_2(t)\ind(Y>\zeta_{\ast})+h_3'Z+h_4(t))=0$, almost surely,
2500: for all $t\in(0,\tau]$.
2501: Since var$[Z(t_4)|Y>\zeta_{\ast}]\geq\mbox{var}[Z(t_4)|Y>b]
2502: \times\pr{Y>b}/\pr{Y>\zeta_{\ast}}$
2503: is positive definite by condition~B4, we have $h_3=0$.
2504: We can similarly use~B4 to verify that var$[Z(t_3)|Y\leq\zeta_{\ast}]$
2505: is positive definite and thus $h_2=0$. Now $h_1=0$ and $h_4=0$
2506: easily follow. Hence $h\mapsto\sigma_{\psi_0^{\ast},\zeta}h$ is
2507: uniformly one-to-one in a manner
2508: which yields the conclusion~(\ref{e1.m9}).$\Box$
2509:
2510: {\it Proof of theorem~\ref{s9.t2}.} The results follow from
2511: arguments similar to those used in the proof of theorem~\ref{s9.t1},
2512: but based on the conditional multiplier central limit theorem
2513: for contiguous alternatives, theorem~\ref{s9.t2.t1} below.$\Box$
2514:
2515: \begin{theorem}\label{s9.t2.t1} (Conditional multiplier central
2516: limit theorem for contiguous alternatives) Let ${\cal F}$ be
2517: a $P$-Donsker class of measurable functions, and let $P_n$ satisfy
\[\int\left[\sqrt{n}(dP_n^{1/2}-dP^{1/2})-\frac{1}{2}hdP^{1/2}
\right]^{2}\rightarrow 0\label{s9.t2.t1.e1},\]
2520: as $n\rightarrow\infty$, for some real valued, measurable
2521: function $h$. Also assume
2522: $\lim_{M\rightarrow\infty}$ $\limsup_{n\rightarrow\infty}
2523: P_n(f-Pf)^2\ind\{|f-Pf|>M\}=0$
2524: for all $f\in{\cal F}$, and that the multipliers in
2525: the weighted bootstrap, $\kappa_1,\ldots,\kappa_n$, are i.i.d.
2526: and independent of the data, with mean $0<\mu_{\kappa}<\infty$
2527: and variance $0<\sigma_{\kappa}^2<\infty$, and with
2528: $\int_0^{\infty}\sqrt{P(\kappa_1>u)}du<\infty$.
Then $\sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})(\pp_n^{\circ}-\pp_n)
2530: \weakpnboot\mathbb{G}$ in $\ell^{\infty}({\cal F})$,
2531: where $\mathbb{G}$ is a tight, mean zero Brownian bridge process.
2532: \end{theorem}
2533:
{\it Proof.} The detailed proof can be found in chapter~11
of \cite{kta}. We now present a synopsis of the proof.
2536: Let $\tilde{\kappa}_i\equiv\sigma_{\kappa}^{-1}
2537: (\kappa_i-\mu_{\kappa})$, $i=1,\ldots,n$, and note that
2538: \begin{eqnarray}
2539: \label{s9.t2.t1.e3}&&\\
\sqrt{n}(\pp_n^{\circ}-\pp_n)&=&n^{-1/2}\sum_{i=1}^n(\kappa_i/\bar{\kappa}
2541: -1)\Delta_{X_i}\;\;=\;\;
2542: n^{-1/2}\sum_{i=1}^n(\kappa_i/\bar{\kappa}-1)(\Delta_{X_i}-P)
2543: \nonumber\\
2544: &=&\frac{\sigma_{\kappa}}{\mu_{\kappa}}n^{-1/2}\sum_{i=1}^n
2545: \tilde{\kappa}_i(\Delta_{X_i}-P)+
2546: \left(\frac{\sigma_{\kappa}}{\bar{\kappa}}-\frac{\sigma_{\kappa}}
2547: {\mu_{\kappa}}\right)n^{-1/2}\sum_{i=1}^n\tilde{\kappa}_i(\Delta_{X_i}-P)
2548: \nonumber\\
2549: &&+\left(\frac{\mu_{\kappa}}{\bar{\kappa}}-1\right)n^{-1/2}
2550: \sum_{i=1}^n(\Delta_{X_i}-P),\nonumber
2551: \end{eqnarray}
2552: where $\Delta_{X_i}$ is the Dirac measure of the observation $X_i$.
2553: Since ${\cal F}$ is $P$-Donsker, we also have that
2554: $\dot{\cal F}\equiv\{f-Pf:f\in{\cal F}\}$ is $P$-Donsker. Thus
2555: by the unconditional multiplier central limit theorem,
we have that $\tilde{\kappa}\cdot\dot{\cal F}$ is also $P$-Donsker. Now, by
the fact that $\|P(f-Pf)\|_{\cal F}=0$ (trivially) combined with
the central limit theorem under contiguous alternatives, we have that
both $f\mapsto n^{-1/2}\sum_{i=1}^n\tilde{\kappa}_i(\Delta_{X_i}-P)f
\weakpn\mathbb{G}f$ and $f\mapsto n^{-1/2}\sum_{i=1}^n(\Delta_{X_i}
-P)f\weakpn\mathbb{G}f+P[(f-Pf)h]$ in $\ell^{\infty}({\cal F})$.
2562: Thus the last two terms in~(\ref{s9.t2.t1.e3})$\;\weakpn 0$, and hence
2563: $\sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})(\pp_n^{\circ}-\pp_n)
2564: \weakpn\mathbb{G}$ in $\ell^{\infty}({\cal F})$.
2565: This now implies the unconditional asymptotic
2566: tightness and desired asymptotic measurability of $\sqrt{n}
2567: (\mu_{\kappa}/\sigma_{\kappa})(\pp_n^{\circ}-\pp_n)$.
2568: Fairly standard arguments can now be used along with the given pointwise
2569: uniform square integrability condition to verify that
2570: $\sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})(\pp_n^{\circ}-\pp_n)$
2571: applied to any finite dimensional collection $f_1,\ldots,f_m\in{\cal F}$
2572: converges under $P_n$ in distribution, conditional on the data,
2573: to the appropriate limiting Gaussian process. This now implies
2574: $\sqrt{n}(\mu_{\kappa}/\sigma_{\kappa})(\pp_n^{\circ}-\pp_n)\,
2575: \weakpnboot\mathbb{G}$.$\Box$
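
{\it Remark.} The algebra behind the decomposition~(\ref{s9.t2.t1.e3})
can be checked directly, taking $\bar{\kappa}$ to be the sample mean
$n^{-1}\sum_{i=1}^n\kappa_i$ of the weights (our reading of the notation).
First, $\sum_{i=1}^n(\kappa_i/\bar{\kappa}-1)=0$, so replacing
$\Delta_{X_i}$ by $\Delta_{X_i}-P$ leaves the sum unchanged. Second, for
each $i$,
\[\frac{\kappa_i}{\bar{\kappa}}-1
=\frac{\kappa_i-\mu_{\kappa}}{\bar{\kappa}}
+\left(\frac{\mu_{\kappa}}{\bar{\kappa}}-1\right)
=\frac{\sigma_{\kappa}}{\mu_{\kappa}}\tilde{\kappa}_i
+\left(\frac{\sigma_{\kappa}}{\bar{\kappa}}
-\frac{\sigma_{\kappa}}{\mu_{\kappa}}\right)\tilde{\kappa}_i
+\left(\frac{\mu_{\kappa}}{\bar{\kappa}}-1\right),\]
and the three terms on the right correspond, in order, to the three sums
in~(\ref{s9.t2.t1.e3}). Because $\bar{\kappa}\rightarrow\mu_{\kappa}$
almost surely, the coefficients of the second and third sums converge to
zero, which is why only the first sum contributes to the limit.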
2576:
2577: {\it Proof of corollary~\ref{c3}.} Assume at first
2578: that $\tilde{M}_n$ is a fixed
2579: number $\tilde{M}<\infty$. Theorem~\ref{s9.t2} now yields that the
2580: collection $\{\hat{S}_{1,1}^{\circ}-\hat{S}_1,\ldots,
2581: \hat{S}_{1,\tilde{M}_n}^{\circ}-\hat{S}_1\}$ converges jointly, conditionally
2582: on the data, to $\tilde{M}$ i.i.d. copies of $\mathbb{Z}_{\ast}$.
2583: Thus $\hat{V}_n$ converges weakly to the sample covariance
2584: process (divided by $\tilde{M}_n$ instead of $\tilde{M}_n-1$)
2585: of an i.i.d. sample of $\tilde{M}_n$ copies of
2586: $\mathbb{Z}_{\ast}$. The same result holds true if we allow
2587: $\tilde{M}_n$ to go to~$\infty$ slowly enough. Since the
2588: Gaussian processes involved are tight, $\hat{V}_n$ will thus
2589: be consistent for $\Sigma_{\ast}$, uniformly over $\zeta\in[a,b]$.
2590: Similar arguments yield pointwise consistency of $\hat{\mathbb{F}}$
2591: and $\tilde{\mathbb{F}}$ at continuity points of
2592: $\hat{\mathbb{T}}_{\ast}$ and $\tilde{\mathbb{T}}_{\ast}$.
2593: Since it is not hard to verify that
2594: both $\hat{\mathbb{T}}_{\ast}$ and $\tilde{\mathbb{T}}_{\ast}$
2595: have continuous distributions, the pointwise consistency extends
2596: to the desired uniform consistency.$\Box$
2597:
2598: \section*{Acknowledgments}
2599: The authors thank Editor Morris Eaton, an associate editor, and two
2600: referees for their extremely
2601: careful review and helpful suggestions that led
2602: to an improved paper.
2603:
\begin{thebibliography}{99}
2605:
2606: \bibitem{a01}
2607: {\sc Andrews, D. W. K.} (2001). Testing when a parameter is on the
2608: boundary of the maintained hypothesis. {\it Econometrica} {\bf
69}, 683--734.
2610:
2611: \bibitem{ap94}
{\sc Andrews, D. W. K., and Ploberger, W.} (1994). Optimal
2613: tests when a nuisance parameter is present only under the
2614: alternative. {\em Econometrica} {\bf 62}, 1383--1414.
2615:
2616: \bibitem{bn04}{\sc Bagdonavi\v{c}ius, V., and Nikulin, M.} (2004).
2617: Statistical modeling in survival analysis and its influence on the
2618: duration analysis. {\it Advances in survival analysis}, 411--429,
2619: {\it Handbook of Statistics, 23}. Elsevier, Amsterdam.
2620:
2621: \bibitem{bkrw98}
2622: {\sc Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner,
2623: J. A.} (1998). {\it Efficient and Adaptive Estimation for
2624: Semiparametric Models}. Springer-Verlag, New York.
2625:
2626: \bibitem{bd81}{\sc Bickel, P. J., and Doksum, K. A.} (1981). An analysis of
2627: transformations revisited. {\em Journal of the American
2628: Statistical Association} {\bf 76}, 296--311.
2629:
2630: \bibitem{br97}{\sc Bickel, P. J., and Ritov, Y.} (1997). Local asymptotic
2631: normality of ranks and covariates in transformation models.
2632: {\em Festschrift for Lucien Le Cam: Research papers in probability
2633: and statistics}, 43--54.
2634:
2635: \bibitem{bc64}{\sc Box, G. E. P., and Cox, D. R.} (1964).
2636: An analysis of transformations. (With discussion)
2637: {\em Journal of the Royal Statistical Society}, Series B
2638: {\bf 26}, 211--252.
2639:
2640: \bibitem{bc82}{\sc Box, G. E. P., and Cox, D. R.} (1982).
2641: An analysis of transformations revisited, rebutted.
2642: {\em Journal of the American Statistical Association}
2643: {\bf 77}, 209--210.
2644:
2645: \bibitem{c89}{\sc Chappell, R.} (1989). Fitting bent lines to data, with
2646: applications to allometry. {\it Journal of Theoretical Biology}
{\bf 138}, 235--256.
2648:
2649: \bibitem{cwy95}{\sc Cheng, S. C., Wei, L. J., and Ying, Z.} (1995). Analysis
2650: of transformation models with censored data. {\em Biometrika} {\bf 82},
2651: 835--845.
2652:
2653: \bibitem{cwy97}{\sc Cheng, S. C., Wei, L. J., and Ying, Z.} (1997). Predicting
2654: survival probabilities with semiparametric transformation models. {\it Journal
2655: of the American Statistical Association} {\bf 92}, 227--235.
2656:
2657: \bibitem{dd88}{\sc Dabrowska, D.M. and Doksum, K.A.} (1988). Estimation
2658: and Testing in the Two-sample Generalized Odds-Rate Model. {\it
2659: Journal of the American Statistical Association} {\bf 83}, 1--23.
2660:
2661: \bibitem{d87}{\sc Davies, R. B.} (1987). Hypothesis testing when a nuisance
2662: parameter is present only under the alternative. {\em Biometrika}
2663: {\bf 74}, 33--43.
2664:
2665: \bibitem{fyw98}
2666: {\sc Fine, J. P., Ying, Z., and Wei, L. J.} (1998). On the linear
2667: transformation model for censored data. {\em Biometrika} {\bf 85}, 980--986.
2668:
2669: \bibitem{ih81}{\sc Ibragimov, I. A., and Has'minskii, R. Z.} (1981). {\em
2670: Statistical estimation: Asymptotical theory}. Springer, New York.
2671:
2672: \bibitem{kta}{\sc Kosorok, M. R.} (To appear). {\em Introduction to Empirical
2673: Processes and Semiparametric Inference}. Springer, New York.
2674:
2675: \bibitem{klf04}{\sc Kosorok, M. R., Lee, B. L. and Fine, J. P.} (2004). Robust
2676: Inference for Univariate Proportional Hazards Frailty Regression
Models. {\it The Annals of Statistics} {\bf 32}, 1448--1491.
2678:
2679: \bibitem{lsl90}{\sc Liang, K.-Y., Self, S. G., and Liu, X.} (1990). The Cox
2680: proportional hazards model with change point: An epidemiologic
2681: application. {\em Biometrics} {\bf 46}, 783--793.
2682:
2683: \bibitem{ly93}{\sc Lin, D. Y. and Ying, Z.} (1993). Cox regression with
2684: incomplete covariate measurements. {\it Journal of the American Statistical
2685: Association} {\bf 88}, 1341--1349.
2686:
2687: \bibitem{lb97}{\sc Luo, X. and Boyett, J. M.} (1997). Estimation of a
threshold parameter in Cox regression. {\it Communications in
Statistics--Theory and Methods} {\bf 26}, 2329--2346.
2690:
2691: \bibitem{ltc97}{\sc Luo, X., Turnbull, B.W. and Clark, L.C.} (1997).
2692: Likelihood ratio tests for a changepoint with survival data. {\it
Biometrika} {\bf 84}, 555--565.
2694:
2695: \bibitem{mrv97}
2696: {\sc Murphy, S. A., Rossini, A. J., and van der Vaart, A. W.} (1997).
2697: Maximum likelihood estimation in the
2698: proportional odds model. {\it Journal of the American Statistical
2699: Association} {\bf 92}, 968--976.
2700:
2701: \bibitem{p98}{\sc Parner, E.} (1998). Asymptotic theory for the correlated
2702: gamma-frailty model. {\em Annals of Statistics} {\bf 26}, 183--214.
2703:
\bibitem{p82}{\sc Pettitt, A. N.} (1982). Inference for the linear model using
2705: a likelihood based on ranks. {\em Journal of the Royal Statistical
2706: Society}, Series B {\bf 44}, 234--243.
2707:
\bibitem{p84}{\sc Pettitt, A. N.} (1984). Proportional odds models for survival
2709: data and estimates using ranks. {\em Applied Statistics} {\bf 33}, 169--175.
2710:
2711: \bibitem{pr94}{\sc Politis, D. N., and Romano, J. P.} (1994). Large sample
2712: confidence regions based on subsamples under minimal assumptions.
2713: {\em Annals of Statistics} {\bf 22}, 2031--2050.
2714:
\bibitem{p03}{\sc Pons, O.} (2003). Estimation in a Cox regression model
2716: with a change-point according to a threshold in a covariate. {\it
2717: The Annals of Statistics} {\bf 31}, 442--463.
2718:
2719: \bibitem{stg98}
2720: {\sc Scharfstein, D. O., Tsiatis, A. A., and Gilbert, P. B.} (1998).
2721: Semiparametric efficient estimation in the generalized odds-rate class
2722: of regression models for right-censored time-to-event data.
2723: {\em Lifetime Data Analysis} {\bf 4}, 355--391.
2724:
2725: \bibitem{s98}{\sc Shen, X.} (1998). Proportional odds regression and sieve
2726: maximum likelihood estimation. {\em Biometrika} {\bf 85}, 165--177.
2727:
2728: \bibitem{sv04}{\sc Slud, E. V., and Vonta, F.} (2004). Consistency of the
2729: NPML estimator in the right-censored transformation model.
2730: {\em Scandinavian Journal of Statistics} {\bf 31}, 21--41.
2731:
2732: \bibitem{v98}{\sc van der Vaart, A. W.} (1998). {\em Asymptotic Statistics}.
2733: Cambridge University Press, Cambridge.
2734:
2735: \bibitem{vw96}{\sc van der Vaart, A. W., and Wellner, J. A.} (1996). {\it
2736: Weak Convergence and Empirical Processes: With Applications to
2737: Statistics.} Springer, New York.
2738:
2739: \bibitem{vw00}
2740: {\sc van der Vaart, A. W., and Wellner, J. A.} (2000). Preservation
2741: theorems for Glivenko-Cantelli and Uniform Glivenko-Cantelli classes.
{\em High Dimensional Probability II}, 113--132. Birkh\"auser, Boston.
2743:
2744: \end{thebibliography}
2745:
2746: \end{document}
2747: