\begin{abstract}
Ultra-high dimensional longitudinal data are increasingly common,
and their analysis is challenging both theoretically and methodologically.
We propose a new automatic procedure for finding
a sparse semivarying coefficient model, a class of models widely accepted for
longitudinal data analysis.
The proposed method first reduces the number of covariates to a moderate
order through a screening procedure, then identifies both
the varying and \mbox{constant} coefficients using a group SCAD estimator,
and finally refines this estimator by
accounting for the within-subject correlation.
The screening procedure is based on working independence and B-spline
marginal models.
Under weaker conditions than those in the literature, we show that, with
high probability, only
irrelevant variables are screened out and the number of selected
variables is bounded by a moderate order. This makes possible the desirable
sparsity and oracle properties of the subsequent structure
identification step. %It also marks the significance of our theory and
%methodology,
Existing methods require some form of iterative screening to
achieve this;
they therefore demand heavy computational effort, and their consistency is not
guaranteed. %We prove that our group SCAD estimator detects the
%constant and varying effects simultaneously.
The refined semivarying coefficient model employs
profile least squares, local linear smoothing, and nonparametric
covariance estimation,
and is semiparametrically efficient.
We also suggest ways to implement
the proposed methods and to select the tuning parameters. We summarize an
extensive simulation study
that demonstrates the finite-sample performance of the proposed procedure, and
we analyze the yeast cell cycle data.
34: yeast cell cycle data is analyzed.

35: \end{abstract}