
\begin{abstract}

Several factors make clustering of functional data challenging, including the infinite-dimensional space to which observations belong and the lack of a defined probability density function for the functional random variable. To overcome these barriers, researchers either assume that observations belong to a finite-dimensional space spanned by basis functions or apply nonparametric smoothing methods to the functions prior to clustering. Although an extensive literature describes clustering methods for functional data, few studies have explored the clustering of error-prone functional data. In this work, we consider functional data subject to complex, heteroscedastic measurement errors and propose a two-stage clustering approach. In the first stage, clustered mixed effects models are applied to adjust for measurement error bias. In the second stage, cluster analysis is performed on the measurement error–adjusted curves using readily available methods such as K-means and mclust. Through simulations, we investigate how sample size, the magnitude of measurement error, and the correlation structure of the measurement errors influence the clustering of error-prone data. Our results indicate that failing to account for measurement errors and the correlation structures associated with frequently collected functional data reduces the accuracy of identifying the true latent groups or clusters. The developed methods are applied to two data sets: a school-based study of energy expenditure among elementary school-aged children in Texas, and data from the National Health and Nutrition Examination Survey on participants’ physical activity monitored by wearable devices at frequent intervals.

\end{abstract}
