Research
Imaging data integration
The aim of this project is to develop regression modeling frameworks that handle the study-level heterogeneity found in many large-scale imaging studies, such as the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and the UK Biobank. Despite the numerous successes of these studies, study-level heterogeneity, which may arise from differences in study environment, population, design, and protocols, poses major challenges for the integrative analysis of imaging data collected from multiple centers or studies. We propose estimation and inference procedures for estimating the unknown parameters and detecting the unknown confounding factors, and we systematically investigate the asymptotic properties of both.
Abnormal region detection
Magnetic resonance imaging (MRI) has become an important imaging technique for quantifying the spatial location and magnitude/direction of longitudinal cartilage morphology changes in diseased patients. Although several analytical methods, such as subregion-based analysis, have been developed to refine and improve quantitative cartilage analyses, they can be suboptimal due to two major issues: the lack of spatial correspondence across subjects and time and the dynamic spatial heterogeneity of cartilage progression across subjects. The aim of this project is to present a statistical method for longitudinal cartilage quantification in diseased patients, while addressing these two issues.
Manifold data analysis
Clustering is one of the fundamental tools in manifold learning, and it has been extensively studied in many applications. However, in many image analysis problems (e.g., directional data analysis, shape analysis), most existing clustering methods were established for Euclidean space and face several challenges: the data lie in a symmetric space, the feature space is high dimensional, and the variation in the manifold data is associated with covariates. To address these challenges, a penalized model-based clustering framework is developed to cluster high dimensional manifold data in symmetric spaces. Specifically, the proposed methods define the mixing proportions through a logistic model and use a Riemannian normal distribution in each component for data in symmetric spaces. A geodesic factor analyzer is established to explicitly model the high dimensional features. Penalized likelihood approaches are used for variable selection.
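As a rough illustration of the model structure described above, a covariate-dependent mixture on a manifold can be written in generic notation as follows. All symbols here are assumptions introduced for illustration and are not taken from the project: $K$ is the number of clusters, $x$ a covariate vector, $\alpha_k$ the logistic coefficients, $d(\cdot,\cdot)$ the geodesic distance, $\mu_k$ and $\sigma_k$ the center and dispersion of an isotropic Riemannian normal component, $C(\mu_k, \sigma_k)$ its normalizing constant, and $p_\lambda$ a generic sparsity penalty on the parameters $\theta$. The geodesic factor analyzer is omitted from this sketch.

```latex
\begin{align*}
f(y \mid x) &= \sum_{k=1}^{K} \pi_k(x)\, f_k(y; \mu_k, \sigma_k), \\
\pi_k(x) &= \frac{\exp(x^\top \alpha_k)}{\sum_{j=1}^{K} \exp(x^\top \alpha_j)}, \\
f_k(y; \mu_k, \sigma_k) &= \frac{1}{C(\mu_k, \sigma_k)}
  \exp\!\left\{ -\frac{d(y, \mu_k)^2}{2\sigma_k^2} \right\}, \\
\hat{\theta} &= \arg\max_{\theta} \left\{ \sum_{i=1}^{n} \log f(y_i \mid x_i)
  - n \sum_{j} p_\lambda(|\theta_j|) \right\}.
\end{align*}
```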
Mediation analysis & Imaging genetics
Causal mediation analysis is widely used in neuroscience to investigate the role of brain image phenotypes in the neurological pathways from genetic exposures to clinical outcomes. However, it is still difficult to conduct a genome-wide mediation analysis with the shapes of brain regions as mediators due to several challenges, including (i) large-scale genetic exposures, i.e., millions of single-nucleotide polymorphisms (SNPs); (ii) the nonlinear Hilbert space in which the complex mediators lie; and (iii) statistical inference on the direct and indirect effects. To tackle these challenges, this project proposes a mediation analysis framework with high dimensional genetic exposures and shape mediators. To identify the underlying causal pathways from the detected SNPs to the clinical outcome implicitly through the complex mediators, we propose a framework consisting of an object-on-scalar model and a scalar-on-object model. Furthermore, a bootstrap resampling approach is adopted to assess the significance of both global and local mediation effects.
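The idea of bootstrap inference on an indirect effect can be illustrated with a deliberately simplified sketch. It assumes a single exposure `x`, a single scalar mediator `m`, and linear models fitted by least squares, so it is a stand-in for, not an implementation of, the object-on-scalar and scalar-on-object models with shape mediators described above. The function names `indirect_effect` and `bootstrap_ci` and the simulated data are hypothetical.

```python
import numpy as np


def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a*b for one scalar mediator."""
    # Mediator model: m = a0 + a*x + error
    design_m = np.column_stack([np.ones_like(x), x])
    a = np.linalg.lstsq(design_m, m, rcond=None)[0][1]
    # Outcome model: y = b0 + c*x + b*m + error (c is the direct effect)
    design_y = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(design_y, y, rcond=None)[0][2]
    return a * b


def bootstrap_ci(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    stats = np.empty(n_boot)
    for i in range(n_boot):
        # Resample subjects with replacement and re-estimate a*b
        idx = rng.integers(0, n, size=n)
        stats[i] = indirect_effect(x[idx], m[idx], y[idx])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi


# Toy data (assumed for illustration): true indirect effect is 0.5 * 0.8 = 0.4
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)
y = 0.3 * x + 0.8 * m + rng.normal(size=n)

est = indirect_effect(x, m, y)
lo, hi = bootstrap_ci(x, m, y)
print(f"indirect effect: {est:.3f}, 95% CI: ({lo:.3f}, {hi:.3f})")
```

An interval that excludes zero suggests a nonzero mediation effect in this toy setting. With shape mediators, the scalar regressions would be replaced by models suited to the mediator space.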