Longitudinal data analysis with composite likelihood methods

Authors: Haocheng Li

Overview

Abstract (English)

Longitudinal data arise in many fields, including public health studies and survey sampling, and valid inference methods for such data are of great importance in scientific research. Longitudinal studies are typically designed to collect the information of interest on each individual at scheduled times. The analysis usually focuses on how the data change over time and how they are associated with certain risk factors or covariates. Various statistical models and methods have been developed over the past few decades. However, these methods can become invalid when the data possess additional features.

First, incomplete data present considerable complications for standard modeling and inference methods. Although we hope that each individual completes all of the scheduled measurements, missing observations occur commonly in longitudinal studies. It has been documented that biased results can arise if missingness is not properly accounted for in the analysis. A large body of literature addresses missingness in either the response or the covariates, but relatively little attention has been directed to missingness in both simultaneously. The scarcity of research on this topic may be attributed to the substantially increased modeling complexity and computational difficulty. In Chapter 2 and Chapter 3 of the thesis, I develop methods to handle incomplete longitudinal data using the pairwise likelihood formulation. The proposed methods can handle longitudinal data with missing observations in both response and covariate variables. A unified framework is invoked to accommodate various types of missing data patterns. The performance of the proposed methods is carefully assessed under a variety of circumstances; in particular, issues of efficiency and robustness are investigated. Longitudinal survey data from the National Population Health Survey are analyzed with the proposed methods.

The second difficulty in longitudinal data analysis is model selection. Incorporating a large number of irrelevant covariates into the model may cause computational, interpretive and predictive difficulties, so selecting a parsimonious model is typically desirable. The penalized likelihood method is commonly employed for this purpose. However, applying the penalized likelihood approach in longitudinal studies may involve high-dimensional integrals that are computationally expensive. I propose an alternative method using the composite likelihood formulation, which requires only a partial structure of the correlated data, such as marginal or pairwise distributions. This strategy is tractable to model and computationally cheap for model selection. Therefore, in Chapter 4 of this thesis, I propose a composite likelihood approach with a penalty function to handle the model selection issue. In practice, model selection involves choosing not only the covariates for the regression predictor but also the random effects components. Furthermore, the specification of the random effects distribution can be crucial to the validity of statistical inference. Chapter 4 therefore also discusses the selection of both covariates and random effects, as well as misspecification of the random effects distribution.

Chapter 5 of this thesis addresses missingness and model selection jointly. I propose a specific composite likelihood method to handle this issue. An advantage of the approach is that the inference procedure requires neither explicit assumptions about the missing data process nor estimation of nuisance parameters.
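
For orientation, the pairwise and penalized composite likelihood ideas mentioned in the abstract can be sketched in generic textbook form. This is not necessarily the exact objective used in the thesis. All notation is assumed here for illustration: $y_{ij}$ is the $j$-th measurement on individual $i$, $m_i$ is the number of measurements on individual $i$, $\theta$ is the full parameter vector with regression coefficients $\beta_1, \dots, \beta_p$, and $p_\lambda$ is a penalty function with tuning parameter $\lambda$.

```latex
% Pairwise log-likelihood: sum of bivariate log-densities over all
% within-individual pairs of measurements (generic form, assumed notation).
\ell_{\mathrm{pair}}(\theta)
  = \sum_{i=1}^{n} \sum_{1 \le j < k \le m_i} \log f(y_{ij}, y_{ik}; \theta)

% Penalized version for covariate selection: subtract a penalty on the
% magnitude of each regression coefficient.
\ell_{\mathrm{pen}}(\theta)
  = \ell_{\mathrm{pair}}(\theta) - n \sum_{r=1}^{p} p_\lambda\bigl(|\beta_r|\bigr)
```

Each term involves only a bivariate density, which is typically of much lower dimension than the full joint density of all measurements on an individual.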

Details

Type	PhD dissertation
Author	Haocheng Li
Publication Year	2012
Title	Longitudinal data analysis with composite likelihood methods
City	Waterloo, ON
Department	Department of Statistics and Actuarial Science
University	University of Waterloo
Publication Language	English

Related Publications

Lynda M. Hayward (2004).

A cohort and gender comparison of mid-life characteristics associated with residential mobility upon retirement

McMaster RDC Research Paper, 5

Alison L. Park, Rebecca Fuhrer, and Amélie Quesnel-Vallée (2013).

Scolarité des parents et risque de dépression chez les jeunes adultes

Scott B. Patten, Jeanne V. A. Williams, Dina H. Lavorato, Kirsten M. Fiest, Andrew G. M. Bulloch, and JianLi Wang (2015).

The prevalence of major depression is not changing

Canadian Journal of Psychiatry, 31-34

K. Dechant (2005).

Linking fitness and holistic medicine: Using growth models to correlate adult Canadians' individual physical activity and use of holistic medicine

A. Chambers (2004).

A comparison of the predictors of heart health among immigrants and native-born Canadians

Julie Fournier (2007).

L'effet de l'horaire de travail par quarts rotatifs sur la détresse psychologique: Une étude longitudinale

J. Wang (2004).

A longitudinal population-based study of treated and untreated major depression

Medical Care, 543-550

Rania A. Wasfi, Kaberi Dasgupta, Heather Orpana, and Nancy A. Ross (2016).

Neighborhood walkability and body mass index trajectories: Longitudinal study of Canadians

American Journal of Public Health, 934-940

Data Used

National Population Health Survey - Household Component - Longitudinal

Longitudinal, Survey (1994 to 2011)

NPHS

Research Data Centre(s)

SWORDC (Waterloo)

SWO

Canadian Research Data Centre Network