Analysis of correlated data with measurement error in responses or covariates

Authors: Zhijian Chen

Overview

Abstract (French)

Please note that abstracts appear only in the language of the publication and may not have a translation.

Abstract (English)

Correlated data frequently arise in epidemiological studies, especially familial and longitudinal studies. Longitudinal designs are used to investigate how certain characteristics change over time at the individual level and how potential factors influence those changes. Familial studies are often designed to investigate the dependence of health conditions among family members. Various models have been developed for this type of multivariate data, and a wide variety of estimation techniques have been proposed. However, data collected from observational studies are often far from perfect, as measurement error may arise from different sources such as defective measuring systems, diagnostic tests without a gold standard, and self-reports. Under such scenarios only rough surrogate variables are measured. Measurement error in covariates in various regression models has been discussed extensively in the literature. It is well known that naive approaches that ignore covariate error often lead to inconsistent estimators of model parameters.
In this thesis, we develop inferential procedures for analyzing correlated data with response measurement error. We consider three scenarios: (i) likelihood-based inference for generalized linear mixed models when the continuous response is subject to nonlinear measurement error; (ii) estimating equations methods for binary responses with misclassification; and (iii) estimating equations methods for ordinal responses when the response variable and categorical/ordinal covariates are subject to misclassification.
The first problem arises when the continuous response variable is difficult to measure. When the true response is defined as the long-term average of measurements, a single measurement is considered an error-contaminated surrogate. We focus on generalized linear mixed models with nonlinear response error and study the induced bias in naive estimates. We propose likelihood-based methods that can yield consistent and efficient estimators for both fixed-effects and variance parameters. Results of simulation studies and an analysis of a data set from the Framingham Heart Study are presented.
Marginal models have been widely used for correlated binary, categorical, and ordinal data. The regression parameters characterize the marginal mean of a single outcome, without conditioning on other outcomes or unobserved random effects. The generalized estimating equations (GEE) approach, introduced by Liang and Zeger (1986), models only the first two moments of the responses, with associations treated as nuisance characteristics. For some clustered studies, especially familial studies, however, the association structure may be of scientific interest. For binary data, Prentice (1988) proposed additional estimating equations that allow one to model pairwise correlations. We consider marginal models for correlated binary data with misclassified responses. We develop “corrected” estimating equations approaches that can yield consistent estimators for both mean and association parameters. The idea is related to that of Nakamura (1990), which was originally developed to correct the bias induced by additive covariate measurement error under generalized linear models. Our approaches can also handle correlated misclassification, rather than the simple misclassification process considered by Neuhaus (2002) for clustered binary data under generalized linear mixed models. We extend our methods and further develop marginal approaches for the analysis of longitudinal ordinal data with misclassification in both responses and categorical covariates. Simulation studies show that the proposed methods perform very well under a variety of scenarios. Results from applying the proposed methods to real data are presented.
Measurement error can be coupled with other features of the data, such as complex survey designs, that complicate inferential procedures. We explore combining survey weights and misclassification in ordinal covariates in logistic regression analyses, and propose an approach that incorporates survey weights into estimating equations to yield design-based unbiased estimators. In the final part of the thesis we outline some directions for future work, such as transition models and semiparametric models for longitudinal data with both incomplete observations and measurement error. Missing data are another common feature in applications, and developing statistical techniques that deal with both missing data and measurement error would be beneficial.
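
The bias that the abstract attributes to naive analyses of misclassified responses can be seen in the simplest possible setting: estimating a single binary proportion. The Python sketch below is only an illustration under strong assumptions and is not the estimating-equations method developed in the thesis. It assumes independent observations and known sensitivity and specificity; the function names and all parameter values are hypothetical.

```python
import random


def simulate_misclassified(n, p_true, sens, spec, seed=0):
    """Draw n true binary responses and return their misclassified versions.

    sens = P(observed 1 | true 1), spec = P(observed 0 | true 0).
    All values are illustrative assumptions, not taken from the thesis.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        y_true = 1 if rng.random() < p_true else 0
        if y_true == 1:
            y_obs = 1 if rng.random() < sens else 0
        else:
            y_obs = 0 if rng.random() < spec else 1
        observed.append(y_obs)
    return observed


def naive_estimate(observed):
    """Proportion of observed 1s, ignoring misclassification."""
    return sum(observed) / len(observed)


def corrected_estimate(observed, sens, spec):
    """Invert E[Y*] = sens * p + (1 - spec) * (1 - p) to recover p."""
    if sens + spec <= 1:
        raise ValueError("sens + spec must exceed 1 for the correction to apply")
    return (naive_estimate(observed) - (1 - spec)) / (sens + spec - 1)


if __name__ == "__main__":
    obs = simulate_misclassified(n=200_000, p_true=0.3, sens=0.9, spec=0.85, seed=1)
    print("naive estimate:    ", naive_estimate(obs))
    print("corrected estimate:", corrected_estimate(obs, sens=0.9, spec=0.85))
```

With these assumed values the naive proportion converges to 0.9 × 0.3 + 0.15 × 0.7 = 0.375 rather than the true 0.3, while the corrected estimate targets 0.3. In practice the misclassification rates are rarely known exactly and the responses are correlated, which is the harder problem the thesis addresses.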

Details

Type	Doctoral thesis
Author	Zhijian Chen
Year of publication	2010
Title	Analysis of correlated data with measurement error in responses or covariates
City	Waterloo, ON
Department	Statistics and Actuarial Science
University	University of Waterloo
Language of publication	English

Related publications

Zhijian Chen, Grace Y. Yi, and Changbao Wu (2014).

Marginal analysis of longitudinal ordinal data with misclassification in both response and covariates

Biometrical Journal, 69-85

Zhijian Chen, Grace Y. Yi, and Changbao Wu (2011).

Marginal methods for correlated binary data with misclassified responses

Biometrika, 647-662

Grace Y. Yi, Zhijian Chen, and Changbao Wu (2016).

Analysis of correlated data with error-prone response under generalized linear mixed models

Natalie Diane Riediger, Shahin Shooshtari, and Mohammed Hassan Moghadasian (2007).

The influence of sociodemographic factors on patterns of fruit and vegetable consumption in Canadian adolescents

Journal of the American Dietetic Association, 1511-1518

Peter M. Smith, Amber Bielecky, and Cameron Mustard (2012).

The relationship between chronic conditions and work-related injuries and repetitive strain injuries in Canada

Journal of Occupational and Environmental Medicine, 841-846

Hassanali Vatanparast, Mona S. Calvo, Timothy J. Green, and Susan J. Whiting (2010).

Despite mandatory fortification of staple foods, vitamin D intakes of Canadian children and adults are inadequate

The Journal of Steroid Biochemistry and Molecular Biology, 301-303

Peter Kitchen, Allison Williams, and James Chowhan (2011).

Walking to work in Canada: Health benefits, socio-economic characteristics and urban-regional variations

BMC Public Health

A. C. Coronado, C. Finley, K. Badovinac, J. Han, J. Niu, and R. Rahal (2018).

Discrepancies between Canadian cancer research funding and site-specific cancer burden

Current Oncology, 338-341

Data used

Canadian Community Health Survey - Nutrition

Cross-sectional surveys, repeated (2004 to 2015)

CCHS

Canadian Research Data Centre Network