Semiparametric Bayesian Density Estimation with Disparate Data Sources: A Meta-Analysis of Global Childhood Undernutrition

Publisher: Journal of the American Statistical Association, vol. 110, no. 511 (subscription required)
Sep 30, 2015
Mariel M. Finucane, Christopher J. Paciorek, Gretchen A. Stevens, and Majid Ezzati
Undernutrition, resulting in restricted growth, and quantified here using height-for-age z-scores, is an important contributor to childhood morbidity and mortality. Since all levels of mild, moderate, and severe undernutrition are of clinical and public health importance, it is of interest to estimate the shape of the z-scores’ distributions. We present a finite normal mixture model that uses data on 4.3 million children to make annual country-specific estimates of these distributions for under-5-year-old children in the world’s 141 low- and middle-income countries between 1985 and 2011. We incorporate both individual-level data when available, as well as aggregated summary statistics from studies whose individual-level data could not be obtained. We place a hierarchical Bayesian probit stick-breaking model on the mixture weights. The model allows for nonlinear changes in time, and it borrows strength in time, in covariates, and within and across regional country clusters to make estimates where data are uncertain, sparse, or missing. This work addresses three important problems that often arise in the fields of public health surveillance and global health monitoring. First, data are always incomplete. Second, different data sources commonly use different reporting metrics. Last, distributions, and especially their tails, are often of substantive interest.