Statistical Analysis with Missing Data by Roderick J. A. Little

"An important contribution to the applied statistics literature.... I give the book high marks for unifying and making accessible much of the past and current work in this important area."
—William E. Strawderman, Rutgers University

"This book...provide[s] interesting real-life examples, stimulating end-of-chapter exercises, and up-to-date references. It should be on every applied statistician’s bookshelf."
The Statistician

"The book should be studied in the statistical methods department in every statistical agency."
Journal of Official Statistics

Statistical analysis of data sets with missing values is a pervasive problem for which standard methods are of limited value. The first edition of Statistical Analysis with Missing Data has been a standard reference on missing-data methods. Now, reflecting extensive developments in Bayesian methods for simulating posterior distributions, this second edition by acknowledged experts on the subject offers a thoroughly up-to-date, reorganized survey of current methodology for handling missing-data problems.

Blending theory and application, authors Roderick Little and Donald Rubin review historical approaches to the subject and describe rigorous yet simple methods for multivariate analysis with missing values. They then provide a coherent theory for analysis of problems based on likelihoods derived from statistical models for the data and the missing-data mechanism, and apply the theory to a wide range of important missing-data problems.

The new edition now enlarges its coverage to include:

  • Expanded coverage of Bayesian methodology, both theoretical and computational, and of multiple imputation
  • Analysis of data with missing values where inferences are based on likelihoods derived from formal statistical models for the data-generating and missing-data mechanisms
  • Applications of the approach in a variety of contexts including regression, factor analysis, contingency table analysis, time series, and sample survey inference
  • Extensive references, examples, and exercises

Amstat News asked three review editors to rate their top five favorite books in the September 2003 issue. Statistical Analysis With Missing Data was among those chosen.

On the other hand, if we have little knowledge about the form of the uncensored distribution, we cannot say whether the data are a censored sample from a symmetric distribution or a random subsample from an asymmetric distribution. In the former case, the sample mean is biased for the population mean; in the latter case it is not. 10. Historical Heights. Wachter and Trussell (1982) present an interesting illustration of stochastic censoring, involving the estimation of historical heights. The distribution of heights in historical populations is of considerable interest in the biomedical and social sciences, because of the information it provides about nutrition, and hence indirectly about living standards.
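The effect of the missing-data mechanism on the sample mean can be illustrated with a small simulation. The following Python sketch is not taken from the book: the normal height distribution, its mean and standard deviation, the cutoff, and the function name `compare_means` are all hypothetical choices. To isolate the effect of the mechanism, the random subsample here is drawn from the same symmetric population as the censored sample.

```python
import random


def compare_means(n=20000, mu=170.0, sigma=7.0, cutoff=165.0, seed=0):
    """Compare a censored sample with a random subsample of the same size.

    All parameter values are hypothetical and chosen only for illustration.
    Returns (bias of censored-sample mean, bias of random-subsample mean).
    """
    rng = random.Random(seed)
    # Symmetric (normal) population of heights.
    population = [rng.gauss(mu, sigma) for _ in range(n)]
    pop_mean = sum(population) / n

    # Censored sample: only values at or above the cutoff are recorded.
    censored = [h for h in population if h >= cutoff]
    # Random subsample of the same size: values are dropped at random.
    subsample = rng.sample(population, len(censored))

    cens_bias = sum(censored) / len(censored) - pop_mean
    sub_bias = sum(subsample) / len(subsample) - pop_mean
    return cens_bias, sub_bias


cens_bias, sub_bias = compare_means()
print(f"bias of censored-sample mean:  {cens_bias:+.2f}")
print(f"bias of random-subsample mean: {sub_bias:+.2f}")
```

Under these assumptions the censored-sample mean sits well above the population mean, while the random-subsample mean stays close to it. The observed values alone do not reveal which mechanism produced them, which is why knowledge of the form of the uncensored distribution matters.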

It is applied to the case of multivariate normal data in Chapter 11. The resulting algorithm is particularly instructive, because it is closely related to an iterative version of a method that imputes estimates of the missing values by regression. Thus even in this complex problem, a link can be established between efficient model-based methods and more traditional pragmatic approaches based on substituting reasonable estimates for missing values. Chapter 11 also presents more esoteric uses of the EM algorithm to handle problems such as variance components models, factor analysis, and time series, which can be viewed as missing-data problems for multivariate normal data with specialized parametric structure.
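The link between EM and regression imputation can be seen in the simplest special case. The sketch below is an illustration under stated assumptions, not the book's algorithm for general patterns: it treats bivariate normal data in which `y1` is fully observed and `y2` is missing for some cases. The function name `em_bivariate_normal`, the use of `None` to mark missing values, and the toy data are all hypothetical.

```python
def em_bivariate_normal(y1, y2, n_iter=500):
    """EM for a bivariate normal; y1 fully observed, y2 may contain None.

    Returns ML estimates (mu1, mu2, s11, s12, s22), with variances
    computed using divisor n.
    """
    n = len(y1)
    observed = [b for b in y2 if b is not None]

    # y1 is complete, so its ML mean and variance never change.
    mu1 = sum(y1) / n
    s11 = sum((a - mu1) ** 2 for a in y1) / n

    # Starting values for the y2 parameters from the observed y2 values.
    mu2 = sum(observed) / len(observed)
    s22 = sum((b - mu2) ** 2 for b in observed) / len(observed)
    s12 = 0.0

    for _ in range(n_iter):
        # E-step: expected sufficient statistics given current parameters.
        beta = s12 / s11                 # slope of y2 on y1
        resid_var = s22 - beta * s12     # residual variance of y2 given y1
        t2 = t22 = t12 = 0.0
        for a, b in zip(y1, y2):
            if b is None:
                e_b = mu2 + beta * (a - mu1)   # regression prediction
                e_bb = e_b ** 2 + resid_var    # plus residual variance
            else:
                e_b, e_bb = b, b * b
            t2 += e_b
            t22 += e_bb
            t12 += a * e_b

        # M-step: update parameters from the expected statistics.
        mu2 = t2 / n
        s22 = t22 / n - mu2 ** 2
        s12 = t12 / n - mu1 * mu2

    return mu1, mu2, s11, s12, s22


# Made-up toy data: y2 is missing for the last three cases.
y1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y2 = [1.2, 1.9, 3.2, 3.8, 5.1, None, None, None]
mu1, mu2, s11, s12, s22 = em_bivariate_normal(y1, y2)
print(mu1, mu2, s12 / s11)
```

Each E-step replaces a missing `y2` by its regression prediction from `y1`, exactly as an iterative regression-imputation method would. The difference is the residual-variance term added to the expected squares, which keeps the variance estimate from being understated. In this monotone pattern the iteration settles on the slope fitted to the complete cases, and the estimated mean of `y2` is the complete-case regression line evaluated at the overall mean of `y1`.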

Approaches to measuring and incorporating imputation uncertainty are discussed in Chapter 5, including multiple imputation, which reappears in Parts II and III in the context of model-based methods. 4. Model-Based Procedures. A broad class of procedures is generated by defining a model for the observed data and basing inferences on the likelihood or posterior distribution under that model, with parameters estimated by procedures such as maximum likelihood. Advantages of this approach are flexibility; the avoidance of ad hoc methods, in that model assumptions underlying the resulting methods can be displayed and evaluated; and the availability of estimates of variance that take into account incompleteness in the data.

