Graphics Reference
In-Depth Information
data. But a significative difference between the two methods is attained: while EM
generates a single imputation in each step from the estimated parameters at each
step, MI performs several imputations that yield several complete data sets.
This repeated imputation can be done thanks to the use of Markov Chain Monte
Carlo methods, as the several imputations are obtained by introducing a random
component, usually from a standard normal distribution. In a more advanced fashion,
MI also considers that the parameters estimates are in fact sample estimates. Thus,
the parameters are not directly estimated from the available data but, as the process
continues, they are drawn from their Bayesian posterior distributions given the data at
hand. These assumptions means that only in the case of MCAR or MARmissingness
mechanisms hold MI should be applied.
As a result Eq. ( 4.9 ) can be applied with slight changes as the e term is now a
sample from a standard normal distribution and is applied more than once to obtain
several imputed values for a single MV. As indicated in the previous paragraph,
MI has a Bayesian nature that forces the user to specify a prior distribution for the
parameters
of the model fromwhich the e term is drawn. In practice [ 83 ]isstressed
out that the results depend more on the election of the distribution for the data than
the distribution for
θ
. Unlike the single imputation performed by EMwhere only one
imputed value for eachMV is created (and thus only one value of e is drawn), MI will
create several versions of the data set, where the observed data X obs is essentially the
same, but the imputed values for X mis will be different in each data set created. This
process is usually known as data augmentation (DA) [ 91 ] as depicted in Fig. 4.2 .
Surprisingly not many imputation steps are needed. Rubin claims in [ 80 ] that only
3-5 steps are usually needed. He states that the efficiency of the final estimation built
upon m imputations is approximately:
θ
Fig. 4.2 Multiple imputation process by data augmentation. Every MV denoted by a '?' is replaced
by several imputed and different values that will be used to continue the process later
 
Search WWH ::




Custom Search