5.6.2 SAS MISSING VALUES ANALYSIS
Most SAS statistical procedures exclude from the analysis any case that
has a missing value on any variable in the analysis. These cases are
called “incomplete.” Limiting your analysis to complete cases is
potentially problematic because the information contained in the
incomplete cases is lost. Additionally, analyzing only complete cases
disregards any possible systematic differences between the complete
cases and the incomplete cases. The results may not be generalizable to
the population of all cases, especially with a small number of complete
cases.
One method for handling missing data is single imputation, which
substitutes a value for each missing value. For example, each missing
value can be imputed with the variable mean of the complete cases, or it
can be imputed with the mean conditional on observed values of other
variables. This approach treats missing values as if they were known in the
complete data analysis. However, single imputation does not reflect the
uncertainty about the predictions of the unknown missing values, and
the resulting estimated variances of the parameter estimates will be biased
toward zero (Rubin, 1987, p. 13).
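The shrinkage in estimated variance can be seen in a small numerical sketch. The example below is written in Python rather than SAS, and the four observed scores and the count of two missing values are hypothetical figures chosen only for illustration. It fills each missing value with the mean of the complete cases and compares the sample variance before and after.

```python
import statistics

# Hypothetical data: four complete cases and two cases with a missing score.
observed = [2.0, 4.0, 6.0, 8.0]
n_missing = 2

# Single (mean) imputation: every missing value receives the observed mean.
fill_value = statistics.mean(observed)
imputed = observed + [fill_value] * n_missing

# Sample variance (n - 1 denominator) before and after imputation.
var_observed = statistics.variance(observed)   # 20 / 3
var_imputed = statistics.variance(imputed)     # 20 / 5

# Estimated variance of the sample mean (variance divided by n).
var_mean_observed = var_observed / len(observed)
var_mean_imputed = var_imputed / len(imputed)

print("variance, complete cases:", round(var_observed, 3))
print("variance, after mean imputation:", round(var_imputed, 3))
print("variance of the mean, complete cases:", round(var_mean_observed, 3))
print("variance of the mean, after imputation:", round(var_mean_imputed, 3))
```

The imputed values sit exactly at the mean, so they add nothing to the sum of squared deviations while increasing the apparent sample size. Both effects push the estimated variance downward.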
SAS uses a multiple imputation (MI) procedure (Rubin, 1987, 1996)
that replaces each missing value with a set of plausible values that represent
the uncertainty about the value to impute. MI inference involves three
distinct phases:
1. The missing data are filled in m times to generate m complete data
   sets.
2. The m complete data sets are analyzed by using standard statistical
   analyses.
3. The results from the m complete data sets are combined to produce
   inferential results.
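The three phases can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm that PROC MI uses: the scores are hypothetical, the function names (impute_once, analyze, pool) are made up for this example, and the imputation step simply draws from a normal distribution fitted to the observed values, which ignores uncertainty in the fitted mean and standard deviation. The combining step follows the standard rules given by Rubin (1987), in which the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```python
import random
import statistics

def impute_once(values, rng):
    """Phase 1 (one pass): replace each None with a random draw from a
    normal distribution fitted to the observed values (a simplification)."""
    observed = [v for v in values if v is not None]
    mu = statistics.mean(observed)
    sigma = statistics.stdev(observed)
    return [v if v is not None else rng.gauss(mu, sigma) for v in values]

def analyze(completed):
    """Phase 2: estimate the mean and the variance of that estimate."""
    n = len(completed)
    return statistics.mean(completed), statistics.variance(completed) / n

def pool(estimates, variances):
    """Phase 3: combine the m results using Rubin's rules."""
    m = len(estimates)
    q_bar = statistics.mean(estimates)        # pooled point estimate
    within = statistics.mean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)  # between-imputation variance
    total = within + (1 + 1 / m) * between    # total variance
    return q_bar, total

# Hypothetical scores; None marks a missing value.
scores = [52, 61, None, 48, 70, None, 57, 65]
rng = random.Random(0)  # fixed seed so the run is repeatable
m = 5                   # number of imputed data sets

results = [analyze(impute_once(scores, rng)) for _ in range(m)]
pooled_mean, pooled_var = pool([r[0] for r in results],
                               [r[1] for r in results])
print("pooled mean:", pooled_mean)
print("pooled variance of the mean:", pooled_var)
```

Because the between-imputation variance enters the total, the pooled variance reflects the uncertainty about the missing values that single imputation leaves out.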
To implement multiple imputation, you will need to write SAS syntax that
invokes “PROC MI.” Students interested in using this option should
consult the SAS manual.
5.6.3 SAS OUTLIERS ANALYSIS
To examine the data for outliers, we can obtain box and whiskers plots and
can have SAS list the extreme values in each group. We will deal with these
one at a time. From the main menu select Describe → Summary Statistics
to arrive at the Task Roles screen. As described earlier, place GAFscore
under Analysis variables and therapy under Classification variables in
the Summary Statistics Roles panel. Then click on Plots in the navigation
panel to arrive at the window shown in Figure 5.31. We have selected Box
and whisker as the display we wish to view. Because we specified therapy
as a classification variable, we will obtain separate plots for each of the
therapy groups. Then click Run to perform the analysis.
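For readers who want to see the quantities behind such a display, the following Python sketch computes the quartiles for each group and lists candidate extreme values. The scores and the group labels (brief, long_term) are hypothetical, and the 1.5 × interquartile range cutoff is the common Tukey convention, assumed here because the text does not state which rule SAS applies. Quartile values also depend on the interpolation method, so they may differ slightly from those SAS reports.

```python
import statistics

# Hypothetical GAF scores for two therapy groups (illustrative values only).
gaf_by_therapy = {
    "brief": [50, 52, 55, 57, 58, 60, 62, 95],
    "long_term": [60, 62, 63, 65, 66, 68, 70, 72],
}

def box_summary(scores, k=1.5):
    """Return the quartiles and the scores lying beyond k * IQR from the box."""
    q1, median, q3 = statistics.quantiles(scores, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    flagged = [s for s in scores if s < lower_fence or s > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "flagged": flagged}

# One summary per group, mirroring the separate plot for each therapy group.
summaries = {group: box_summary(scores)
             for group, scores in gaf_by_therapy.items()}

for group, summary in summaries.items():
    print(group, summary)
```

In this made-up data the score of 95 in the first group lies above the upper fence and would be listed as an extreme value, while the second group has none.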
The box and whisker plot is shown in Figure 5.32. The box, actually
a vertically displayed rectangle, spans the range from the first quartile