Biomedical Engineering Reference
with that for product w9k001. OC curves for the LPM/SPM ratio have steep sides with
a wide, flat top compared to the grouped-stage OC curves, and the pattern of incorrect
decisions for grouped stages occurs over a much broader range of the acceptable
region than for the EDA approach.
8.5.3 PCA Approach
8.5.3.1 Overview
In this section, a multivariate approach is presented for deciding whether a set of
APSD profiles is similar or dissimilar to the original population. In multivariate
terminology, the original population is called the training set, and the new
measurements compared against it are called the prediction set. In the case presented
in this chapter, the training set consisted of 252 NGI cascade impaction measurements,
each normalized to the total impacted mass to account for dose differences. This
population of APSD profiles reflected the typical product, process, and analytical
variability expected from a product in late-stage development. The approach presented
here is based on a principal component analysis (PCA) model. The ability of this
multivariate technique to detect differences or changes in a set of data was then
compared with that of an approach based on EDA and grouped stages, and summary
conclusions were drawn from the comparison.
A PCA of the 252 NGI measurements (Sect. 8.2), used to build the model, gave
very good results, with >90% of the variability captured in the first two
components (Fig. 8.63).
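The fraction of variability captured by each component can be illustrated with a
minimal sketch. The data below are synthetic stand-ins, not the 252 NGI measurements;
the eight "stages", the two latent directions d1 and d2, and the choice of mean
centering without unit-variance scaling are all assumptions made for illustration.

```python
import numpy as np

def explained_variance_ratio(profiles):
    """Fraction of total variance captured by each principal component."""
    X = np.asarray(profiles, dtype=float)
    # Normalize each profile to its own total so that dose differences cancel out.
    X = X / X.sum(axis=1, keepdims=True)
    # Mean-center each stage (assumption: centering only, no unit-variance scaling).
    Xc = X - X.mean(axis=0)
    # Squared singular values are proportional to the component variances.
    s = np.linalg.svd(Xc, compute_uv=False)
    return s**2 / np.sum(s**2)

# Synthetic training set: 252 profiles over 8 hypothetical stages.
rng = np.random.default_rng(0)
base = np.array([0.05, 0.10, 0.20, 0.25, 0.20, 0.10, 0.06, 0.04])
d1 = np.array([-1, -1, -1, -1, 1, 1, 1, 1]) / np.sqrt(8)   # coarse-to-fine shift
d2 = np.array([1, -1, -1, 1, 1, -1, -1, 1]) / np.sqrt(8)   # second latent pattern
profiles = (base
            + rng.normal(0, 0.02, (252, 1)) * d1
            + rng.normal(0, 0.01, (252, 1)) * d2
            + rng.normal(0, 0.002, (252, 8)))               # measurement noise

ratio = explained_variance_ratio(profiles)
print("variance captured by first two components:", ratio[:2].sum())
```

Because the synthetic profiles are driven by two latent patterns plus small noise,
most of the variance falls in the first two components, which mirrors the qualitative
behavior described for the real data set.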
A quantitative measure of the goodness of fit of the PCA model is given by the
statistic R²X, which indicates how well the model explains the variation in the
252 measurements. A second measure, Q², reflects the ability of the PCA model to
predict data not used to build it. These predictions are assessed either internally,
using the existing data (cross-validation), or with an independent validation set of
observations (a prediction set). In this case, a high value of Q² was also obtained
from an internal validation, indicating that the model was fit for purpose.
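For reference, the two statistics are conventionally defined as shown below. This is
a sketch in common chemometrics notation; the symbols are assumptions introduced here
and are not taken from the chapter. The element x_{ik} is the preprocessed (centered)
value of stage k in profile i, \hat{x}_{ik} is its reconstruction from the fitted
components, and \hat{x}_{ik}^{cv} is its prediction from a model built with that
element held out.

```latex
R^2X = 1 - \frac{\sum_{i,k} \left( x_{ik} - \hat{x}_{ik} \right)^2}{\sum_{i,k} x_{ik}^2},
\qquad
Q^2 = 1 - \frac{\mathrm{PRESS}}{\sum_{i,k} x_{ik}^2},
\qquad
\mathrm{PRESS} = \sum_{i,k} \left( x_{ik} - \hat{x}_{ik}^{\mathrm{cv}} \right)^2
```

R²X can only increase as components are added, whereas Q² levels off or falls once
additional components start to model noise, which is why the two are read together.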
A Hotelling T² ellipse [19] (the multivariate analogue of the univariate
t-distribution) was established on the scores plot of the data set at the 99%
confidence level, as indicated in Fig. 8.64. The influence of the individual NGI
stages on the model can be seen in the loadings plot in Fig. 8.65.
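A minimal sketch of how a T² limit can be used to flag profiles follows. It assumes a
two-component model and a commonly used F-distribution form of the limit,
A(N² − 1) / (N(N − A)) · F(A, N − A); the chapter does not state which formula was
used, so this form is an assumption. The data are synthetic, and the helper names
fit_pca, hotelling_t2, and t2_limit are hypothetical.

```python
import numpy as np
from scipy.stats import f

def fit_pca(X, n_components=2):
    """Return the column means, loadings, and score variances of a centered PCA."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    loadings = Vt[:n_components].T              # shape (n_stages, n_components)
    scores = (X - mean) @ loadings
    return mean, loadings, scores.var(axis=0, ddof=1)

def hotelling_t2(X, mean, loadings, score_var):
    """T² of each row: sum over components of squared score / score variance."""
    scores = (np.atleast_2d(X) - mean) @ loadings
    return np.sum(scores**2 / score_var, axis=1)

def t2_limit(n_obs, n_components, confidence=0.99):
    """Assumed F-based critical value for T² with N observations, A components."""
    A, N = n_components, n_obs
    return A * (N**2 - 1) / (N * (N - A)) * f.ppf(confidence, A, N - A)

# Synthetic training set: 252 profiles over 8 hypothetical stages.
rng = np.random.default_rng(1)
base = np.array([0.05, 0.10, 0.20, 0.25, 0.20, 0.10, 0.06, 0.04])
d1 = np.array([-1, -1, -1, -1, 1, 1, 1, 1]) / np.sqrt(8)
d2 = np.array([1, -1, -1, 1, 1, -1, -1, 1]) / np.sqrt(8)
train = (base
         + rng.normal(0, 0.02, (252, 1)) * d1
         + rng.normal(0, 0.01, (252, 1)) * d2
         + rng.normal(0, 0.002, (252, 8)))
train = train / train.sum(axis=1, keepdims=True)

mean, loadings, score_var = fit_pca(train)
limit = t2_limit(len(train), 2)
inside = hotelling_t2(train, mean, loadings, score_var) <= limit

# A hypothetical prediction-set profile shifted strongly toward the later stages.
shifted = base + 0.1 * d1
shifted = shifted / shifted.sum()
t2_new = hotelling_t2(shifted, mean, loadings, score_var)

print("training profiles inside ellipse:", inside.mean())
print("shifted profile outside ellipse:", bool(t2_new[0] > limit))
```

In the two-dimensional scores plot, the condition T² ≤ limit is the interior of an
ellipse whose semi-axes scale with the standard deviations of the two score vectors.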
The loadings plot is a scatter plot of the loadings, or weights (p[1] versus p[2]),
applied to each of the individual stages to construct the principal components. The
farther a data point lies from the origin, the more influence that stage has on the
value of the principal component, and thus the more of the overall variation in the
data set is explained by changes in that stage. For example, from the loadings plot
in Fig. 8.65,