Generally, a uniform or random sampling of system states is carried out by varying parameters such as load level, unit availability, exchanges at the borders, component availability, etc., according to their independent probability distributions obtained from projected historical data (Henry et al. 1999, 2004b; Paul and Bell 2004; Lebrevelec et al. 1999; Senroy et al. 2006). Then, various scenarios are simulated for a pre-specified set of contingencies. This stage is generally very tedious and time consuming, as there could be a tremendously large number of combinations of variables [about 5,000-15,000 samples for a statistically valid study (Henry et al. 2004b)]. Therefore, the challenge of producing a high-information-content training database at low computational cost needs to be addressed (Cutsem et al. 1993; Jacquemart et al. 1996; Wehenkel 1997; Dy-Liacco 1997).
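The sampling of system states described above can be sketched as follows. The parameter names, distribution shapes, and numeric ranges here are illustrative assumptions, not values taken from the cited studies:

```python
import random

# Draw one pre-contingency operating state by sampling each operating
# parameter from its own (assumed independent) probability distribution.
# All names, distributions, and ranges below are hypothetical examples.
def sample_operating_state(rng):
    return {
        "load_level": rng.gauss(1.0, 0.05),         # p.u. of forecast load
        "unit_available": rng.random() < 0.95,      # simple forced-outage model
        "border_exchange": rng.uniform(-500, 500),  # MW import/export at border
    }

rng = random.Random(42)
# On the order of 5,000 samples, the lower end of the range the text
# cites for a statistically valid study.
states = [sample_operating_state(rng) for _ in range(5000)]
print(len(states))  # 5000
```

Each sampled state would then be paired with every contingency in the pre-specified set and simulated, which is where the computational burden arises.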
In the open literature, there are resampling methods to retain only the most important instances from an already generated training database (Jiantao et al. 2003; Foody 1999) for classification purposes. But such methods involve huge computational cost in first generating a training database, then identifying the most influential instances, and, if need be, generating more such instances. Genc et al. (2010) proposed an iterative method to identify a priori the most influential region in the operating parameter state space, and then enrich the training database with more instances from the identified high-information-content region to enhance classification performance. In this case, the method proposed to identify the high-information-content region involves heavy computational cost as the dimension of the operating parameter space increases, even beyond 10 parameters.
This chapter proposes to develop an efficient sampling method to generate influential operating conditions that captures high information content for better classification while also reducing computing requirements. In short, the objective is to maximize the information content of the training database while minimizing the computing requirements to generate it. This efficient sampling is constructed using Monte Carlo Variance Reduction (MCVR) techniques. Among the most widely used MCVR methods, the control variate and antithetic variate methods take advantage of the correlation between certain random variables to obtain variance reduction in statistical estimation studies. The stratification and importance sampling methods re-orient the way the random numbers are generated, i.e., they alter the sampling distribution (Ripley 1987; Thisted 1988). The proposed efficient sampling method is constructed using the importance sampling method for its ability to bias the Monte Carlo sampling towards the influential region identified a priori, and to generate samples within the influential region while preserving the original relative likelihood of the operating conditions.
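One simple way to realize the two properties named above, biasing sampling towards an influential region while preserving the original relative likelihood inside it, is to sample from the original distribution conditioned on that region. The sketch below uses rejection sampling for this; the one-dimensional load model and the region bounds are illustrative assumptions, not the chapter's actual method:

```python
import random

# Hypothetical "influential" region on the load-level axis (p.u.).
REGION = (1.05, 1.20)

def in_region(load):
    return REGION[0] <= load <= REGION[1]

def sample_influential(rng, n):
    """Draw n samples from the original N(1.0, 0.05) load distribution
    conditioned on REGION.  Accepted points follow the original density
    restricted to the region, so their relative likelihood within the
    region is exactly that of the original distribution."""
    out = []
    while len(out) < n:
        load = rng.gauss(1.0, 0.05)
        if in_region(load):
            out.append(load)
    return out

rng = random.Random(0)
samples = sample_influential(rng, 200)
print(all(in_region(x) for x in samples))  # True
```

In a full importance-sampling estimator, statistics computed from these biased samples would be re-weighted by the probability mass of the region; the sketch shows only the biased generation step.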
In order to sample the most influential operating conditions, the influential region must first be traced, which requires that the operating parameter state space be characterized with respect to post-contingency performance. A straightforward way to perform state space characterization is to divide the N-dimensional hypercube, where N is the number of selected operating parameters, into M smaller hypercubes, select the center point of each of the M smaller hypercubes, and perform an assessment to identify post-contingency performance (NM contingency simulations). But for large N, there is a curse of dimensionality, resulting in very