They consist of neurons organized in layers. The input layer contains the predic-
tors or input neurons. The output layer includes the target field. These models
estimate weights that connect predictors (input layer) to the output. Models
with more complex topologies may also include intermediate (hidden) layers
of neurons. The training procedure is an iterative process. Input records,
with known outcomes, are presented to the network and model prediction is
evaluated with respect to the observed results. Observed errors are used to
adjust and optimize the initial weight estimates. Neural networks are considered
opaque or "black box" solutions since they do not explain their predictions.
They provide only a sensitivity analysis, which summarizes the predictive
importance of the input fields. They require minimum statistical knowledge
but, depending on the problem, may require a long processing time for training.
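The iterative weight adjustment described above can be sketched with a single output neuron and no hidden layer. This is a minimal illustration rather than a full network: the sigmoid activation, the learning rate, the number of passes, and the four training records are all assumptions chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, inputs):
    """Weighted sum of the input fields passed through the activation."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train(records, n_inputs, epochs=2000, rate=0.5):
    """Present records repeatedly and adjust the weights from the errors."""
    weights = [0.0] * n_inputs      # initial weight estimates (assumed zero)
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in records:
            error = target - predict(weights, bias, inputs)  # observed error
            # Shift each weight in the direction that reduces the error.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# Hypothetical training records: (input fields, known outcome).
records = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(records, n_inputs=2)
```

After training, `predict(weights, bias, (1, 1))` should exceed 0.5 while the other three records fall below it. A network with hidden layers follows the same loop but propagates the error back through each layer.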
Support vector machine (SVM): SVM is a classification algorithm that can
model highly nonlinear, complex data patterns and avoid overfitting, that
is, the situation in which a model memorizes patterns only relevant to the
specific cases analyzed. SVM works by mapping data to a high-dimensional
feature space in which records become more easily separable (i.e., separated
by linear functions) with respect to the target categories. Input training data
are appropriately transformed through nonlinear kernel functions and this
transformation is followed by a search for simpler functions, that is, linear
functions, which optimally separate records. Analysts typically experiment with
different transformation functions and compare the results. Overall, SVM is an
effective yet demanding algorithm, in terms of memory resources and processing
time. Additionally, it lacks transparency since the predictions are not explained
and only the importance of predictors is summarized.
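The idea of mapping records to a higher-dimensional space where a linear function separates them can be illustrated with a toy sketch. It is not an SVM implementation: the mapping is explicit rather than computed through a kernel, the linear rule is chosen by hand rather than optimized, and the records are hypothetical.

```python
def feature_map(x):
    """Map a one-dimensional record to the two-dimensional space (x, x^2)."""
    return (x, x * x)

# Hypothetical records (value, category). Category 1 lies on both sides of
# category 0, so no single threshold on x separates the two categories.
records = [(-3.0, 1), (-2.0, 1), (-0.5, 0), (0.0, 0), (0.5, 0), (2.0, 1), (3.0, 1)]

def linear_rule(point, weights=(0.0, 1.0), offset=-1.0):
    """A linear function in the mapped space: weights . point + offset > 0."""
    score = sum(w * p for w, p in zip(weights, point)) + offset
    return 1 if score > 0 else 0

predictions = [linear_rule(feature_map(x)) for x, _ in records]
all_separated = all(p == label for p, (_, label) in zip(predictions, records))
```

In the mapped space the rule reduces to a straight line at x^2 = 1, which separates every record. An actual SVM would search for the separating function with the widest margin instead of relying on hand-picked weights.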
Bayesian networks: Bayesian models are probabilistic models that can be
used in classification problems to estimate the likelihood of occurrences. They
are graphical models that provide a visual representation of the relationships
among attributes, which makes them transparent and explains the model's rationale.
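As a rough sketch of how such a model estimates a likelihood, consider a very small network in which a target field points to two attributes. The field names and every probability below are hypothetical, chosen only to make the arithmetic visible.

```python
# Hypothetical network structure: Churn -> HighUsage, Churn -> Complaint.
p_churn = {True: 0.2, False: 0.8}                    # P(Churn)
p_high_usage_given_churn = {True: 0.3, False: 0.6}   # P(HighUsage=True | Churn)
p_complaint_given_churn = {True: 0.7, False: 0.1}    # P(Complaint=True | Churn)

def likelihood(p_true, observed):
    """Probability of the observed value given P(value is True)."""
    return p_true if observed else 1.0 - p_true

def posterior_churn(high_usage, complaint):
    """P(Churn=True | evidence), by enumerating both target categories."""
    joint = {}
    for churn in (True, False):
        joint[churn] = (p_churn[churn]
                        * likelihood(p_high_usage_given_churn[churn], high_usage)
                        * likelihood(p_complaint_given_churn[churn], complaint))
    return joint[True] / (joint[True] + joint[False])

result = posterior_churn(high_usage=False, complaint=True)
```

Because each factor in the product corresponds to one arrow in the graph, the contribution of every attribute to the final estimate can be read off directly, which is the transparency the text refers to.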
Evaluation of Classification Models
Before applying the generated model to new records, an evaluation procedure is
required to assess its predictive ability. The historical data with known outcomes,
which were used for training the model, are scored and two new fields are derived:
the predicted outcome category and the respective confidence score, as shown in
Table 2.2, which illustrates the procedure for the simplified example presented
earlier.
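The scoring step can be sketched as follows, assuming the model outputs a probability for each target category and that the confidence score is simply the probability of the predicted category. Both assumptions and the three records are illustrative and are not taken from Table 2.2.

```python
def score(class_probabilities):
    """Return (predicted category, confidence) for one record."""
    predicted = max(class_probabilities, key=class_probabilities.get)
    return predicted, class_probabilities[predicted]

# Hypothetical model output for three historical records.
model_output = [
    {"Yes": 0.82, "No": 0.18},
    {"Yes": 0.35, "No": 0.65},
    {"Yes": 0.51, "No": 0.49},
]

# Each record gains two derived fields: predicted category and confidence.
scored = [score(probabilities) for probabilities in model_output]
```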
In practice, models are never as accurate as in the simple exercise presented
here. There are always errors and misclassified records. A comparison of the
predicted to the actual values is the first step in evaluating the model's performance.
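One common way to make this comparison is a cross-tabulation of actual against predicted categories, often called a confusion matrix, together with the share of correctly classified records. The sketch below uses hypothetical outcomes for six records.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count records for each (actual, predicted) pair of categories."""
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Share of records whose predicted category matches the actual one."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical actual and predicted outcomes for six scored records.
actual    = ["Yes", "No", "No",  "Yes", "No", "Yes"]
predicted = ["Yes", "No", "Yes", "Yes", "No", "No"]

matrix = confusion_matrix(actual, predicted)
acc = accuracy(actual, predicted)
```

The off-diagonal entries, such as `matrix[("No", "Yes")]`, count the misclassified records and show which kind of error the model makes.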