percentage A value between 0 and 100 that represents a part of a
whole. For example, 75 percent indicates three quarters of a whole.
physical attribute An object that corresponds to a field in a formatted
file, or a column in a database table.
physical dataset Identifies data as a set of cases to be used as input
to data mining. Using tasks, physical attributes can be mapped to
logical attributes of a model's signature or logical data of a build
settings object. The data referenced by a physical dataset object can be
used in model building, scoring (apply), lift computation, statistical
physical data record A collection of named attribute values used as
input and output for single or multi-record scoring.
predictor A logical attribute used as input to a supervised model or
algorithm to build a model. Also referred to as an independent variable.
predictive data mining Data mining that results in a model by
performing inference on build data, and attempting to predict outcomes
for cases in apply datasets. See also descriptive data mining.
prior probabilities The set of prior probabilities, or priors, specifies
the distribution of categories, or classes, in the original population.
With skewed sampling, such as stratified sampling, the prior probabilities
differ from the distribution observed in the build dataset.
Priors allow the algorithm to adjust predictions to reflect the original
distribution.
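As an illustration of how priors can be used, the sketch below rescales a model's predicted class probabilities by the ratio of the original-population prior to the build-dataset proportion, then renormalizes. This is one common correction, shown under stated assumptions; it is not necessarily what any particular algorithm does. The function name adjust_for_priors and its parameters are hypothetical.

```python
def adjust_for_priors(predicted, build_distribution, priors):
    """Rescale predicted class probabilities toward the original population.

    predicted          -- class -> probability produced by the model
    build_distribution -- class -> proportion observed in the build dataset
    priors             -- class -> proportion in the original population
    All three names are hypothetical and used for illustration only.
    """
    # Weight each class by how under- or over-represented it was in the build data.
    weighted = {
        c: predicted[c] * priors[c] / build_distribution[c]
        for c in predicted
    }
    # Renormalize so the adjusted probabilities sum to one.
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()}


# Build data was stratified to 50/50, but only 10% of the population is "yes".
adjusted = adjust_for_priors(
    predicted={"yes": 0.5, "no": 0.5},
    build_distribution={"yes": 0.5, "no": 0.5},
    priors={"yes": 0.1, "no": 0.9},
)
```

In this example a prediction that looks like an even split on the stratified build data is pulled back toward the rarer "yes" rate of the original population.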
probability A value between zero and one (0 … 1) that indicates
the likelihood of an event. Zero indicates there is no chance of the
event occurring. One indicates it is probabilistically certain the event
will occur.
quality of fit In clustering, a value between zero and one that is a
measure of how well a given case fits in the predicted cluster.
Values closer to zero indicate a poor fit; values closer to one indicate
a good fit.
receiver operating characteristics (ROC) A measure of comparison
between individual models to determine thresholds that yield a high
proportion of positive hits. ROC curves aid users in adjusting the
cost matrix to minimize error rates. ROC was originally used in
signal detection theory to gauge the true hit versus false alarm ratio
when sending signals over a noisy channel.
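The hit versus false alarm trade-off described above can be sketched as follows: for each candidate threshold, count the positive cases scored at or above it (hits) and the negative cases scored at or above it (false alarms). This is a minimal illustration that assumes binary labels coded as 1 and 0 and a numeric score per case; the function name roc_points is hypothetical.

```python
def roc_points(labels, scores):
    """Return (threshold, false_alarm_rate, hit_rate) per distinct score.

    labels -- 1 for a positive case, 0 for a negative case (assumed coding)
    scores -- model score for each case; higher means "more likely positive"
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")

    points = []
    # Sweep thresholds from strictest to most lenient.
    for threshold in sorted(set(scores), reverse=True):
        hits = sum(1 for y, s in zip(labels, scores)
                   if s >= threshold and y == 1)
        false_alarms = sum(1 for y, s in zip(labels, scores)
                           if s >= threshold and y == 0)
        points.append((threshold, false_alarms / negatives, hits / positives))
    return points


points = roc_points(
    labels=[1, 1, 0, 1, 0, 0],
    scores=[0.9, 0.8, 0.7, 0.6, 0.4, 0.2],
)
```

Plotting hit rate against false alarm rate for these points gives the ROC curve; a threshold is typically chosen where the hit rate is high while the false alarm rate is still acceptable.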
recode A transformation that defines an explicit set of mappings,
where each mapping involves an original value and replacement (or