test The data mining operation that determines the accuracy of a
model. This is typically performed by using held-aside (test) data
identical in form to the build data, scoring that test data, and
comparing the actual target value with the predicted target value.
Testing is applicable only to supervised models. In JDM, test is
performed using a test task.
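The comparison of actual and predicted target values can be sketched as follows. This is a minimal illustration in plain Python, not the JDM test task API; the function name `accuracy` and the sample target values are hypothetical.

```python
def accuracy(actual, predicted):
    """Return the fraction of test cases whose prediction matches the actual target."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("test data must not be empty")
    # Count the cases where the scored (predicted) value equals the known target.
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical held-aside test data: known targets and the model's predictions.
actual_targets = ["yes", "no", "yes", "yes", "no"]
predicted_targets = ["yes", "no", "no", "yes", "no"]
test_accuracy = accuracy(actual_targets, predicted_targets)
```

Four of the five predictions match here, so `test_accuracy` is 0.8.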
test data The input data used for testing a model.
test task A task that, when executed, produces test results for
supervised models.
text mining A data mining technique for extracting patterns and
insights from unstructured text data. Text mining goes beyond the
notion of search in that previously unknown information can be
discovered through the use of data mining algorithms.
time series A data mining technique that supports the analysis of
time series data. The values X(t) are recorded according to some
function of time and are thus ordered by an index describing the
time (t) at which each value was recorded.
training The step in the model building process that produces a
possibly nonoptimized form of the model. For example, a tree
algorithm may produce a full tree during training, but may require an
evaluation phase to effectively select the best subtree. See build.
training data See build data.
transformation A function applied to data resulting in a new form
or representation of the data. Binning and normalization are
examples of data transformations. See also binning, explode, and
normalization.
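The two transformations named above can be sketched as follows. This is a minimal illustration assuming min-max normalization and equal-width binning, which are only two of several possible variants; the function names and the sample `ages` values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bin(values, num_bins):
    """Map each numeric value to the index of an equal-width bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / num_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), num_bins - 1) for v in values]


# Hypothetical numeric attribute.
ages = [18, 25, 40, 62, 78]
normalized_ages = min_max_normalize(ages)
binned_ages = equal_width_bin(ages, 3)
```

With three bins of width 20, the ages fall into bins `[0, 0, 1, 2, 2]`, and the normalized values run from 0.0 for age 18 to 1.0 for age 78.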
trend In time series, this is typically considered to be a long-term
change in the mean level of a series. What constitutes “long-term”
depends on the sampling rate of the time series. See also time series.
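One common way to estimate a trend is to smooth the series with a moving average. The sketch below assumes a simple unweighted average; the window of 4 is an arbitrary choice that would in practice depend on the sampling rate, and the function name and sample values are hypothetical.

```python
def moving_average(series, window):
    """Return the unweighted moving average over each full window of the series."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and the series length")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical series sampled at regular intervals.
monthly_values = [10, 12, 11, 13, 15, 14, 16, 18]
trend_estimate = moving_average(monthly_values, 4)
```

The smoothed values are `[11.5, 12.75, 13.25, 14.5, 15.75]`. They rise steadily even though the raw series dips twice, which is the long-term change in mean level that the definition describes.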
UML Unified Modeling Language.
URI Uniform Resource Identifier.
unstructured data Data that represents complex content, often with
an inherent structure. Examples of unstructured data include text,
images, audio, and video. See also structured data.
unsupervised learning The process of building data mining models
without the guidance (supervision) of a known, correct result. In
supervised learning, this correct result is provided in the target attribute.
Unsupervised learning has no such target attribute. Clustering and
association are examples of unsupervised learning.