such as the mean, mode, median, and standard deviation. Multivariate
statistics include tests such as F-tests and t-tests.
stratified sampling A sampling technique such that the cases
selected are based on percentages or counts of class values from a
specific attribute. For example, a target attribute with values high,
medium, and low, where the original distribution of cases is 75 percent,
20 percent, and 5 percent, respectively, may be stratified to ensure
that there is an equal number of cases for each value in the sampled
dataset.
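
The equal-count case described above can be sketched in Java as
follows. This is a minimal illustration, not part of the JDM API: the
class StratifiedSampleSketch, the method sampleEqualCounts, and the
perClass and seed parameters are hypothetical names chosen for this
example. It groups cases by target value, shuffles each group, and
keeps the same number of cases from each.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper for illustration only; not a JDM class.
public class StratifiedSampleSketch {

    /**
     * Draws up to perClass cases for each distinct target value.
     * targets.get(i) is the target value of case i; the result holds case indices.
     */
    public static List<Integer> sampleEqualCounts(List<String> targets, int perClass, long seed) {
        // Group case indices by target value (one group per stratum).
        Map<String, List<Integer>> strata = new LinkedHashMap<>();
        for (int i = 0; i < targets.size(); i++) {
            strata.computeIfAbsent(targets.get(i), k -> new ArrayList<>()).add(i);
        }

        Random random = new Random(seed);
        List<Integer> sample = new ArrayList<>();
        for (List<Integer> stratum : strata.values()) {
            // Shuffle within the stratum, then keep the first perClass cases.
            Collections.shuffle(stratum, random);
            sample.addAll(stratum.subList(0, Math.min(perClass, stratum.size())));
        }
        return sample;
    }

    public static void main(String[] args) {
        // 75 percent high, 20 percent medium, 5 percent low, as in the example above.
        List<String> targets = new ArrayList<>();
        for (int i = 0; i < 75; i++) targets.add("high");
        for (int i = 0; i < 20; i++) targets.add("medium");
        for (int i = 0; i < 5; i++) targets.add("low");

        List<Integer> sample = sampleEqualCounts(targets, 5, 42L);
        System.out.println("Sampled cases: " + sample.size());
    }
}
```

With 100 cases and perClass set to 5, the smallest class (low) is used
in full and the other two classes are sampled down to match it.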
structured data Data that contains primitive data types such as
integers, floats, or category strings. Examples include age, marital
supervised learning The process of building data mining models
using a known dependent attribute, referred to as the target. All
classification and regression techniques are supervised.
system default For an enumeration class, an implementation-
defined default value that corresponds to one of the allowed values
for the enumeration class. This default value may be different
according to the context. Vendors must document the system default for
system determined For an enumeration class, a user may request
that the implementation determine the best value for this
enumeration. The implementation may take into account, for example,
other settings or data when selecting the enumeration value. JDM
implementers are expected to document the behavior users can expect.
target In supervised learning, the identified logical attribute that is
to be predicted. Also referred to as a dependent variable.
taxonomy A hierarchical grouping of a set of categorical values.
For example, a geography taxonomy groups cities into states, states
into regions, and regions into countries.
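
One simple way to represent such a hierarchy is a child-to-parent
map, sketched below in Java. This is an illustrative assumption, not
the JDM taxonomy interface: TaxonomySketch, addLink, and pathToRoot
are hypothetical names, and the sketch assumes the links contain no
cycles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not a JDM class.
public class TaxonomySketch {
    // Maps each value to its parent group; top-level values have no entry.
    private final Map<String, String> parentOf = new HashMap<>();

    public void addLink(String child, String parent) {
        parentOf.put(child, parent);
    }

    /** Returns the path from the value up to the top level, starting with the value itself. */
    public List<String> pathToRoot(String value) {
        List<String> path = new ArrayList<>();
        // Follow parent links until a value with no parent is reached.
        for (String current = value; current != null; current = parentOf.get(current)) {
            path.add(current);
        }
        return path;
    }

    public static void main(String[] args) {
        TaxonomySketch geography = new TaxonomySketch();
        geography.addLink("Boston", "Massachusetts");       // city into state
        geography.addLink("Massachusetts", "New England");  // state into region
        geography.addLink("New England", "United States");  // region into country
        System.out.println(geography.pathToRoot("Boston"));
    }
}
```

Rolling a categorical value up to a coarser level then amounts to
walking this path to the desired depth.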
task A container within which to specify arguments to data mining
operations to be performed by the DME. Data mining tasks include:
model build, test, apply, import, and export.
TCK See Technology Compatibility Kit.
Technology Compatibility Kit The suite of tests, tools, and
documentation, as defined through the Java Community Process,
that allows implementers of a specification to determine if their
implementation is compliant with that specification.