specifically classification and association mining. A subgroup discovery method aims
to extract interesting rules with respect to a target attribute.
1.5.6 Transfer Learning [26]
Transfer learning aims to extract knowledge from one or more source tasks and apply
it to a target task. In this paradigm, what was learned on the source tasks is reused
when building a model for a new target task. Traditional learning algorithms
assume that the training data and test data are drawn from the same distribution and
feature space; if the distribution changes, such methods must rebuild or adapt
the model in order to perform well. The so-called data shift problem is closely related
to transfer learning.
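
As a rough illustration of the data shift problem and of reusing source knowledge, the
sketch below fits a nearest-centroid classifier on a source task, shows that it degrades
on a shifted target task, and then adapts it with a handful of labelled target examples.
The 1-D toy data, the function names and the prior_weight blending rule are all
assumptions made for this example; this is not the method described in [26].

```python
from statistics import mean
from typing import Dict, List, Tuple

Sample = Tuple[float, int]  # (1-D feature value, class label)


def make_task(centers: Dict[int, float], offsets: List[float]) -> List[Sample]:
    """Build a toy task: each class is a cluster of points around its center."""
    return [(c + d, label) for label, c in centers.items() for d in offsets]


def fit_centroids(data: List[Sample]) -> Dict[int, float]:
    """Nearest-centroid model: one mean value per class."""
    labels = {y for _, y in data}
    return {y: mean([x for x, yy in data if yy == y]) for y in labels}


def adapt_centroids(source: Dict[int, float], target_data: List[Sample],
                    prior_weight: float = 1.0) -> Dict[int, float]:
    """Blend source centroids (used as a prior) with a few target examples.

    prior_weight is a hypothetical knob: the source centroid counts as that
    many pseudo-examples of the target class.
    """
    adapted = {}
    for label, src_c in source.items():
        xs = [x for x, y in target_data if y == label]
        adapted[label] = (prior_weight * src_c + sum(xs)) / (prior_weight + len(xs))
    return adapted


def predict(centroids: Dict[int, float], x: float) -> int:
    """Assign x to the class with the closest centroid."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def accuracy(centroids: Dict[int, float], data: List[Sample]) -> float:
    return sum(predict(centroids, x) == y for x, y in data) / len(data)


offsets = [-0.4, -0.2, 0.0, 0.2, 0.4]
source_train = make_task({0: 0.0, 1: 2.0}, offsets)   # source distribution
target_test = make_task({0: 1.5, 1: 3.5}, offsets)    # same classes, shifted by +1.5
target_few = [(1.4, 0), (1.6, 0), (3.4, 1), (3.6, 1)]  # small labelled target sample

source_model = fit_centroids(source_train)
adapted_model = adapt_centroids(source_model, target_few)

print("source model on target :", accuracy(source_model, target_test))
print("adapted model on target:", accuracy(adapted_model, target_test))
```

The source-only model misclassifies every target example of class 0 because the decision
boundary it learned no longer matches the shifted data, whereas the adapted model moves
the boundary using only four target examples.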
1.5.7 Data Stream Learning [13]
In some situations, not all of the data is available at a given moment, so it is necessary
to develop learning algorithms that treat the input as a continuous data stream. The
core assumption of this paradigm is that each instance can be inspected only once and
must then be discarded to make room for subsequent instances. This paradigm is an
extension of data acquirement and it is related to both supervised and unsupervised learning.
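
The one-pass assumption can be sketched with a simple online learner. The example below
uses a perceptron that is updated one instance at a time and is evaluated in a
test-then-train fashion, so no instance is ever stored. The class OnlinePerceptron, the
helper names and the synthetic toy_stream generator are assumptions made for
illustration only; they are not taken from [13].

```python
import random
from typing import Iterable, Iterator, List, Tuple

Instance = Tuple[List[float], int]  # (features, label in {0, 1})


class OnlinePerceptron:
    """Linear classifier updated one instance at a time (illustrative helper)."""

    def __init__(self, n_features: int, lr: float = 1.0) -> None:
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x: List[float]) -> int:
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else 0

    def learn_one(self, x: List[float], y: int) -> None:
        error = y - self.predict(x)  # -1, 0 or +1
        if error != 0:  # update the weights only when the prediction was wrong
            self.w = [wi + self.lr * error * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * error


def prequential_accuracy(stream: Iterable[Instance], model: OnlinePerceptron) -> float:
    """Test-then-train: predict each instance first, then learn from it once."""
    correct = 0
    seen = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)
        model.learn_one(x, y)
        seen += 1
        # the instance is not kept anywhere after this point
    return correct / seen if seen else 0.0


def toy_stream(n: int, seed: int = 0) -> Iterator[Instance]:
    """Synthetic stream: label is 1 when x0 + x1 > 0, with a small margin."""
    rng = random.Random(seed)
    produced = 0
    while produced < n:
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        s = x[0] + x[1]
        if abs(s) < 0.2:  # skip points too close to the boundary
            continue
        produced += 1
        yield x, int(s > 0)


model = OnlinePerceptron(n_features=2)
print("prequential accuracy:", prequential_accuracy(toy_stream(2000), model))
```

Because the stream is a generator, memory use stays constant regardless of how many
instances arrive; the model state (two weights and a bias) is the only thing retained.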
1.6 Introduction to Data Preprocessing
Once some basic concepts and processes of DM have been reviewed, the next step is
to question the data to be used. Input data must be provided in the amount, structure
and format that suit each DM task. Unfortunately, real-world databases are
highly influenced by negative factors such as the presence of noise, missing values (MVs),
inconsistent and superfluous data, and huge sizes in both dimensions (examples and
features). Thus, low-quality data will lead to low-quality DM performance [27].
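
A first look at some of these factors can be automated. The sketch below counts MVs per
feature, exact duplicate examples and constant features in a small in-memory table. The
function audit, the use of None to encode an MV and the toy rows are assumptions made
for this example; detecting noise or inconsistencies generally needs domain knowledge
and is not covered here.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence

Row = Sequence[Optional[float]]  # None stands for a missing value (MV)


def audit(rows: List[Row], names: List[str]) -> Dict[str, object]:
    """Report simple data-quality indicators for a small table."""
    # MVs per feature
    missing = {
        name: sum(1 for r in rows if r[i] is None)
        for i, name in enumerate(names)
    }
    # Examples that repeat an earlier example exactly
    duplicates = sum(c - 1 for c in Counter(tuple(r) for r in rows).values())
    # Features with at most one distinct observed value carry no information
    constant = [
        name
        for i, name in enumerate(names)
        if len({r[i] for r in rows if r[i] is not None}) <= 1
    ]
    return {
        "n_examples": len(rows),
        "n_features": len(names),
        "missing": missing,
        "duplicate_rows": duplicates,
        "constant_features": constant,
    }


names = ["length", "unit", "width"]
rows = [
    (5.1, 1.0, 0.2),
    (4.9, 1.0, None),
    (5.1, 1.0, 0.2),   # exact duplicate of the first example
    (None, 1.0, 0.4),
]

for key, value in audit(rows, names).items():
    print(key, "=", value)
```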
In this section, we will describe the general categories into which the set of data
preprocessing techniques can be divided. More details will be given in the remaining
chapters of this topic, but for now, our intention is to provide a brief summary of
the preprocessing techniques that we should be familiar with after reading this topic.
For this purpose, several subsections will be presented according to the type and set
of techniques that belong to each category.
 