features, e.g. sports categories. In this example, the “Team Sports” classifier is used
to filter the incoming data into two sets, thereby shedding a significant volume of
data before passing it to the downstream classifiers (negatively identified team sports
data is forwarded to the “Winter” classifier, while the remaining data is not further
analyzed). Deploying a network of classifiers in this manner enables successive
identification of multiple features in the data and provides significant advantages in
terms of deployment costs. Indeed, decomposing complex jobs into a network of
operators enhances scalability and reliability, and allows cost-performance tradeoffs
to be made. As a consequence, fewer computing resources are required, because
data is dynamically filtered as it flows through the classifier network. For instance, it has
been shown that using classifiers operating in series with the same model (boosting [29])
or classifiers operating in parallel with multiple models (bagging [16]) can result in
improved classification performance.
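To make this filtering cascade concrete, the following minimal Python sketch chains two binary classifiers in the way described above; the predicates, tags, and routing rule are illustrative assumptions, not part of any actual system.

```python
# Sketch of a two-stage binary classifier cascade (tags and predicates are
# illustrative assumptions).

def team_sports_classifier(item):
    """Hypothetical binary classifier: is this item about a team sport?"""
    return "team" in item.get("tags", [])

def winter_classifier(item):
    """Hypothetical binary classifier: is this item about a winter sport?"""
    return "winter" in item.get("tags", [])

def cascade(stream):
    """Route items through the cascade, shedding data as early as possible."""
    for item in stream:
        if team_sports_classifier(item):
            yield ("team_sport", item)    # positively identified, no further analysis
        elif winter_classifier(item):
            yield ("winter_sport", item)  # forwarded to the downstream classifier
        # all other items are shed, saving downstream resources

incoming = [
    {"id": 1, "tags": ["team", "soccer"]},
    {"id": 2, "tags": ["winter", "ski"]},
    {"id": 3, "tags": ["chess"]},
]
print(list(cascade(incoming)))  # items 1 and 2 are labeled, item 3 is shed
```

Because the first classifier discards a large share of the stream, the downstream classifier only sees the data that still needs analysis, which is the source of the cost savings discussed above.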
In this chapter, we will focus on mining applications that are built using a
topology of low-complexity binary classifiers, each mapped to a specific concept
of interest. A binary classifier performs feature extraction and classification leading
to a yes/no answer. However, this does not limit the generality of our solutions,
as any M-ary classifier can be decomposed into a chain of binary classifiers.
Importantly, our focus will not be on the design of the operators or classifiers, for which
many solutions already exist; instead, we will focus on configuring¹ the networks of
distributed processing nodes, while trading off the processing accuracy against the
available processing resources or the incurred processing delays. See Fig. 4b.
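To illustrate how an M-ary decision can be decomposed into binary ones, here is a minimal sketch (class labels and predicates are illustrative assumptions): each stage in the chain answers a single yes/no question, and the first positive answer determines the class.

```python
# Sketch: an M-ary classification implemented as a chain of binary classifiers.
# Each stage performs one yes/no test; later stages are only reached when
# earlier ones answer "no". Labels and predicates are illustrative assumptions.

from typing import Callable, Iterable, Tuple

BinaryClassifier = Tuple[str, Callable[[dict], bool]]

def classify_chain(item: dict, chain: Iterable[BinaryClassifier]) -> str:
    for label, is_match in chain:
        if is_match(item):   # feature extraction + binary decision
            return label
    return "other"           # none of the M-1 concepts matched

chain = [
    ("team_sport",   lambda x: "team"   in x["tags"]),
    ("winter_sport", lambda x: "winter" in x["tags"]),
    ("water_sport",  lambda x: "water"  in x["tags"]),
]

print(classify_chain({"tags": ["winter", "ski"]}, chain))  # -> "winter_sport"
```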
1.2.2 Changing Paradigm
Historically, mining applications were mostly used to find facts in data at rest.
They relied on static databases and data warehouses, which were submitted to
queries in order to extract valuable information from the raw data.
Recently, there has been a paradigm change in knowledge extraction: data is no
longer considered static but rather an inflowing stream, on which queries and
analyses are computed dynamically, in real time. For example, in Healthcare Monitoring,
data (i.e., biometric measurements) is automatically analyzed through a batch of
queries, such as “Verify that the calcium concentration is in the correct interval”,
“Verify that the blood pressure is not too high”, etc. Rather than applying a single
query to the data, the continuous stream of medical data is by default pushed through
a predefined set of queries. This makes it possible to detect any abnormal situation
and react accordingly. See Fig. 5.
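A minimal sketch of this push-based model is given below; the measurement fields and threshold values are illustrative assumptions. Each “query” is simply a predicate that every incoming measurement is pushed through, rather than an ad hoc query issued against a stored table.

```python
# Sketch: a continuous stream of biometric measurements pushed through a
# predefined set of queries (field names and thresholds are illustrative
# assumptions).

QUERIES = {
    "calcium_in_range":  lambda m: 2.1 <= m["calcium_mmol_l"] <= 2.6,
    "blood_pressure_ok": lambda m: m["systolic_mmhg"] < 140,
}

def monitor(measurement_stream):
    """Apply every query to every measurement and emit an alert on violation."""
    for m in measurement_stream:
        for name, query in QUERIES.items():
            if not query(m):
                yield {"alert": name, "measurement": m}

stream = [
    {"calcium_mmol_l": 2.3, "systolic_mmhg": 125},  # passes all queries
    {"calcium_mmol_l": 2.9, "systolic_mmhg": 150},  # violates both queries
]
for alert in monitor(stream):
    print(alert)
```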
Interestingly, stream mining could lead to performing automatic actions in
response to a specific measurement. For example, a higher dose of painkillers could
be administered when the calcium concentration becomes too high, thus enabling
real-time control. See Fig. 6.
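Extending the same idea, a sketch of such an automatic, closed-loop response is shown below; the threshold, the dosing rule, and the actuator function are purely illustrative assumptions (not medical guidance).

```python
# Sketch: triggering an automatic action when a measurement crosses a
# threshold (threshold and action are illustrative assumptions).

CALCIUM_HIGH = 2.6  # hypothetical upper bound, mmol/L

def administer_painkiller(measurement):
    # Placeholder for the actuator that would adjust the dose in real time.
    print(f"adjusting dose, calcium={measurement['calcium_mmol_l']}")

def on_measurement(measurement):
    if measurement["calcium_mmol_l"] > CALCIUM_HIGH:
        administer_painkiller(measurement)  # automatic, real-time control

on_measurement({"calcium_mmol_l": 2.9})     # crosses the threshold, triggers the action
```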
¹ As we will discuss later, there are two types of configuration choices we must make: the
topological ordering of the classifiers and the local operating points at each classifier.
 