FIGURE 11.11
Search and recommendation process.
of the result set and data from the knowledge repository is processed to optimize the search, and this
data is returned to the user as personalized offers. Sponsors of specific products and services often
attach incentives to such offers, which the recommender algorithm's output presents to the user.
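The recommendation step described above can be sketched roughly as follows. This is a minimal illustration, not the book's implementation: the function `recommend`, the interest scores, and the sponsor table are all hypothetical stand-ins for the result set, the knowledge repository, and sponsor incentives.

```python
def recommend(result_set, user_interests, sponsored_offers):
    """Rank search results by the user's known interests and attach incentives.

    result_set       -- list of dicts with "name" and "category" (hypothetical shape)
    user_interests   -- category -> score, standing in for the knowledge repository
    sponsored_offers -- item name -> incentive text supplied by a sponsor
    """
    # Items in categories the user cares about most come first.
    ranked = sorted(
        result_set,
        key=lambda item: user_interests.get(item["category"], 0),
        reverse=True,
    )
    # Each offer carries the sponsor incentive if one exists, otherwise None.
    return [
        {"item": item["name"], "incentive": sponsored_offers.get(item["name"])}
        for item in ranked
    ]


results = [
    {"name": "tent", "category": "camping"},
    {"name": "novel", "category": "books"},
]
interests = {"books": 0.9, "camping": 0.2}
sponsors = {"novel": "10% off"}
offers = recommend(results, interests, sponsors)
```

A real recommender would score items with a trained model rather than a fixed lookup table; the lookup only keeps the sketch short.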
How does machine learning use metadata and master data? In the preceding search example,
metadata is derived for the search elements and tagged with additional data as available. When the
machine learning algorithm executes, this data is compared and processed with data from the
knowledge repository, which includes semantic libraries and master data catalogs. Combining
metadata and master data with semantic libraries gives the algorithm better-quality input, which in
turn produces better-quality output for hypothesis and prediction workflows.
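As a rough sketch of this enrichment step, the example below tags each search term with metadata, maps it to a canonical term through a semantic library, and looks up a matching master data record. The dictionaries, field names, and the function `enrich_query` are assumptions made for illustration; the text does not specify how the knowledge repository is structured.

```python
# Hypothetical in-memory stand-ins for the knowledge repository contents.
SEMANTIC_LIBRARY = {  # synonym -> canonical term
    "sneakers": "athletic shoes",
    "trainers": "athletic shoes",
    "laptop": "notebook computer",
}
MASTER_DATA = {  # canonical term -> master data record
    "athletic shoes": {"product_id": "P-100", "category": "footwear"},
    "notebook computer": {"product_id": "P-200", "category": "electronics"},
}


def enrich_query(query, user_region=None):
    """Tag each search term with metadata and resolve it against master data."""
    enriched = []
    for position, term in enumerate(query.lower().split()):
        # The semantic library normalizes synonyms; unknown terms pass through.
        canonical = SEMANTIC_LIBRARY.get(term, term)
        enriched.append(
            {
                "term": term,
                "canonical": canonical,
                "position": position,                   # metadata: place in query
                "region": user_region,                  # metadata: extra tag
                "master": MASTER_DATA.get(canonical),   # None if no record
            }
        )
    return enriched


features = enrich_query("red sneakers", user_region="US")
```

The enriched records, rather than the raw query string, would then be handed to the learning algorithm as its input.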
Processing data that is largely numeric, such as sensor data, financial data, or credit card data, is
based on numeric patterns supplied as data inputs. These patterns are processed through several
mathematical models, and their outputs are stored in the knowledge repository, which then feeds the
stored results back into the processing loop of the machine learning implementation.
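The loop described here can be illustrated with a small sketch in which several simple models run over a batch of sensor readings, each stores its output in a repository, and a later model reads those stored outputs. The model names, the two-standard-deviation threshold, and the dictionary used as the knowledge repository are assumptions chosen for brevity.

```python
from statistics import mean, pstdev

# Hypothetical knowledge repository: model name -> latest stored output.
knowledge_repository = {}


def mean_model(values, repo):
    return mean(values)


def spread_model(values, repo):
    return pstdev(values)  # population standard deviation


def outlier_model(values, repo):
    # Feedback step: reuse outputs that earlier models stored in the repository.
    center = repo.get("mean")
    spread = repo.get("spread")
    if center is None or not spread:
        return []
    return [v for v in values if abs(v - center) > 2 * spread]


MODELS = [("mean", mean_model), ("spread", spread_model), ("outliers", outlier_model)]


def process_batch(values, repo):
    """Run every model in order, storing each output for later models to use."""
    for name, model in MODELS:
        repo[name] = model(values, repo)
    return repo


readings = [20.1, 19.8, 20.3, 20.0, 35.5, 19.9]  # made-up sensor values
process_batch(readings, knowledge_repository)
```

Because the repository persists between calls, a later batch could also compare its readings against statistics stored from earlier batches.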
Processing data such as images and videos uses conversion techniques to create mathematical data
sets for all the nontextual elements. These data sets are processed through several combinations of
data mining and machine learning algorithms, including statistical analysis, linear regression, and
polynomial curve-fitting techniques, to create outputs. The outputs are processed further to create a
noise-free set, which can be used to recreate digital models of the image or video data (image only,
not audio). Audio is processed as separate feeds and associated with the video processing data sets
as needed.
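To make one of the named techniques concrete, the sketch below treats a single row of grayscale pixel intensities as a numeric data set and fits a least-squares line through it to smooth out noise. Linear regression is only one of the methods the text lists, and the pixel values and helper names here are invented for illustration; a real pipeline would work on full two-dimensional images and likely use higher-order fits.

```python
def fit_line(ys):
    """Ordinary least-squares fit of y = slope * x + intercept for x = 0..n-1.

    Requires at least two values.
    """
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def denoise_row(pixels):
    """Replace each noisy intensity with the value predicted by the fitted line."""
    slope, intercept = fit_line(pixels)
    return [slope * x + intercept for x in range(len(pixels))]


# Hypothetical pixel row: a brightness gradient with some noise added.
noisy_row = [12, 18, 31, 39, 52, 58]
smooth_row = denoise_row(noisy_row)
```

The smoothed row keeps the overall brightness trend of the original while discarding the point-to-point jitter, which is the role the text assigns to the noise-reduction step.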
Machine learning techniques reduce the complexity of processing Big Data. The most common and
popular algorithms for machine learning with web-scale data processing are available in Apache
Mahout, an open-source project of the Apache Software Foundation. Mahout is designed to be deployed
 