stored in a knowledge repository (a NoSQL- or DBMS-like database) along with the algorithms
for machine learning.
3. The data is then processed through the hypothesis workflows.
4. The outputs from a hypothesis and predictive mining exercise are sent to the knowledge repository
as a collection with metatags for search criteria and associated user geographic and demographic
data.
5. The outputs of the hypothesis are then processed into a form suitable for further analysis or
presentation to users.
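
The storage step in this workflow can be illustrated with a small Python sketch. It is only an
illustration under stated assumptions: the in-memory KnowledgeRepository class, the HypothesisOutput
record layout, the collection name, and the sample scores are all hypothetical stand-ins for
whatever NoSQL- or DBMS-like store and schema a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class HypothesisOutput:
    """One result from a hypothesis run (hypothetical record layout)."""
    hypothesis: str
    score: float
    metatags: set = field(default_factory=set)         # search criteria
    user_context: dict = field(default_factory=dict)   # geographic/demographic data


class KnowledgeRepository:
    """In-memory stand-in for the NoSQL- or DBMS-like knowledge repository."""

    def __init__(self):
        self._collections = {}

    def store(self, collection, output):
        # Outputs are grouped into named collections.
        self._collections.setdefault(collection, []).append(output)

    def search(self, collection, tag):
        """Return outputs in a collection that carry the tag, best score first."""
        hits = [o for o in self._collections.get(collection, []) if tag in o.metatags]
        return sorted(hits, key=lambda o: o.score, reverse=True)


repo = KnowledgeRepository()
repo.store("movie-search", HypothesisOutput(
    "likes sci-fi", 0.82, {"movies", "sci-fi"}, {"region": "US"}))
repo.store("movie-search", HypothesisOutput(
    "likes drama", 0.41, {"movies", "drama"}, {"region": "US"}))
repo.store("movie-search", HypothesisOutput(
    "likes space operas", 0.67, {"movies", "sci-fi"}, {"region": "US"}))

# Later analysis or presentation steps can pull outputs back by metatag.
top = repo.search("movie-search", "sci-fi")
for output in top:
    print(output.hypothesis, output.score)
```

Keeping the metatags and user context on each stored record is what lets later iterations query
earlier outputs by search criteria rather than rescanning raw data.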
Examples of real-life implementations of machine learning include:
IBM Watson
Amazon recommendation engine
Yelp ratings
Analysis of astronomical data
Human speech recognition
Stream analytics:
  Credit card fraud
  Electronic trading fraud
Google robot-driven vehicles
Stock rate prediction
Genome classification
Using semantic libraries, metadata, and master data, along with the data collected from each
iteration of processing, enriches the algorithms' ability to detect patterns and predict outcomes
more accurately.
Let us see how a recommendation engine uses all the data types to create powerful and
personalized recommendations. We will use the Amazon website to discuss this process:
1. John Doe searches for movies on Amazon.
2. John Doe receives all the movies relevant to the title he searched for.
3. John Doe also receives recommendations and personalized offers along with the result sets.
How does the system know what else John Doe will be interested in purchasing, and how confident
is it in such a recommendation? This is exactly where we can apply the framework for
machine learning shown in Figure 11.9; the process is shown in Figure 11.11.
The first step of the process is a user logging in, or simply executing a search anonymously, on a
website. The search executes and simultaneously builds a profile for the user. The search engine
produces results that are shared with the user, if needed, as first-pass output and adds them to the
user profile. As a second step, the search engine executes the personalized recommendation, which
provides an optimized search result along with recommendations.
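
The two-step flow described above can be sketched in Python. This is a minimal illustration, not
Amazon's actual method: the catalog, the prior-session data, the use of Jaccard similarity, and the
definition of the confidence score (the similarity-weighted share of similar prior users who chose a
title) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical knowledge repository: prior users and the titles they chose.
PRIOR_SESSIONS = {
    "u1": {"Alien", "Blade Runner", "Dune"},
    "u2": {"Alien", "Dune", "Arrival"},
    "u3": {"Notting Hill", "Love Actually"},
}
# Hypothetical catalog of searchable titles.
CATALOG = ["Alien", "Aliens", "Arrival", "Blade Runner", "Dune",
           "Notting Hill", "Love Actually"]


def first_pass(query, profile):
    """Plain title search; the query and its results are added to the profile."""
    results = [t for t in CATALOG if query.lower() in t.lower()]
    profile.setdefault("queries", []).append(query)
    profile.setdefault("seen", set()).update(results)
    return results


def jaccard(a, b):
    """Overlap between two title sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def second_pass(profile, top_n=3):
    """Rank unseen titles by similarity-weighted votes from prior users."""
    seen = profile.get("seen", set())
    votes = Counter()
    total_similarity = 0.0
    for titles in PRIOR_SESSIONS.values():
        sim = jaccard(seen, titles)
        if sim == 0.0:
            continue  # this prior user shares nothing with the current profile
        total_similarity += sim
        for title in titles - seen:
            votes[title] += sim
    # Confidence (assumed definition): weighted share of similar users who chose the title.
    ranked = [(t, v / total_similarity) for t, v in votes.items()]
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked[:top_n]


profile = {}
results = first_pass("alien", profile)   # first-pass output
recommendations = second_pass(profile)   # personalized second step
print(results)
print(recommendations)
```

The first pass both answers the query and enriches the profile, so the second pass has something to
compare against prior users even for an anonymous visitor.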
After the first step, the rest of the search and recommendation workflow follows the machine
learning technique and is implemented with collaborative filtering and clustering algorithms.
The user search criteria and the basic user coordinates, including the website, clickstream
activity, and geographical data, are all gathered as user profile data and integrated with data
from the knowledge repository of similar prior user searches. All of this data is processed with
machine learning algorithms: multiple hypothesis results are iterated with confidence scores, and
the result with the highest score is returned as the closest match to the search. A second pass