1
Analyze and Evaluate Business Use Case
• Frame the problem
• Gather samplistic data
• Perform Data Discovery and Analysis
Iterative Cycle
2
Develop Business Hypotheses
• Assemble illustrative use cases
• Perform Fit-Gap Analysis
7
3
6
Measure and Monitor
• Measure effectiveness of
the Big Data Analytics
Solution
• Calibrate the Analytics
Models
• Monitor the solution for
it's effective benefits
• Establish feedback loops
for further learning and
improvements
Build the production ready
system (Scale and Performance)
• Architect and Develop the
end state solution
• Design and implement
appropriate business
processes
Develop Analytics Approach
• Evaluate illustrative use cases
• Perform Fit-Gap Analysis
• Identify appropriate analytics
algorithms/models
4
Build and Prepare Data sets
• Acquire data and
understand data
characteristics
• Data at rest - determine
appropriate data
sampling techniques
• Data in motion -
determine appropriate
data processing
techniques
5
Select and Build the Analytical
Models
• Build Analytical Models
• Test and validate with data
• Apply data visualization
techniques
• Review results
Increase amount and type of
data (Volume and Variety)
Explore different analytical
modeling and data
visualization techniques with
each iteration
Figure 7-1. Big data analytics methodology
This methodology resembles other data analytics and BI implementation methodologies.
It differs from them in the number of times the designer is expected to execute the steps
in order to solve the design problems that arise when processing at full scale. Knowledge
gained during each pass through the steps should be reflected in the system design.
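The "Build and Prepare Data sets" step in Figure 7-1 asks the designer to choose a sampling technique for data at rest. The source does not name a specific technique; as one possible illustration, the sketch below uses reservoir sampling, which draws a fixed-size uniform sample in a single pass without knowing the data set size in advance. The function name `reservoir_sample` and its parameters are hypothetical and chosen only for this example.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(records: Iterable[T], k: int,
                     seed: Optional[int] = None) -> List[T]:
    """Return up to k records chosen uniformly at random in one pass.

    Illustrative helper (not from the source text). It never holds more
    than k records in memory, so the input can be far larger than RAM.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for index, record in enumerate(records):
        if index < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Keep the new record with probability k / (index + 1)
            # by overwriting a randomly chosen slot.
            slot = rng.randint(0, index)
            if slot < k:
                reservoir[slot] = record
    return reservoir


if __name__ == "__main__":
    # Hypothetical usage: sample 5 record ids out of 100,000.
    print(reservoir_sample(range(100_000), k=5, seed=42))
```

Because the sample size is a parameter, the same helper can be rerun with a larger `k` on each pass through the cycle as the amount of data under study grows.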
Analyze and Evaluate Business Use Case
The first step in the methodology is to analyze and evaluate the business use case.
In many instances this step is also treated as a proof of concept (POC) exercise.
The first few cycles through this step will likely produce some unexpected results,
for example:
• The data samples do not include enough descriptive information
to find the desired correlations.
• The data discovery and analysis activities do not scale because of the
extreme number of early patterns observed in the sample data.
• The POC infrastructure cannot handle the variety of data, and the
types of algorithms applied exhibit a good number of shortcomings.
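The first of these surprises, a sample that lacks enough descriptive information, can often be spotted early with a simple profile of the sample. The following is a minimal sketch, assuming the sample is a list of dictionaries with hashable values; the function `profile_sample`, the field names, and the two metrics are illustrative assumptions, not part of the methodology itself.

```python
from typing import Any, Dict, List


def profile_sample(rows: List[Dict[str, Any]]) -> Dict[str, Dict[str, float]]:
    """Report, per field, the fill rate and the count of distinct values.

    Illustrative helper: a field that is mostly empty, or that has only
    one distinct value, carries little information for correlations.
    """
    # Collect every field name that appears in at least one row.
    fields = sorted({name for row in rows for name in row})
    profile: Dict[str, Dict[str, float]] = {}
    for name in fields:
        values = [row.get(name) for row in rows]
        # Treat None and the empty string as missing.
        present = [v for v in values if v not in (None, "")]
        profile[name] = {
            "fill_rate": len(present) / len(rows),
            "distinct": len(set(present)),
        }
    return profile


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample_rows = [
        {"customer_id": 1, "region": "EU", "segment": None},
        {"customer_id": 2, "region": "EU", "segment": ""},
        {"customer_id": 3, "region": "", "segment": None},
    ]
    for field, stats in profile_sample(sample_rows).items():
        print(field, stats)
```

In this made-up sample, `segment` is never filled and `region` has a single distinct value, which would suggest gathering a richer sample before the next pass.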
The most important outcome of the analyze and evaluate step is the development of
a detailed description of the business hypotheses inclusive of appropriate infrastructure,
 