data using any of the many analytical techniques such as logistic regression,
decision trees, neural networks, rule evolvers, and so on.
The next part of model design is identifying the appropriate modeling technique.
The focus is on the kind of data the models will run on: structured,
unstructured, or hybrid.
As part of building the environment for modeling, we define data sets for
training, testing, and production. We also choose the hardware and software best
suited to running the tests, such as parallel processing capabilities.
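As a rough illustration of defining the three data sets, the sketch below shuffles a collection of records and partitions it. The function name split_dataset, the 60/20/20 proportions, and the fixed seed are assumptions made for this example, not values from the text. Treating the remaining records as the "production" set is also an assumption; in practice that set may come from a separate source.

```python
import random


def split_dataset(records, train_frac=0.6, test_frac=0.2, seed=42):
    """Shuffle records and split them into training, testing, and production sets.

    The fractions and the seed are illustrative assumptions; whatever is left
    after the training and testing slices is returned as the production set.
    """
    if train_frac <= 0 or test_frac <= 0 or train_frac + test_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")

    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)

    training = shuffled[:n_train]
    testing = shuffled[n_train:n_train + n_test]
    production = shuffled[n_train + n_test:]
    return training, testing, production


training, testing, production = split_dataset(range(100))
```

Fixing the seed means a later model run sees exactly the same partitions, which makes results from different modeling techniques comparable.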
Important tools that can help build the models are R, PL/R, Weka, Revolution R
(a commercial option), MADlib, Alpine Miner, and SAS Enterprise Miner.
The second step, executing the model, involves running the identified model
against the data sets to verify the relevance of the model as well as its outcome.
Depending on the outcome, we may need to investigate additional data
requirements and alternative approaches to solving the problem in context.
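A minimal sketch of this execution step follows. It assumes a deliberately simple stand-in model, a single-threshold classifier (a one-level decision tree) on one numeric feature, and compares its accuracy on the test set with a majority-class baseline. The helper names and the tiny data sets are hypothetical and exist only for illustration; a real project would use one of the tools listed above.

```python
from collections import Counter


def fit_stump(rows):
    """Return the threshold that maximizes accuracy on the training rows.

    rows is a list of (feature, label) pairs with labels 0 or 1.
    The model predicts 1 when feature >= threshold.
    """
    best_threshold, best_correct = None, -1
    for threshold in sorted({x for x, _ in rows}):
        correct = sum((x >= threshold) == bool(y) for x, y in rows)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, rows):
    """Fraction of rows the threshold model classifies correctly."""
    return sum((x >= threshold) == bool(y) for x, y in rows) / len(rows)


def majority_baseline(train_rows, test_rows):
    """Accuracy of always predicting the most common training label."""
    majority = Counter(y for _, y in train_rows).most_common(1)[0][0]
    return sum(y == majority for _, y in test_rows) / len(test_rows)


# Hypothetical data, invented purely for this sketch.
train_rows = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
test_rows = [(1.5, 0), (2.5, 0), (5.0, 1), (6.5, 1), (7.5, 1)]

threshold = fit_stump(train_rows)
model_accuracy = accuracy(threshold, test_rows)
baseline_accuracy = majority_baseline(train_rows, test_rows)

# A model that cannot beat the baseline suggests revisiting the data
# requirements or trying an alternative approach.
model_is_relevant = model_accuracy > baseline_accuracy
```

Comparing against a trivial baseline is one simple way to judge whether the model is relevant at all before investing in further investigation.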
Phase 5 - publish insights
Now comes an important part of the life cycle: communicating the key findings
against the hypothesis defined in phase 1. We also present the caveats,
assumptions, and limitations of the results. The results are summarized so that
the relevant target audience can interpret them.
This phase requires identification of the right visualization techniques to best
communicate the results. These results are then validated by the domain experts
in the following phase.
Phase 6 - measure effectiveness
Measuring effectiveness is all about validating whether the project succeeded or
failed. We need to quantify the business value based on the results from model
execution and the visualizations.
An important outcome of this phase is the recommendations for future work.