Build the Production-Ready System
Building a production-ready analytics system usually combines two approaches:
data exploration and analytical-pipeline processing.
Analysts are often not sure what to look for. They therefore take a highly exploratory
approach, looking at all possible data sources and relating them to a particular
business problem. In such scenarios, instead of taking the entire dump of those data
sources, they resort to data-sampling techniques. Sampling reduces the scale
of the big data along one or more of its dimensions while still faithfully representing
the characteristics of the original data: the data itself, the information content
represented by the data, or the information content and the data taken together. However,
data samples are valid for analysis if and only if the patterns in the data remain stable
during the entire phase of analysis.
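As an illustration of how a sample can shrink the data while keeping one of its characteristics intact, the following is a minimal sketch of stratified sampling. It is only one of many possible sampling techniques, and the text does not prescribe it. The `stratified_sample` helper, the record layout, and the `channel` field are all hypothetical names chosen for this example.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw roughly `fraction` of the records from each group defined by `key`.

    Hypothetical helper: sampling per group keeps the group proportions
    of the sample close to those of the full data set.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec)

    sample = []
    for members in groups.values():
        # Take at least one record per group so small groups are not lost.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Made-up data: 900 "web" records and 100 "store" records.
records = (
    [{"channel": "web", "amount": i % 50} for i in range(900)]
    + [{"channel": "store", "amount": i % 20} for i in range(100)]
)

sample = stratified_sample(records, key=lambda r: r["channel"], fraction=0.1)
web_share = sum(r["channel"] == "web" for r in sample) / len(sample)
print(len(sample), web_share)
```

The sample is a tenth of the original size, yet the share of each channel is the same as in the full data. Such a sample is only useful for as long as that share stays stable in the source data; if the mix of channels drifts during the analysis, the sample would have to be redrawn.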
The analytical pipeline is a highly automated processing architecture pattern in
which sets of homogeneous data are exposed to one or more analytical algorithms and
techniques (refer to analytical techniques in Figure 7-2). The main objective of the
analytical-pipeline processing approach is to process each data set through a series of
steps, preferably only once. Once the data set is processed, the results are analyzed
for their possible relevance to the business problem (see Figure 7-3).
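The pipeline idea can be sketched in a few lines, assuming each step is a plain function that takes the output of the previous step. This is not the architecture shown in the figures. The step names (`clean`, `summarize`), the `run_pipeline` helper, the sample data sets, and the relevance threshold are all hypothetical.

```python
from statistics import mean


def clean(values):
    """Step 1: drop missing readings."""
    return [v for v in values if v is not None]


def summarize(values):
    """Step 2: reduce the cleaned values to a small summary."""
    return {"count": len(values), "mean": mean(values)}


def run_pipeline(data_set, steps):
    """Pass one data set through every step, in order, exactly once."""
    result = data_set
    for step in steps:
        result = step(result)
    return result


# Made-up homogeneous data sets (same kind of values in each).
data_sets = {
    "q1": [10, None, 14, 12],
    "q2": [30, 34, None, 32],
}
steps = [clean, summarize]

# Each data set is processed once; results are collected for later review.
results = {name: run_pipeline(ds, steps) for name, ds in data_sets.items()}

# Relevance check, assuming the business question is "which data sets
# average above 20?" (an invented criterion for this sketch).
relevant = {name: r for name, r in results.items() if r["mean"] > 20}
print(results)
print(relevant)
```

Keeping the steps as an ordered list makes the single-pass constraint explicit: a data set enters `run_pipeline` once, and only the collected results are examined afterward for relevance to the business problem.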
 