Database Reference
In-Depth Information
Run Analytics on a Broader Variety of Data
Earlier in this chapter, we described a client who achieved enormous customer
support effectiveness by analyzing voice data from support conversations in
real time. The ability to incorporate newer varieties of data, such as voice, text,
video, and other unstructured data types, along with structured relational
sources, opens up possibilities for improving efficiencies and differentiation.
One of our retail clients is now correlating social media data with their point-
of-sale data in the data warehouse. Before launching a new brand, they know
what type of buzz it's generating, and they use that information to forecast
product sales by geography and to ensure that merchandise is stocked to that
level. They are running deep analytic queries on inventory levels and models
that require heavy computations.
The Big Data Platform Manifesto
To enable the key considerations of the analytic enterprise, it's important to
have a checklist of the imperatives of a Big Data platform—introducing our
Big Data platform manifesto. The limitations of traditional approaches have
resulted in failed projects, expensive environments, and nonscalable deploy-
ments. A Big Data platform has to support all of the data and must be able to
run all of the computations that are needed to drive the analytics. To achieve
these objectives, we believe that any Big Data platform must include the six
key imperatives that are shown in Figure 3-1.
1. Data Discovery and Exploration
The process of data analysis begins with understanding data sources, figuring
out what data is available within a particular source, and getting a sense of
its quality and its relationship to other data elements. This process, known as
data discovery , enables data scientists to create the right analytic model and
computational strategy. Traditional approaches required data to be physi-
cally moved to a central location before it could be discovered. With Big Data,
this approach is too expensive and impractical.
To facilitate data discovery and unlock resident value within Big Data, the
platform must be able to discover data “in place.” It has to be able to support
the indexing, searching, and navigation of different sources of Big Data. It
has to be able to facilitate discovery of a diverse set of data sources, such as
databases, flat files, content management systems—pretty much any persis-
tent data store that contains structured, semistructured, or unstructured data.
 
Search WWH ::




Custom Search