Chapter 1. Big Data, Analytics, and Data Science Life Cycle
Enterprise data has never been as prominent as it is today. One of the dominant
challenges of the current data influx in enterprises is establishing a future-proof
strategy for deriving meaningful insights that contribute tangibly to business growth.
This chapter introduces readers to the core aspects of Big Data, standard analytical
techniques, and data science as a practice in a business context. The chapters that
follow elaborate on these topics with a step-by-step implementation guide for using
Greenplum's Unified Analytics Platform (UAP).
This chapter covers the following topics:
• Enterprise data and its characteristics
• Context of Big Data—a definition and the paradigm shift
• Data formats such as structured, semi-structured, and unstructured data
• Data analysis, the need for it, and an overview of important analytical techniques
(statistical, predictive, mining, and so on)
• The philosophy of data science and its standard life cycle
Enterprise data
Before we take a deep dive into Big Data and analytics, let us understand the
important characteristics of enterprise data as a prerequisite.
Enterprise data is data viewed from a perspective that is holistic to an enterprise.
It is data that is centralized, integrated, or federated; stored using diverse
strategies; drawn from diverse sources (internal and/or external to the enterprise);
condensed and cleansed for quality; secure; and scalable.
In short, enterprise data is data that is seamlessly shared or available for
exploration, so that relevant information can be used appropriately to give the
enterprise a competitive advantage.