In many situations, when you are dealing with big data sources,
you may not find well-documented definitions associated with
data attributes. This is precisely why you should create a
minimum set of documentation covering the source, how you
accessed it, which access methods (APIs or direct downloads)
you used, which data cleansing methods you applied, which
security and privacy measures you applied to the data sets, and
where you are storing the raw data sets.
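One way to keep this minimum documentation consistent is to capture it as a small structured record. The following is a minimal sketch in Python, not a prescribed format: the class name `DataSourceRecord`, its field names, and the example values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field, asdict
from typing import List
import json


@dataclass
class DataSourceRecord:
    """Minimum documentation for one data source (illustrative field names)."""
    source: str                    # where the data comes from
    access_description: str        # how you accessed it
    access_method: str             # e.g. "API" or "direct download"
    cleansing_methods: List[str] = field(default_factory=list)
    security_privacy_measures: List[str] = field(default_factory=list)
    raw_storage_location: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of fields that are still empty."""
        return [name for name, value in asdict(self).items() if not value]

    def to_json(self) -> str:
        """Serialize the record so it can be stored next to the raw data."""
        return json.dumps(asdict(self), indent=2)


# Hypothetical example values, invented for this sketch.
record = DataSourceRecord(
    source="example clickstream feed",
    access_description="nightly batch pull with a service account",
    access_method="API",
)

# Lists which parts of the documentation have not been filled in yet.
print(record.missing_fields())
```

A record like this can be stored alongside the raw data set, and the `missing_fields` check gives a quick way to see which parts of the documentation still need to be written.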
Skills: Big data analytics solutions are intended to solve different kinds of problems, and
they require different kinds of skills (those of a data scientist) to accomplish the tasks. DBAs,
data integration specialists, and report development specialists usually are not expected to
be competent in collecting, merging, and analyzing data coming from a variety of sources; nor
are they expected to have the business acumen to understand the context of the data.
In big data scenarios, data scientists and data architects, rather
than database administrators, will be in demand to implement
distributed big data processing effectively: ingesting
and aggregating data from multiple sources and managing storage,
compute, and network resources to handle large data sets.
In Chapter 9, we will discuss in detail the skills needed to be successful in
developing and implementing big data analytics solutions.
Note
In the sections above we discussed what additional considerations need to be
put in place under the EIM framework to support big data analytics initiatives in your
organization. Big data analytics initiatives are very different in nature. Besides a robust
EIM framework, you will need to understand what capabilities need to be put in place to
optimally deliver big data analytics initiatives in your organization. What are those?
New capabilities needed for big data
Big data characteristics, especially velocity and variety, require us to deal
with the data and associated events as they happen. We can't afford latency because the
data becomes useless if you don't act when the events occur. In addition,
the analysis you perform on big data tends to be much more iterative. The
complexity of big data sets also demands better data visualization techniques; traditional
reporting and dashboard development approaches become tedious and yield
incomprehensible results. To move at the speed of business and
maintain competitive advantage, enterprise agility is becoming vital. This means that
business requirements need to be developed quickly. The organization should be
able to respond quickly to changing business conditions. More often than not,
the business will ask a question, which means data sets must be created quickly, analyzed, and
presented back to business users with possible answers. This further highlights the need for,
and the importance of, adopting agile methods for business intelligence and analytics.
 
 