[Figure: diagram labeled "Data Scientist" with four skill groups. Data Management: data
integration, data manipulation, data preparation. Analytics Techniques: algorithms, models,
discovery. Business Analysis: contextualize, business hypothesis, data visualization.
Other Skills: storytelling, collaboration, creativity, leadership.]
Figure 9-1. Data scientist skills
Data Management. At the heart of analytics is data, so robust data sets are needed
for deep analytic efforts. Data can be in disparate locations (internal and external), large
in volume, or streaming. Data scientists need to employ several approaches to develop the
relevant data sets for analysis. In many cases, data needs to be massaged and prepared to
reflect relationships and contexts that are not present in raw transactional data, so data
scientists need to be good at data integration, data manipulation, and data preparation.
In one of the earlier chapters we discussed the importance of data quality in preparing
data sets for analysis, so the data scientist also needs the skills to perform data
profiling, data validation, and data cleansing.
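As a rough illustration of the profiling, validation, and cleansing steps mentioned above, the following Python sketch works on a handful of made-up transactional records. The field names (order_id, customer, amount), the sample values, and the validation rules are all assumptions chosen for the example; they are not taken from the text, and real projects would typically use dedicated data quality tooling.

```python
from collections import Counter

# Hypothetical raw transactional records; field names and values are invented.
RAW_ROWS = [
    {"order_id": "1001", "customer": " Alice ", "amount": "25.50"},
    {"order_id": "1002", "customer": "bob", "amount": ""},
    {"order_id": "1002", "customer": "bob", "amount": ""},
    {"order_id": "1003", "customer": "Carol", "amount": "-4.00"},
]


def profile(rows):
    """Profiling: count missing (empty) values per field."""
    missing = Counter()
    for row in rows:
        for field, value in row.items():
            if value.strip() == "":
                missing[field] += 1
    return dict(missing)


def validate(row):
    """Validation: return the list of rule violations for one row."""
    problems = []
    if not row["order_id"].isdigit():
        problems.append("order_id is not numeric")
    try:
        if float(row["amount"]) < 0:
            problems.append("amount is negative")
    except ValueError:
        problems.append("amount is missing or not a number")
    return problems


def cleanse(rows):
    """Cleansing: trim whitespace, normalize names, drop duplicates and invalid rows."""
    seen = set()
    clean = []
    for row in rows:
        tidy = {field: value.strip() for field, value in row.items()}
        tidy["customer"] = tidy["customer"].title()
        key = tuple(sorted(tidy.items()))
        if key in seen or validate(tidy):
            continue  # skip exact duplicates and rows that break a rule
        seen.add(key)
        clean.append(tidy)
    return clean


if __name__ == "__main__":
    print(profile(RAW_ROWS))
    print(cleanse(RAW_ROWS))
```

In this sketch invalid rows are simply dropped; in practice they might instead be routed to a review queue or corrected against a reference source.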
Analytics Techniques. In the previous chapter we discussed several analytics
techniques. Depending on the business problem you are trying to solve and the type of data
available to you, a broader or narrower set of analytics techniques, algorithms, and
models will have to be developed. The data scientist needs to be skilled in the various
analytics techniques and processes.
Business Analysis. Understanding the business context behind the data is the most critical
skill a data scientist can possess, because if you do not understand the business attributes
of the data, you will not be able to leverage its value. In big data scenarios it is easy to
get lost in the discovery process when you are dealing with a vast volume or variety of data.
The data scientist must be able to distinguish "cool facts/analysis" from insights that will
matter to the business and to communicate those insights to business executives.
Beyond these three core skills, a data scientist should also possess several other soft
skills: storytelling, collaboration, creativity, and leadership.
Data scientists have the difficult job of formulating the right data sets; this means
they need to obtain access to the data, work with business users to contextualize the
data by associating business meanings with it, and then explain the findings of
their analysis to business stakeholders in a language they understand. In short, the data