Key Concepts
Data preparation
Model planning
Model execution
Communicate results
Data science projects differ from most traditional Business Intelligence projects and
many data analysis projects in that data science projects are more exploratory in
nature. For this reason, it is critical to have a process to govern them and ensure
that the participants are thorough and rigorous in their approach, yet not so rigid
that the process impedes exploration.
Many problems that appear huge and daunting at first can be broken down into
smaller pieces or actionable phases that can be more easily addressed. Having
a good process ensures a comprehensive and repeatable method for conducting
analysis. In addition, it helps focus time and energy early in the process to get a clear
grasp of the business problem to be solved.
A common mistake in data science projects is rushing into data collection and
analysis, which leaves too little time to plan and scope the amount of
work involved, understand requirements, or even frame the business problem
properly. Consequently, participants may discover mid-stream that the project
sponsors are actually trying to achieve an objective that may not match the available
data, or they are attempting to address an interest that differs from what has been
explicitly communicated. When this happens, the project may need to revert to the
initial phases of the process for a proper discovery phase, or the project may be
canceled.
Creating and documenting a process helps demonstrate rigor, which provides
additional credibility to the project when the data science team shares its findings.
A well-defined process also offers a common framework for others to adopt, so the
methods and analysis can be repeated in the future or as new members join a team.