Gutierrez:
What advice do you give to junior people at your company as
you and they create our future?
Lenaghan:
First and foremost, it is very important to be self-critical: always
question your assumptions and be paranoid about your outputs. That is the
easy part. In terms of skills that people should have if they really want to
succeed in the data science field, it is essential to have good software
engineering skills. So even though we may hire people who come in with very
little programming experience, we work very hard to instill in them very quickly
the importance of engineering, engineering practices, and a lot of good agile
programming practices. This is helpful to them and us, as these can all be applied
almost one-to-one to data science right now.
If you look at dev ops right now, they have things such as continuous
integration, continuous build, automated testing, and test harnesses, all of
which map very well from the dev ops world to the data ops (a phrase I stole
from Red Monk) world. I think this is a very powerful notion. It is important
to have testing frameworks for all of your data, so that if you make a code
change, you can go back and test all of your data. Having an engineering
mindset is essential to moving with high velocity in the data science world.
Reading Code Complete [1] and The Pragmatic Programmer [2] is going to get you
much further than reading machine learning topics, although you do, of course,
have to read the machine learning topics, too.
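As an illustration of the "testing frameworks for all of your data" idea, here is a minimal sketch of a data test harness in Python. It is not taken from the interview: the record fields (user_id, lat, lon), the normalize step, and the checks are all hypothetical, chosen only to show how a set of data checks could be re-run after every code change, much as unit tests are re-run in a continuous integration build.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Record = dict  # hypothetical record shape: one row of a data set


@dataclass
class Check:
    """A named rule that every record is expected to satisfy."""
    name: str
    predicate: Callable[[Record], bool]


def run_checks(records: Iterable[Record], checks: List[Check]) -> Dict[str, List[int]]:
    """Return a mapping from check name to the row indices that failed it."""
    failures: Dict[str, List[int]] = {c.name: [] for c in checks}
    for i, rec in enumerate(records):
        for c in checks:
            if not c.predicate(rec):
                failures[c.name].append(i)
    return failures


def normalize(rec: Record) -> Record:
    """Hypothetical transformation under test: parse coordinates as floats."""
    return {
        "user_id": rec["user_id"],
        "lat": float(rec["lat"]),
        "lon": float(rec["lon"]),
    }


# Hypothetical expectations about the cleaned data.
CHECKS = [
    Check("user_id present", lambda r: bool(r.get("user_id"))),
    Check("lat in range", lambda r: -90.0 <= r["lat"] <= 90.0),
    Check("lon in range", lambda r: -180.0 <= r["lon"] <= 180.0),
]

# Tiny made-up sample; a real harness would read a fixed reference data set.
SAMPLE = [
    {"user_id": "a1", "lat": "40.7", "lon": "-74.0"},
    {"user_id": "", "lat": "95.2", "lon": "10.0"},
]

if __name__ == "__main__":
    cleaned = [normalize(r) for r in SAMPLE]
    for name, rows in run_checks(cleaned, CHECKS).items():
        status = "OK" if not rows else f"FAILED on rows {rows}"
        print(f"{name}: {status}")
```

The design choice worth noting is that the checks are data, not code scattered through the pipeline: the same list can be run against the output of any revision of normalize, so a code change that silently breaks the data shows up as a failed check rather than as a wrong result downstream.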
Gutierrez:
So knowing machine learning is the pass to get inside of the door
and then, once inside the door, knowing the engineering practices is what sets
you apart?
Lenaghan:
Yes, in terms of the importance of everyday practice, you should not
underestimate engineering. And a lot of people do. A lot of the people we
interview, even very senior people, just take some cleansed data sets and
run some R packages on them. To really succeed, having an engineering mindset is
important. I would say that having an analytical mindset is the most important,
then having good hygienic engineering practices, and then having the tools.
Where things get messed up is when you have the skillsets inverted—that is,
when you just have tools that you rely on and you basically apply them blindly
without good dev ops or engineering practices and without any critical
thinking. The consolidation of programming libraries and practices is very good, but
the tools and the packaged libraries only serve you if you first have the critical
thinking skills and the engineering practices.
[1] Steve McConnell, Code Complete, 2nd Ed. (Microsoft Press, 2004).
[2] Andrew Hunt and Dave Thomas, The Pragmatic Programmer (Addison-Wesley, 1999).