Smallwood: I think it's important to look for experience—especially if you're
starting a new team. You can't take someone without experience, even if they
were valedictorian of their PhD program at MIT. I still wouldn't take that person
as my only or first data scientist, because they haven't worked enough.
I think the education is great—don't get me wrong, education is fantastic.
But in today's world, where people are working on more practical problems,
the actual experience of wrestling through one model after another matters
for the only or first data scientist hire. Especially if the experience happened
under very different data circumstances, different distributions of the underlying
data, and different data characteristics. You also want to see experience
with missing data, duplicate data, and all the challenges that you actually face
with raw collected data. And that's just on the data side.
On the modeling side, you also want to see experience in thinking about
whether you're solving the right thing, and then learning from the business
perspective that it's completely impractical and you solved the wrong thing.
You have to go through all of those experiences to really build up the beefier
level of experience.
I think if you've found someone with that experience, even just one great
person like that, you're set. Because then you can hire some junior-level people
and teach them along the way. But you've got to have at least one person
who's actually experienced.
Gutierrez: What is one problem you think data scientists need to fix?
Smallwood: That is a really hard question, as I feel like all of the important problems
that can be tackled with data are already starting to be tackled by people. In
the future, perhaps, there may be other new important problems, but at this point
people are looking at the current important problems. But I will say that to me
the most valuable things to improve are things where data is shared more openly
across the industry, regardless of what industry you're talking about.
If I were to pick one thing that needs fixing in this space, it is that health care
data sharing is moving way more slowly than it should. If we could somehow figure
out how to speed up privacy-protected information sharing across all that
great data that's collected, I'm sure we could make more progress on diagnosing
things, even self-diagnosis. Especially with more ubiquitous open sharing of information
about symptoms and what they can be connected to health-wise.
Anything where you're sharing data across consumers and businesses
within the same industry, I think is super beneficial. I love what's been
done in microphilanthropy. The work in this area is very cool. There's a ton
of environmental work that's going on that's also really cool. So I think that
there's real opportunity everywhere with today's ability to collect data and
then share it.
 