Perlich: Correct. In our daily work, we have this type of conversation often:
“We have a tricky problem. Let's get somebody to think about how we should
solve it.” This work and their presentation let us get a sense of their smell test,
critical thinking, communication, and how they drive the process to under-
standing the data. It also lets us see how they ask questions. In particular we
are interested in what questions they ask about what else they need to know
about the problem, what are the constraints, what is the environment in which
the problem happens, and what is relevant about the industry or the specific
setting. Of course, we are also very interested in how they come up with ideas
for solving the problem. We think it is a good process, and so far the people
we have interviewed seem to have liked it.
Gutierrez: Is it hard for a company without data scientists to hire data
scientists?
Perlich: Yes, and I am not even talking about convincing one to join you and
the cost associated with it. Companies, with the help of some vendors, may
have started collecting data. Now that the companies have data or will soon be
looking at data, they have to go out and hire someone to do things with data.
But what then happens when looking for data scientists is that “data scientist”
is a completely undefined job description. The biggest issue for companies is
that if they were to try to hire a data scientist, they would not even know how
to tell if they were interviewing one, because they do not really know what
“data scientist” means. As long as they do not have one data scientist recognizing
a second one, it is actually quite a daunting task. This is perhaps a little too negative,
but it seems to me that many data scientists basically just changed the label on
their résumé. For some of them it makes sense, but for some it does not. It is a
major issue that data scientists have no agreed-upon skill set as of today. If you
hire a database administrator, you know what you are getting. Today, if you hire
a data scientist, you do not know what you are getting.
Gutierrez: Is it hard for a company with data scientists to hire data
scientists?
Perlich: It is easier because the second is more likely to accept the job if
there is a first and because you have somebody who can evaluate the candidate
to some extent, but it is still far from easy. One of the big concerns with
data science is what I call “quality control.” If I build a model, I have a good gut
guess of the overall quality of the model. However, I cannot tell you how good
it is in the sense that I do not know if my working on it for another week will
increase the performance by 5 percent or by 50 percent. I have a gut feeling
about this, but the reality is that there is a noise part that comes from the
data that has nothing to do with the algorithm. This makes it extremely hard
to really know where you stand on your own model's quality. I am pretty okay at
predicting model quality for my own models based on knowing how much
time I have spent, how much I have explored, and how much is left on the table
that I have not looked at.
 