Perlich: Yes, because—depending on what the goal and context are—the
answer could be completely different. The problem is that many people think
they know enough about data to talk about averages and other statistical
measures, which makes them feel like they can ask meaningful questions. The
problem is that in the real world, data is not like data they saw in classrooms
and in textbooks. They do not realize that they do not know enough about how
distributions and other statistical measures really work. If they knew, they
would realize that single-number aggregates are almost always completely
useless. And that is the conversation part I need to have, because I am not
going to give you a number until I know what you want to do. You may be
solving the right problem; you may just not know how to ask the right
question. It is not even that the problem is the wrong one; you might just be
unable to formulate it in a meaningful way when you hand it over.
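As a rough illustration of why a single aggregate can mislead on long-tailed data, here is a minimal sketch. The purchase amounts and the `summarize` helper are invented for this example and do not come from the interview.

```python
from statistics import mean, median

# Hypothetical purchase amounts (made up for illustration): most customers
# spend a little, and one customer spends a great deal.
purchases = [5] * 90 + [20] * 9 + [10_000]

def summarize(values):
    """Return the mean, the median, and the share of values below the mean."""
    avg = mean(values)
    below = sum(1 for v in values if v < avg) / len(values)
    return {"mean": avg, "median": median(values), "share_below_mean": below}

summary = summarize(purchases)
print(summary)
```

With these made-up numbers the mean is 106.3 while the median is 5, and 99 percent of customers spend less than the "average." Which figure is useful depends on what the person asking wants to do with it.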
Gutierrez: How do you approach the issue of how non-data scientists ask
questions of you and the data?
Perlich: You cannot expect everybody to become data scientists. I do not
really need to have my CEO or other management people understand the
intricacies of long-tailed distributions. I think there are other core
competencies that are more important for the management to have and develop. The
most important thing is getting out of the silo mode you saw historically in
many institutions that had statisticians. In these institutions, you would have
the statisticians sitting in the basement or somewhere far away from the
business units. Nobody knew them and they were kept completely separate. You
would just order a report from them and it would come back to you with an
answer. You would then make whatever decisions you wanted based on those
reports. The only times I have seen things work out well when working with
statisticians and data scientists were when the problem-solving was done in teams,
where you actually had enough face time to have a conversation with the
person who executes it in the end.
What helps with education and with data literacy is that you do not spend
three days agreeing on the vocabulary every time you discuss something. At
least we have to know how to talk to each other. When you ask me for the
average, having a common language of what that means before we start the
conversation makes a huge difference. Otherwise, people walk away after a
few minutes if I do not get to the point or understand what they are asking.
This means that the business side should probably have some high-level
analytical understanding. They do not need to be able to do it, but they need to
know enough of the vocabulary to understand what their techie or data
scientist is telling them. In the same way, forcing techies or data scientists to
actually talk to the practitioners is great. It is not the worst thing in the world to
have them actually learn what matters on the business side and the common
 