My progression to realizing that I really wanted to work with data was that I
was working on many different software products and services and, to make
better decisions, I had to look into the feedback, the logs, and measurements
of what the system was doing—basically all of the data. As I looked into the
data, I found that I had to start to do it in a smart way, since there was so
much data. Obviously, I ended up learning statistics, machine learning methods,
and different data mining methods. I also started thinking about how to make
services and products intelligent enough to automatically use this information
and how to transform the organization so it can more quickly apply the knowledge
generated from these systems.
At Skype, that is what I did. I improved the engineering and calibration tools
so that bugs got fixed faster. I made it so that the roadmap priorities were
better aligned across the organization, so that the tools used for engineering and
calibration were in tune with what was required of them in the wild. As I did
this, there was a natural progression from development to data, since there
are so many insights you can achieve and so many decisions that you can make
if you use all of the information available. Without it, it's just your intuition
guiding you, which doesn't work as well.
This is an interesting point actually. Some say that intuition is actually a better
thing to use to be able to develop something revolutionary or disruptive when
trying to come up with something new, rather than looking into the past, into
a mirror, or into the data. Quite often, you don't come up with something new
when you look to the past. You just do incremental improvements, optimizations,
and make something more robust. So this is a big challenge and question
for me: how much to look into the data and prior knowledge versus just
creating something on my own. So now, through this evolution, I get to live
in this interesting place with my feet in two different communities: software
development and data science.
Gutierrez: Do you remember the first data set you worked with?
Karpištšenko: There have been so many that it's hard to remember. I mean,
my computers are full of different data sets. The earliest ones were the easiest
ones, which, of course, I analyzed in Excel, as they just had some qualitative
labels for me to use. Rather than the first one, I'll talk about the one that I
think is the most meaningful early work I did for myself.
In 2008 or so, when I was looking into all my communication patterns in
email and instant messaging, I was annoyed by the fact that I had so many
contacts, close to 700 people, trying to connect to me. At times, I couldn't even
remember who they all were or what they wanted, which meant I had a long
list of unread emails and unanswered instant messages. I had to decide what
was relevant and what was not. In many cases, I had just been CC'ed and I
wasn't supposed to take any action. In other cases, however, I was supposed to
take action.