Amy Heineike - Data Scientists at Work

Gutierrez: What have you been working on in the last year?

Heineike: This has been an exciting year, as we're getting to the stage of really

scaling up what we're delivering. This means that we've been very focused on

increasing the quality of the overall experience for users, as well as leveraging

more datasets. I've spent a lot of time working closely with our production

software engineers to help define how we interpret new data streams—

understanding what's there, what is useful, where there are data issues, and

how we can address them. We've also worked to improve the quality of each

of our main algorithms, to figure out if and how they need to be tailored to

each dataset, and to respond to feedback from our users.

The data science team has grown quickly this year too, so we've brought on

board people with expertise in different parts of our stack who've been able

to go deeper on each piece.

Gutierrez: Why is scaling up important for Quid?

Heineike: Getting to scale the product means the business is growing, which

is obviously important. It's also just really exciting, though, to get to put the things

we've been building internally in front of a lot of people. The genesis

of our product came from our own curiosity to explore what was happening

in these different domains (in emerging technologies, in important global

conversations) and to push the tools to see how easy we could make this

for users.

On the one hand then, it's amazing to now see other people's curiosity also

be satisfied, and to see them asking and answering really diverse types of

questions. Our users are often very creative and smart people, and so there is a

real energy when they get into the product.

On the other hand, as a data scientist I'm fascinated by how we can

democratize the feeling of having the ability to really drill into and explore data. It's

one thing to be able to write a script that can pull insights from a specific data

set that address a specific question, and it takes a lot of technical knowledge

to do that—it's another thing altogether to productize the essence of that,

so that other people can drive it. Giving that control to people with the best

questions is really exhilarating.

Gutierrez: What specific news sources are you looking at?

Heineike: We are using a really wide corpus of news. We gather and index

about 1,500,000 news articles a day from global sources. These news articles

represent a broad selection of what is being generated, because the data

goes from mainstream news to unique, really personal blogs, like people's

WordPress accounts, with technical sources and everything else in between.
