Another part of the day is spent very carefully planning and thinking about
the type of infrastructure we build. Now that we have a data science team at
MailChimp, it's no longer just about building out the infrastructure that
keeps the application running. That's obviously priority number one. But now
we have all of these people who want to access the data for analysis. This
type of access involves very different patterns than our users accessing their
own data, so we need to think carefully about that. People run for the hills
whenever I execute an SQL query, because the queries that are run from the
application only hit indices, whereas the ones that I run involve a whole lot of
joining or subqueries.
So when I come along and I've got something that's some crazy nested thing
with a million joins, and then God help us, there's a cross-join in there and
I'm blowing some data set out, and I'm doing window functions on it in SQL
and things like that, I make all sorts of stoplight charts turn red on our NOC
[Network Operations Center] wall. And that's not fun for them and that's not
fun for me. So there are definitely discussions that have to occur during the
day around what systems we're building and how they're going to be used.
We have to make space for analysts to hit these boxes as hard as they want or
hit different data sets as hard as they want. It's a back and forth: What data
do you need? What data can we provide you? What are your needs? Are you
building something for production or building something for analysis? Are you
okay with a query running for a really long time? Does it need to run super
quick? So there's just a piece of my day that involves technical requirements.
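
To make the contrast between the two access patterns concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and row counts are hypothetical and chosen only for illustration; nothing here reflects MailChimp's actual schema or database. It compares an application-style lookup, which touches one row through the primary key, with an analyst-style cross join, which pairs every row of one table with every row of the other.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscribers (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE TABLE campaigns (id INTEGER PRIMARY KEY, subject TEXT)")

conn.executemany(
    "INSERT INTO subscribers (email) VALUES (?)",
    [(f"user{i}@example.com",) for i in range(1000)],
)
conn.executemany(
    "INSERT INTO campaigns (subject) VALUES (?)",
    [(f"Newsletter {i}",) for i in range(200)],
)

# Application-style query: a single-row lookup served by the primary key.
app_row = conn.execute(
    "SELECT email FROM subscribers WHERE id = ?", (42,)
).fetchone()

# Analyst-style query: a cross join pairs every subscriber with every
# campaign, so the intermediate result has 1000 * 200 rows.
cross_rows = conn.execute(
    "SELECT COUNT(*) FROM subscribers CROSS JOIN campaigns"
).fetchone()[0]

print(app_row[0], cross_rows)
```

Even at this toy scale the cross join produces 200,000 rows from 1,200 inputs, which is the kind of growth that turns monitoring charts red when the tables are production-sized.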
Another part of the day is spent speaking to executives and other
customer-facing teams to ask them what they are hearing from users in terms of
features that they want. Then we think about whether data can be brought
to bear to make any of those requested features possible. One requested
feature we recently did was send-time optimization. We spent a lot of time
discussing how to use our data to appropriately predict and optimize when
a customer should send their newsletter to best engage their audience. One
of the things that drives that engagement might be the time zone of the email
receiver. Everyone's different and everyone's on a different schedule, but they
might cluster around a few different specific times of the day, especially based
on whether they're on one side of the globe or another, as most people work
day shifts and most people sleep at night. Those are obvious things. But then
it gets way more complex from there. So we talked to executives and our
qualitative research team to really understand if people wanted this, and then
figured out what data was going to be needed and, if we did build it, how we
thought it should be presented to the user. We spent a good amount of time
making sure that if we invested the time and energy into the project, it
would be a good investment.
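
The time-zone idea can be sketched in a few lines of Python. This is only an illustration of the simplest version of the reasoning, not the method MailChimp built: the open events, the UTC offsets, and the `local_hour` helper are all hypothetical. Each open time is converted to the receiver's local clock, and the opens are then counted by local hour to see where they cluster.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical open events: (UTC time of the open, receiver's UTC offset in hours).
opens = [
    (datetime(2014, 3, 3, 14, 5, tzinfo=timezone.utc), -5),   # 09:05 local
    (datetime(2014, 3, 3, 8, 40, tzinfo=timezone.utc), 1),    # 09:40 local
    (datetime(2014, 3, 3, 19, 15, tzinfo=timezone.utc), -5),  # 14:15 local
    (datetime(2014, 3, 4, 1, 30, tzinfo=timezone.utc), 8),    # 09:30 local
    (datetime(2014, 3, 4, 13, 10, tzinfo=timezone.utc), 1),   # 14:10 local
]


def local_hour(opened_at_utc, utc_offset_hours):
    """Return the hour of day on the receiver's local clock."""
    receiver_tz = timezone(timedelta(hours=utc_offset_hours))
    return opened_at_utc.astimezone(receiver_tz).hour


# Count opens by local hour and pick the hour with the most opens.
hour_counts = Counter(local_hour(ts, offset) for ts, offset in opens)
best_hour, best_count = hour_counts.most_common(1)[0]

print(best_hour, best_count)
```

Opens that look scattered in UTC line up once they are expressed in local time, which is the "obvious" part described above. A real system would have to go well beyond a single most-common hour.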
 