They do it in a spreadsheet because there's nowhere to run or hide. You're
doing every single step yourself and you can see how it works. I did it this way
for a couple of reasons. One is to show people that it's not black magic. Data
science and machine learning have a lot of really amazing words around them
that make them sound like Terminator 2, where robots are going to come and
destroy the Earth. When you look at something like a boosted trees model,
however, it's relatively cutting-edge but not that hard. I think someone with a
year of freshman college math could build that model and understand
everything about it.
So I wrote the book because I wanted to take people through these things in
great detail to help build their confidence and so that they understand they
can prototype in them and not be afraid of them. I want them to see how even
though these models are easy or silly, they still get the job done amazingly
well. I want to broaden the conversation. I feel like the way a lot of people
talk about this stuff is just so mystical. They don't really want to tell you what
they're doing because their job security is wrapped up around being some
sort of shaman-like persona. But that's not what your job security should
really be based on—it should be based on solving problems. If you're solving
problems appropriately and you can explain yourself well, you're not going to
lose your job. You don't have to hide behind the fact that no one else knows
what this model does.
So that was the purpose of this book. It's for the people who are gluttons
for pain and who really want to understand how these models work. You
can work the spreadsheets with me. And by the end, if you survive, then
you're really going to know this stuff. You'll know how to talk about it
intelligently. So I think it's very fun for a particular kind of reader. It's not bedtime
reading. You're not going to read it in bed. It's much more about sitting at
your desk, leaning forward, and doing it step-by-step as I walk you through
it. And so I thought it was a pretty cool addition to a lot of the other books
already out there.
Gutierrez: Why did you choose Excel as the tool to use to learn the techniques?
Foreman: There are already a lot of books on R, Python, and similar
programming languages. Unfortunately, there are a lot of people in the enterprise
space that don't do R and don't do Python. If you look at accounting, or
finance, or the government, what you'll see is that for a lot of these places,
the analytics system of record is freaking Excel spreadsheets and VBA macros.
This means that books using the aforementioned programming languages are
leaving these people behind.
 