building blocks of the solution are situations where you do want to
understand causality, when you want to be able to say that a certain
type of behavior causes a certain outcome. In these cases your mentality
or goal is not to optimize for predictive accuracy, but rather to
be able to isolate causes.
This chapter will explore the topic of causality, and we have two experts
in this area as guest contributors, Ori Stitelman and David Madigan.
Madigan's bio appears in the next chapter, which requires this chapter as
background. We'll start instead with Ori, who is currently a data scientist
at Wells Fargo. He got his PhD in biostatistics from UC Berkeley
after working at a litigation consulting firm. As part of his job, he
needed to create stories from data for experts to testify at trial, and he
thus developed what he calls “data intuition” from being exposed to
tons of different datasets.
Correlation Doesn't Imply Causation
One of the biggest statistical challenges, from both a theoretical and
practical perspective, is establishing a causal relationship between two
variables. When does one thing cause another? It's even trickier than
it sounds.
Let's say we discover a correlation between ice cream sales and bathing
suit sales, which we display by plotting ice cream sales and bathing suit
sales over time in Figure 11-1.
This demonstrates a close association between these two variables, but
it doesn't establish causality. Let's look at this by pretending to know
nothing about the situation. All sorts of explanations might work here.
Do people find themselves irresistibly drawn toward eating ice cream
when they wear bathing suits? Do people change into bathing suits
every time they eat ice cream? Or is there some third thing (like hot
weather) which we haven't considered that causes both? Causal inference
is the field that deals with better understanding the conditions
under which association can be interpreted as causality.
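To make the "third thing" explanation concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the temperature range, the sales coefficients, the noise levels, and the helper names pearson and residuals. It is not the data behind Figure 11-1. Temperature drives both sales series, and neither series affects the other. The raw correlation between the two sales series comes out strong, but once we remove the part of each series explained by temperature, almost nothing is left.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures (the confounder), in degrees Celsius.
temperature = [rng.uniform(0, 35) for _ in range(2000)]

# Assumed data-generating process: both sales series depend on temperature
# plus independent noise; neither depends on the other.
ice_cream = [10 + 3 * t + rng.gauss(0, 10) for t in temperature]
bathing_suits = [5 + 2 * t + rng.gauss(0, 10) for t in temperature]

# Association we would see if we ignored temperature.
raw = pearson(ice_cream, bathing_suits)

# Association after adjusting both series for temperature
# (the partial correlation given temperature).
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(bathing_suits, temperature))

print(f"raw correlation:      {raw:.2f}")
print(f"adjusted correlation: {adjusted:.2f}")
```

In this sketch we get to adjust for temperature only because we built the data and know it is the common cause. With real data, an unmeasured confounder cannot be adjusted away like this, which is part of why establishing causality is so tricky.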
Asking Causal Questions
The natural form of a causal question is: What is the effect of x on y?
Some examples are: “What is the effect of advertising on customer behavior?”
or “What is the effect of drug on time until viral failure?” or,
in the more general case, “What is the effect of treatment on outcome?”