right-hand side of Figure 3-5 that for a fixed value of x = 5, there is
variability in the time spent on the site. You want to capture this
variability in your model, so you extend your model to:
y = β₀ + β₁x + ϵ
where the new term ϵ is referred to as noise, which is the stuff that you
haven't accounted for by the relationships you've figured out so far. It's
also called the error term: ϵ represents the actual error, the difference
between the observations and the true regression line, which you'll
never know and can only estimate with your βs.
One often makes the modeling assumption that the noise is normally
distributed, which is denoted:
ϵ ∼ N(0, σ²)
Note this is sometimes not a reasonable assumption. If you
are dealing with a known fat-tailed distribution, and if your
linear model is picking up only a small part of the value of
the variable y, then the error terms are likely also fat-tailed.
This is the most common situation in financial modeling.
That's not to say we don't use linear regression in finance,
though. We just don't attach the “noise is normal” assumption
to it.
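As a rough illustration of why the normality assumption can fail, the sketch below compares normal noise with fat-tailed noise by counting draws that land more than four sample standard deviations from the mean. The Student-t distribution with 3 degrees of freedom is only a stand-in for "a known fat-tailed distribution"; that choice, the sample size, and the helper names student_t and count_extremes are assumptions made for this example, not anything taken from financial data.

```python
import math
import random
import statistics


def student_t(df, rng):
    """One draw from a Student-t distribution with df degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def count_extremes(values, k=4.0):
    """Count values more than k sample standard deviations from the sample mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(1 for v in values if abs(v - mean) > k * sd)


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20_000

normal_noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
fat_noise = [student_t(3, rng) for _ in range(n)]  # df = 3 is an arbitrary choice

normal_extremes = count_extremes(normal_noise)
fat_extremes = count_extremes(fat_noise)

print("draws beyond 4 standard deviations, normal noise:", normal_extremes)
print("draws beyond 4 standard deviations, fat-tailed noise:", fat_extremes)
```

Under normal noise such extreme draws should be very rare, while the fat-tailed sample should produce many more of them, which is the behavior the "noise is normal" assumption would miss.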
Under the normality assumption on the noise, this model says that,
for any given value of x, the conditional distribution
of y given x is p(y | x) ∼ N(β₀ + β₁x, σ²).
So, for example, among the set of people who had five new friends this
week, the amount of time they spent on the website has a normal
distribution with a mean of β₀ + β₁ · 5 and a variance of σ², and you're
going to estimate your parameters β₀, β₁, σ from the data.
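To make the conditional statement concrete, here is a minimal simulation sketch. The parameter values (an intercept of 10, a slope of 2.5, and a noise standard deviation of 3) and the function name simulate_time_spent are made up for illustration and are not estimates from any data set. The code draws many observations at x = 5 and compares their sample mean and variance with β₀ + β₁ · 5 and σ².

```python
import random
import statistics

# Hypothetical "true" parameters, chosen only for illustration.
BETA0 = 10.0  # intercept
BETA1 = 2.5   # slope: extra time spent per new friend
SIGMA = 3.0   # standard deviation of the noise term


def simulate_time_spent(x, n, rng):
    """Draw n values of y = BETA0 + BETA1 * x + eps, with eps ~ N(0, SIGMA^2)."""
    return [BETA0 + BETA1 * x + rng.gauss(0.0, SIGMA) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
ys = simulate_time_spent(x=5, n=50_000, rng=rng)

sample_mean = statistics.fmean(ys)
sample_var = statistics.variance(ys)

print(f"model mean at x = 5: {BETA0 + BETA1 * 5:.2f}, sample mean: {sample_mean:.2f}")
print(f"model variance: {SIGMA ** 2:.2f}, sample variance: {sample_var:.2f}")
```

With this many draws the sample mean and variance should sit close to the model values. In practice the situation is reversed: you observe the y values and have to work back to the parameters.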
How do you fit this model? How do you get the parameters β₀, β₁, σ
from the data?