This generalizes to any model as well: you plot the cumulative sum
of the product of demeaned forecast and demeaned realized. (A
demeaned value is one where the mean's been subtracted.) In other
words, you see if your model consistently does better than the
“stupidest” model of assuming everything is average.
If you plot this and you drift up and to the right, you're good. If it's too
jaggedy, that means your model is taking big bets and isn't stable.
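As a rough sketch of the plot described above, the curve can be computed in a few lines of NumPy. The function name `cumulative_pnl_curve` and the simulated data are made up for illustration, and the full-sample mean is used for demeaning, which is an assumption (in live use you would only know the mean of past data).

```python
import numpy as np

def cumulative_pnl_curve(forecast, realized):
    """Cumulative sum of (demeaned forecast) * (demeaned realized)."""
    forecast = np.asarray(forecast, dtype=float)
    realized = np.asarray(realized, dtype=float)
    f = forecast - forecast.mean()   # subtract the mean forecast
    r = realized - realized.mean()   # subtract the mean realized value
    return np.cumsum(f * r)

# Hypothetical data: a weak signal buried in noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=1000)
realized = 0.05 * signal + rng.normal(size=1000)
curve = cumulative_pnl_curve(signal, realized)
```

Plotting `curve` against time gives the picture in question: a steady drift up and to the right is what you hope for, and large jumps come from single periods where the forecast and the outcome were both far from their means.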
Why Regression?
So now we know that in financial modeling, the signal is weak. If you
imagine there's some complicated underlying relationship between
your information and the thing you're trying to predict, get over
knowing what that is—there's too much noise to find it. Instead, think
of the function as possibly complicated, but continuous, and imagine
you've written it out as a Taylor Series. Then you can't possibly expect
to get your hands on anything but the linear terms.
Don't think about using logistic regression, either, because you'd need
to ignore size, which matters in finance: it matters if a stock went
up 2% instead of 0.01%. Logistic regression forces you to have an
on/off switch, which would be possible but would lose a lot of
information. Given that we are always in a low-information
environment, this is a bad idea.
Note that although we're claiming you probably want to use linear
regression in a noisy environment, the actual terms themselves don't
have to be linear in the information you have. You can always take
products of various terms as x's in your regression, but you're still
fitting a linear model in nonlinear terms.
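To make that concrete, here is a minimal sketch that adds a product term as an extra column and still fits by ordinary least squares. The data, the coefficients, and the variable names are all invented for illustration, and the noise is kept far smaller than anything you'd see in finance so the recovered coefficients are easy to read.

```python
import numpy as np

# Simulated inputs (hypothetical, for illustration only).
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# The true relationship includes a product term x1 * x2.
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x1 * x2 + 0.1 * rng.normal(size=n)

# The design matrix has a nonlinear column, but the model is still
# linear in the coefficients, so ordinary least squares applies.
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Here `beta` holds the intercept and the coefficients on `x1`, `x2`, and `x1 * x2`, in that order. The fit never needs to know that the last column was built from the other two.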
Adding Priors
One interpretation of priors is that they can be thought of as opinions
that are mathematically formulated and incorporated into our models.
In fact, we've already encountered a common prior in the form of
downweighting old data. The prior can be described as “new data is
more important than old data.”
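One way to express that prior in code is to weight each observation by its age when fitting the regression. The sketch below assumes exponential decay with a chosen half-life; the function names and the half-life parameterization are illustrative choices, not something the text prescribes.

```python
import numpy as np

def decay_weights(n, half_life):
    """Weights for n observations ordered oldest to newest.

    The newest point gets weight 1; a point half_life steps older gets 0.5.
    """
    age = np.arange(n - 1, -1, -1)   # age 0 is the newest observation
    return 0.5 ** (age / half_life)

def weighted_least_squares(X, y, w):
    """Minimize sum_i w_i * (y_i - X_i @ beta)**2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sw = np.sqrt(np.asarray(w, dtype=float))
    # Scaling rows by sqrt(w) turns the weighted problem into plain least squares.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta
```

A short half-life means the fit reacts quickly to new data but is noisier; a long one is steadier but slower to notice change.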
Besides that one, we may also decide to consider something like
“coefficients vary smoothly.” This is relevant when we decide, say, to use
a bunch of old values of some time series to help predict the next one,
giving us a model like: