Figure 6-7. In-sample should come before out-of-sample data in a
time series dataset
One final point on causal modeling and in-sample versus out-of-sample
data: this setup is consistent with production code, because in both
training and the out-of-sample simulation we are always acting as if
we're running our model in production and seeing how it performs. Of
course, we fit our model in sample, so we expect it to perform better
there than in production.
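As a minimal sketch of the in-sample versus out-of-sample idea, the Python below splits a time-ordered series so that the in-sample portion strictly precedes the out-of-sample portion, fits a one-coefficient model in sample only, and measures error on both pieces. The helper names (`chronological_split`, `fit_slope`, `mean_squared_error`), the 70/30 split, the no-intercept linear model, and the synthetic data are all assumptions made for illustration; they are not taken from the text.

```python
import random


def chronological_split(series, train_fraction=0.7):
    """Split a time-ordered sequence; in-sample strictly precedes out-of-sample."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def fit_slope(xs, ys):
    """Least-squares slope for the no-intercept model y = beta * x."""
    denom = sum(x * x for x in xs)
    if denom == 0:
        raise ValueError("cannot fit a slope when every x is zero")
    return sum(x * y for x, y in zip(xs, ys)) / denom


def mean_squared_error(beta, xs, ys):
    """Average squared prediction error of y = beta * x."""
    return sum((y - beta * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic, time-ordered data (hypothetical; index order stands in for time).
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(200)]
ys = [0.5 * x + rng.gauss(0, 1) for x in xs]

# No shuffling: earlier observations are in sample, later ones out of sample.
x_in, x_out = chronological_split(xs)
y_in, y_out = chronological_split(ys)

beta = fit_slope(x_in, y_in)                      # fit in sample only
mse_in = mean_squared_error(beta, x_in, y_in)     # typically optimistic
mse_out = mean_squared_error(beta, x_out, y_out)  # closer to production

print(f"beta={beta:.3f}  in-sample MSE={mse_in:.3f}  out-of-sample MSE={mse_out:.3f}")
```

The point of the design is that the split is positional rather than random: a shuffled split would let the fit see observations from the future of the data it is later scored on.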
Another way to say this is that, once we have a model in production,
we will have to make decisions about the future based only on what
we know now (that's the definition of causal), and we will want to
update our model whenever we gather new data. So the coefficients of
our model are living organisms that continuously evolve, just as they
should. After all, we're modeling reality, and things really change
over time.
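One way to picture coefficients that evolve as new data arrives is a walk-forward loop: at each time step the model predicts using only what it has already seen, and only then folds the new observation into the fit. The sketch below is illustrative rather than the author's method. It assumes a no-intercept linear model, an expanding window, and that the predictor `x` at time `t` is known before the outcome `y` at time `t` is observed. The function name `walk_forward_predictions` and the `min_history` parameter are hypothetical.

```python
def walk_forward_predictions(xs, ys, min_history=20):
    """Causal, expanding-window fit of y = beta * x.

    Returns (betas, preds). The coefficient used at time t is estimated
    from observations strictly before t, so no future data leaks in.
    """
    betas, preds = [], []
    sum_xx = 0.0  # running sum of x * x over past observations
    sum_xy = 0.0  # running sum of x * y over past observations

    for t, (x, y) in enumerate(zip(xs, ys)):
        # Predict first, using only the history gathered so far.
        if t >= min_history and sum_xx > 0:
            beta = sum_xy / sum_xx
            betas.append(beta)
            preds.append(beta * x)

        # Only after predicting do we "observe" y and update the model.
        sum_xx += x * x
        sum_xy += x * y

    return betas, preds


# Tiny usage example: the outcome is exactly twice the predictor.
betas, preds = walk_forward_predictions([1, 2, 3, 4], [2, 4, 6, 8], min_history=2)
print(betas, preds)
```

Because each prediction is made before the corresponding outcome enters the running sums, changing a later outcome can never alter an earlier coefficient, which is the causal property the text describes.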
Preparing Financial Data
We often “prepare” the data before putting it into a model. Typically
the way we prepare it has to do with the mean or the variance of the
data, or sometimes the data after some transformation like the log (and
 