Time Stamps and Financial Modeling - Doing Data Science

Databases Reference

In-Depth Information

all causal, so you have to be careful when you train your overall model

how to introduce your next data point and make sure the steps are all

in order of time, and that you're never ever cheating and looking ahead

in time at data that hasn't happened yet.

In particular, and it happens all the time, one can't normalize by the

mean calculuated over the training set. Instead, have a running esti‐

mate of the mean , which you know at a given moment, and normalize

with respect to that.

To see why this is so dangerous, imagine a market crash in the middle

of your training set. The mean and variance of your returns are heavily

affected by such an event, and doing something as innocuous as a

mean estimate translates into anticipating the crash before it happens.

Such acausal interference tends to help the model, and could likely

make a bad model look good (or, what is more likely, make a model

that is pure noise look good).

Log Returns

In finance, we consider returns on a daily basis. In other words, we

care about how much the stock (or future, or index) changes from day

to day. This might mean we measure movement from opening on

Monday to opening on Tuesday, but the standard approach is to care

about closing prices on subsequent trading days.

We typically don't consider percent returns, but rather log returns : if

F t denotes a close on day t , then the log return that day is defined as

log F t / F t −1 , whereas the percent return would be computed as

100 F t / F t −1 −1 . To simplify the discussion, we'll compare log re‐

turns to scaled percent returns , which is the same as percent returns

except without the factor of 100. The reasoning is not changed by this

difference in scalar.

There are a few different reasons we use log returns instead of per‐

centage returns. For example, log returns are additive but scaled per‐

cent returns aren't. In other words, the five-day log return is the sum

of the five one-day log returns. This is often computationally handy.

By the same token, log returns are symmetric with respect to gains and

losses, whereas percent returns are biased in favor of gains. So, for

example, if our stock goes down by 50%, or has a -0.5 scaled percent

gain, and then goes up by 200%, so has a 2.0 scaled percent gain, we

are where we started. But working in the same scenarios with log

Search WWH ::

Custom Search

Home