then the mean or the variance of that transformed data). This ends up
being a submodel of our model.
Transforming Your Data
Outside the context of financial data, preparing and transforming
data is also a big part of the process. You can choose from a number
of techniques to transform your data so that it “behaves” better:
• Normalize the data by subtracting the mean and dividing by the
standard deviation.
• Alternatively, normalize or scale by dividing by the maximum
value.
• Take the log of the data.
• Bucket into five evenly spaced buckets, or five evenly distributed
buckets (or a number other than five), and create a categorical
variable from that.
• Choose a meaningful threshold and transform the data into a
new binary variable with value 1, if a data point is greater than
or equal to the threshold, and 0 if less than the threshold.
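The transformations listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names and the toy `data` list are made up for the example, "evenly distributed" buckets are read here as buckets holding roughly equal numbers of points, and the standard deviation used is the population one.

```python
import math
import statistics

def zscore(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def scale_by_max(values):
    """Divide every value by the maximum value."""
    peak = max(values)
    return [v / peak for v in values]

def log_transform(values):
    """Take the natural log; only defined for strictly positive values."""
    return [math.log(v) for v in values]

def equal_width_buckets(values, n_buckets=5):
    """Label each value with one of n_buckets evenly spaced intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_buckets  # assumes the values are not all equal
    # The maximum would otherwise fall just past the last bucket, so clamp it.
    return [min(int((v - lo) / width), n_buckets - 1) for v in values]

def equal_count_buckets(values, n_buckets=5):
    """Label each value by rank so buckets hold roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_buckets // len(values)
    return labels

def binarize(values, threshold):
    """Return 1 where value >= threshold, and 0 where it is less."""
    return [1 if v >= threshold else 0 for v in values]

# Hypothetical toy data, chosen to be skewed so the two bucketings differ.
data = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0, 512.0]
print(equal_width_buckets(data))
print(equal_count_buckets(data))
print(binarize(data, 16.0))
```

On skewed data like this, evenly spaced buckets put most points in the first bucket, while rank-based buckets spread them out, which is one reason the choice depends on the situation.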
Once we have estimates of our mean $\bar{y}$ and variance $\sigma_y^2$,
we can normalize the next data point with these estimates, just like we
do to get from a Gaussian or normal distribution to the standard normal
distribution with mean = 0 and standard deviation = 1:
$$y \mapsto \frac{y - \bar{y}}{\sigma_y}$$
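As a rough sketch of normalizing the next data point with estimates built only from earlier points, the class below keeps a running mean and variance. The update rule is Welford's online algorithm, which is a choice made for this example rather than something the text prescribes; the class name, the `history` values, and `next_point` are likewise hypothetical.

```python
import math

class RunningNormalizer:
    """Tracks mean and variance of the points seen so far (Welford's update)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, y):
        """Fold one new observation into the running estimates."""
        self.count += 1
        delta = y - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (y - self.mean)

    @property
    def std(self):
        # Population standard deviation; needs at least one observation.
        return math.sqrt(self._m2 / self.count)

    def normalize(self, y):
        """Map y to (y - mean) / std using the current estimates."""
        return (y - self.mean) / self.std

# Hypothetical history: mean 5 and population standard deviation 2.
history = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
norm = RunningNormalizer()
for y in history:
    norm.update(y)

next_point = 11.0
z = norm.normalize(next_point)  # uses only past data, then we may update
print(z)
norm.update(next_point)
```

Normalizing before calling `update` keeps the new point out of its own estimates, so nothing from the point being scored leaks into the scaling.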
Of course, we may have other things to keep track of as well to prepare
our data, and we might run other submodels of our model. For example,
we may choose to consider only the “new” part of something, which is
equivalent to trying to predict something like $y_t - y_{t-1}$ instead
of $y_t$. Or, we may train a submodel to figure out what part of
$y_{t-1}$ predicts $y_t$, such as a univariate regression.
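Both ideas can be illustrated with a short sketch: taking first differences to isolate the “new” part, and fitting a univariate least-squares regression of each point on the previous one. The function names and the toy `series` are assumptions for illustration, and ordinary least squares with an intercept is just one possible form for such a submodel.

```python
def differences(series):
    """Return the 'new' part of each point: y_t - y_{t-1}."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

def fit_lagged_regression(series):
    """Least-squares fit of y_t = a + b * y_{t-1}; returns (a, b)."""
    x = series[:-1]  # y_{t-1}
    y = series[1:]   # y_t
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b

# Hypothetical series built so that y_t = 1 + 2 * y_{t-1} exactly.
series = [1.0, 3.0, 7.0, 15.0, 31.0]

diffs = differences(series)
a, b = fit_lagged_regression(series)

# What the previous value does not explain is left over for the main model.
residuals = [yt - (a + b * prev) for prev, yt in zip(series, series[1:])]
print(diffs, a, b, residuals)
```

Differencing is the special case where the coefficient on the previous value is fixed at 1 with no intercept, whereas the regression lets the data decide how much of the previous value carries over.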
There are lots of choices here, which will always depend on the
situation and the goal you happen to have. Keep in mind, though, that it's
 