time_of_day = hour_of_day.map(lambda hr: assign_tod(hr))
time_of_day.take(5)
If we again take the first five records of this new RDD, we will see the following
transformed values:
['afternoon', 'evening', 'morning', 'morning', 'morning']
We have now transformed the timestamp variable (which can take on thousands of values
and is probably not useful to a model in its raw form) into hours (taking on 24 values) and
then into a time of day (taking on five possible values). Now that we have a categorical
feature, we can use the same 1-of-k encoding method outlined earlier to generate a binary
feature vector.
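As a minimal sketch of that last step, the following plain-Python code maps each time-of-day label to a binary vector. The category names 'lunch' and 'night' and the ordering of the categories are assumptions made for illustration (only 'morning', 'afternoon', and 'evening' appear in the output above), and encode_tod is a hypothetical helper name, not a function defined earlier.

```python
# Sketch: 1-of-k encoding of time-of-day labels in plain Python.
# ASSUMPTION: the five categories and their order are illustrative only;
# in practice, derive them from the distinct values in the data.
categories = ['morning', 'lunch', 'afternoon', 'evening', 'night']

# Assign each category a fixed position in the feature vector.
tod_index = {name: i for i, name in enumerate(categories)}

def encode_tod(label):
    """Return a binary vector with a single 1 at the label's index."""
    vec = [0] * len(tod_index)
    vec[tod_index[label]] = 1
    return vec

# The five values shown earlier in this section.
sample = ['afternoon', 'evening', 'morning', 'morning', 'morning']
encoded = [encode_tod(t) for t in sample]
```

Under these assumptions, the same function could be applied to the RDD with time_of_day.map(encode_tod), since map accepts any Python function of one argument.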