vector dimension that is significantly smaller than the raw dimensionality of our dataset, we bound the memory usage of our model in both training and production; hence, memory usage does not scale with the size and dimensionality of our data.
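A minimal sketch in plain Python shows the idea (this is an illustration of the general hashing trick, not this book's implementation; the hash_features helper and the example feature strings are made up for the sketch):

import hashlib

def hash_features(features, num_buckets=2**10):
    """Map raw feature strings into a fixed-size count vector."""
    vector = [0.0] * num_buckets
    for feature in features:
        # A stable hash (unlike Python's per-process salted built-in
        # hash) keeps indices consistent between training and production.
        digest = hashlib.md5(feature.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % num_buckets
        vector[index] += 1.0
    return vector

# The vector length is num_buckets, regardless of how many distinct
# features appear in the data.
print(len(hash_features(["user=123", "domain=example.org", "hour=17"])))

No feature dictionary is ever built, which is exactly why memory stays bounded: the only storage cost is the num_buckets-sized vector itself.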
However, this approach has two important drawbacks:
• As we don't create an explicit mapping from features to index values, we also cannot perform the reverse mapping from a feature index back to its value. This makes it harder to, for example, determine which features are most informative in our models.
• As we restrict the size of our feature vectors, we might experience hash collisions. These occur when two different features are hashed to the same index in the feature vector. Surprisingly, collisions don't seem to have a severe impact on model performance, as long as we choose a feature vector dimension that is reasonable relative to the dimensionality of the input data (see the sketch after this list).
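To make the collision trade-off concrete, the toy check below counts how many features of a hypothetical 1,000-feature vocabulary collide at several vector dimensions (the vocabulary and the bucket helper are made up for this illustration):

import hashlib

def bucket(feature, num_buckets):
    """Deterministic bucket index for a feature string."""
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

# Hash the same vocabulary into vectors of increasing size and count
# how many features share a bucket with another feature.
vocabulary = ["feature_%d" % i for i in range(1000)]
for num_buckets in (2**8, 2**12, 2**16, 2**20):
    used = {bucket(f, num_buckets) for f in vocabulary}
    print(num_buckets, "buckets:", len(vocabulary) - len(used), "collisions")

Running this shows that collisions dominate at small dimensions and become rare once the dimension is large relative to the vocabulary size, which is the intuition behind choosing a "reasonable" feature vector dimension above.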
Note
Further information on hashing can be found at http://en.wikipedia.org/wiki/Hash_function.
A key paper that introduced the use of hashing for feature extraction and machine learning is:
Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, and Josh Attenberg. Feature Hashing for Large Scale Multitask Learning. Proc. ICML 2009, available at http://alex.smola.org/papers/2009/Weinbergeretal09.pdf.