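  // the label is the last field of the record
  val label = trimmed(r.size - 1).toInt
  // look up the index of this record's category (the fourth field)
  val categoryIdx = categories(r(3))
  // 1-of-k encoding: all zeros except for a 1.0 at the category index
  val categoryFeatures = Array.ofDim[Double](numCategories)
  categoryFeatures(categoryIdx) = 1.0
  // the remaining numeric fields, with missing values ("?") replaced by 0.0
  val otherFeatures = trimmed.slice(4, r.size - 1).map(d =>
    if (d == "?") 0.0 else d.toDouble)
  val features = categoryFeatures ++ otherFeatures
  LabeledPoint(label, Vectors.dense(features))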
You should see output similar to what is shown here. The first part of our feature vector
is now a vector of length 14 with one nonzero entry at the relevant category index.
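The encoding step can also be tried without a Spark cluster. The following is a minimal plain-Scala sketch of the same 1-of-k idea, not the book's code: the object name OneHotSketch and its helper methods are hypothetical, and it assumes the category index is built by sorting the distinct category values.

```scala
// Hypothetical helper object for illustration; none of these names come from the text.
object OneHotSketch {
  // Map each distinct category value to a position (sorted so the result is deterministic).
  def buildIndex(values: Seq[String]): Map[String, Int] =
    values.distinct.sorted.zipWithIndex.toMap

  // 1-of-k encode one category: a zero vector with a single 1.0 at the category's index.
  // Throws NoSuchElementException if the category is not in the index.
  def oneHot(category: String, index: Map[String, Int]): Array[Double] = {
    val encoded = Array.ofDim[Double](index.size)
    encoded(index(category)) = 1.0
    encoded
  }

  // Parse the remaining numeric fields, replacing the missing-value marker "?" with 0.0.
  def parseNumeric(fields: Seq[String]): Array[Double] =
    fields.map(d => if (d == "?") 0.0 else d.toDouble).toArray
}
```

Concatenating the result of oneHot with the result of parseNumeric gives a feature array with the same layout as the one built above: the category block first, followed by the numeric features.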
Again, since our raw features are not standardized, we should perform this transformation
using the same StandardScaler approach that we used earlier before training a new
model on this expanded dataset:
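// dataCategories is assumed here to be the name of the RDD of LabeledPoint records built above
val scalerCats = new StandardScaler(withMean = true,
  withStd = true).fit(dataCategories.map(lp => lp.features))
val scaledDataCats = dataCategories.map(lp =>
  LabeledPoint(lp.label, scalerCats.transform(lp.features)))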
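To make the effect of standardization concrete, here is a small plain-Scala sketch of the arithmetic performed when both the mean and the standard deviation are used: each column has its mean subtracted and is then divided by its standard deviation. This is an illustration under stated assumptions, not Spark's implementation. The object name StandardizeSketch is hypothetical, the sample (n - 1) standard deviation is assumed, and mapping zero-variance columns to 0.0 is a convention chosen for this sketch.

```scala
// Hypothetical helper object for illustration; not part of Spark or the text.
object StandardizeSketch {
  // Standardize each column to zero mean and unit (sample) standard deviation.
  def standardize(rows: Seq[Array[Double]]): Seq[Array[Double]] = {
    require(rows.size > 1, "need at least two rows to estimate a standard deviation")
    val n = rows.size
    val dims = rows.head.length
    // Per-column mean.
    val means = Array.tabulate(dims)(j => rows.map(_(j)).sum / n)
    // Per-column sample standard deviation (divides by n - 1; an assumption of this sketch).
    val stds = Array.tabulate(dims) { j =>
      math.sqrt(rows.map(r => math.pow(r(j) - means(j), 2)).sum / (n - 1))
    }
    // Subtract the mean and divide by the standard deviation;
    // constant columns are set to 0.0 to avoid dividing by zero.
    rows.map { r =>
      Array.tabulate(dims) { j =>
        if (stds(j) == 0.0) 0.0 else (r(j) - means(j)) / stds(j)
      }
    }
  }
}
```

Note that a 1-of-k encoded column is mostly zeros, so after mean subtraction every entry in it becomes nonzero; the scaled feature vectors are therefore dense rather than sparse.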
We can inspect the features before and after scaling as we did earlier:
The output is as follows:
The following code will print the features after scaling: