feature vector, which simply converts all the values to floats and wraps them in a numpy
array:
def extract_features_dt(record):
    # Columns 2 to 13 hold the raw variables; convert each one to a float.
    return np.array([float(v) for v in record[2:14]])

data_dt = records.map(lambda r:
    LabeledPoint(extract_label(r), extract_features_dt(r)))
first_point_dt = data_dt.first()
print("Decision Tree feature vector: " +
      str(first_point_dt.features))
print("Decision Tree feature vector length: " +
      str(len(first_point_dt.features)))
The following output shows the extracted feature vector, and we can see that we have a
vector length of 12, which matches the number of raw variables we are using:
Decision Tree feature vector:
[1.0,0.0,1.0,0.0,0.0,6.0,0.0,1.0,0.24,0.2879,0.81,0.0]
Decision Tree feature vector length: 12
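To see why the slice 2:14 yields exactly 12 features, the same conversion can be tried on a single record outside Spark. The following is only a minimal sketch: the record is hypothetical (positions 2 to 13 reuse the values from the output above, and the other fields are placeholders), extract_features_plain is an illustrative name, and a plain Python list stands in for the numpy array.

```python
# Hypothetical record: indices 2..13 reuse the values printed above;
# the remaining fields are placeholders, not real data.
record = ["id", "date", "1", "0", "1", "0", "0", "6", "0", "1",
          "0.24", "0.2879", "0.81", "0", "target"]

def extract_features_plain(fields):
    """Convert the fields at indices 2..13 (the slice 2:14) to floats."""
    return [float(v) for v in fields[2:14]]

features = extract_features_plain(record)
print(features)
print(len(features))
```

The slice end is exclusive, so 2:14 covers indices 2 through 13, which is 14 - 2 = 12 values.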