We can compare these predicted labels with the labels in our test data set. Let's count
the number of misclassifications:
$ paste -d, <(csvcut -c type data/wine-test.csv) \
> <(csvcut -c type output/predictions.csv) |
> awk -F, '{ if ($1 != $2) {sum+=1 } } END { print sum }'
766
Combine the type columns of both data/wine-test.csv and output/predictions.csv.
Keep count of when the two columns differ in value using awk.
As you can see, BigML's API misclassified 766 wines out of 1,599. This isn't a good
result, but note that we blindly applied an algorithm to a data set, which we normally
wouldn't do. We could most probably achieve much better results if we spent more
time tuning the features.
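A raw count is easier to judge as a proportion. The following is a minimal sketch, not part of the original pipeline, that reports the error rate as well as the count. The function name count_mismatches and the two small sample files are hypothetical and exist only for illustration. The sketch assumes each input file holds a single label column with a header row, which is the shape csvcut -c type produces.

```shell
# Hypothetical helper: compare two single-column CSV files line by line
# and report how many labels disagree, plus the error rate.
count_mismatches() {
  paste -d, "$1" "$2" |
    awk -F, '
      NR > 1 { total++; if ($1 != $2) wrong++ }   # skip the header row
      END {
        if (total == 0) { print "no rows"; exit 1 }
        printf "%d of %d misclassified (%.1f%%)\n", wrong, total, 100 * wrong / total
      }'
}

# Made-up sample data standing in for the real label columns.
tmp=$(mktemp -d)
printf 'type\nred\nwhite\nred\nwhite\n' > "$tmp/actual.csv"
printf 'type\nred\nred\nred\nwhite\n'   > "$tmp/predicted.csv"

count_mismatches "$tmp/actual.csv" "$tmp/predicted.csv"
```

With the real files, the two arguments could be the same process substitutions used above, for example <(csvcut -c type data/wine-test.csv), in a shell that supports them.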
Conclusion
BigML's prediction API has proven to be very easy to use. As with many of the
command-line tools discussed in this topic, we've barely scratched the surface with
BigML. You should also be aware of these additional features:
• BigML's command-line tool bigmler also allows for local computations, which is
useful for debugging
• Results can also be inspected using BigML's web interface
• BigML can also perform regression tasks
For a complete overview of BigML's features, check out the developer page.
Although we've only been able to experiment with one prediction API, we believe
that prediction APIs in general are worth considering for doing data science.
Further Reading
• Conway, D., & White, J. M. (2012). Machine Learning for Hackers. O'Reilly Media.
• Lisitsyn, S., Widmer, C., & Garcia, F. J. I. (2013). Tapkee: An Efficient Dimension
Reduction Library. Journal of Machine Learning Research, 14, 2355-2359.
• Cortez, P., Cerdeira, A., Almeida, F., Matos, T., & Reis, J. (2009). Modeling Wine
Preferences by Data Mining from Physicochemical Properties. Decision Support
Systems, 47(4), 547-553.
• Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., & Witten, I. H.
(2009). The WEKA Data Mining Software: An Update. SIGKDD Explorations, 11(1).