Azure platform. To download the required files, visit the Apache project site
at http://mahout.apache.org/. As of this writing, the currently supported
version is 0.7. After you download the mahout-distribution-0.7.zip
file, extract its contents using your preferred compression utility.
NOTE
If you are running the Hortonworks Data Platform on premises instead
of using HDInsight on Windows Azure, you will find that Mahout is
included and ready to use. No further action is required.
In the decompressed folder, you'll find the mahout-core-0.7-job.jar
and mahout-examples-0.7-job.jar files; the latter contains a number of
prebuilt samples. The mahout-core-0.7-job.jar file contains
prebuilt Hadoop jobs for each of the use cases previously mentioned. These
jobs require no coding; they generate and run the MapReduce jobs needed
to implement machine learning algorithms in a distributed
environment. To use this jar file, you must upload it either directly to your
cluster using a Remote Desktop connection or to the Azure Blob Storage
account connected to your Azure HDInsight cluster.
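If you would rather script the extraction step than use a compression utility, the following is a minimal sketch that uses only the Python standard library. It pulls every file whose name ends in -job.jar out of the downloaded archive so that the files are ready to upload. The function name extract_job_jars and the destination folder are hypothetical choices made for illustration. The sketch does not assume any particular folder layout inside the archive, and it does not perform the upload itself.

```python
import shutil
import zipfile
from pathlib import Path


def extract_job_jars(archive_path, dest_dir):
    """Extract every '*-job.jar' entry from a zip archive into dest_dir.

    Hypothetical helper for illustration only. Returns the extracted
    file paths, sorted by name.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            name = Path(info.filename).name
            # Skip folders and anything that is not a prebuilt job jar.
            if info.is_dir() or not name.endswith("-job.jar"):
                continue
            # Write the jar directly under dest_dir, dropping any folder prefix.
            target = dest / name
            with archive.open(info) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            extracted.append(target)
    return sorted(extracted)


# Example call (assumes the archive is in the current working directory):
# jars = extract_job_jars("mahout-distribution-0.7.zip", "mahout-jars")
```

The returned paths identify the jar files to copy to the cluster or to the connected storage account by whichever upload method you prefer.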
NOTE
This section, and the remainder of this chapter, assumes that you have
an HDInsight on Windows Azure cluster set up. If you do not have a
cluster set up and configured, see Chapter 3, “Configuring Your First
Big Data Environment,” for further information.
Building a Recommendation Engine
What caused you to pick up this topic? What about the last movie you
saw or maybe even the last item of clothing you purchased? Every decision
that anyone makes is inevitably based on some preconceived (and often
unconscious) opinion. Every day, our opinions develop and become part of
the ever-larger library of unconscious factors from which we “borrow” as we
face decisions.