denseVec1 = array([1.0, 2.0, 3.0])  # NumPy arrays can be passed directly to MLlib
denseVec2 = Vectors.dense([1.0, 2.0, 3.0])  # .. or you can use the Vectors class
# Create the sparse vector <1.0, 0.0, 2.0, 0.0>; the methods for this take only
# the size of the vector (4) and the positions and values of nonzero entries.
# These can be passed as a dictionary or as two lists of indices and values.
sparseVec1 = Vectors.sparse(4, {0: 1.0, 2: 2.0})
sparseVec2 = Vectors.sparse(4, [0, 2], [1.0, 2.0])
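To make the sparse representation concrete, here is a minimal pure-Python sketch of what the (size, indices, values) triple encodes. The helpers sparse_to_dense and dict_to_lists are hypothetical names used only for illustration; they are not part of MLlib and do not reflect how MLlib stores vectors internally.

```python
def sparse_to_dense(size, indices, values):
    """Expand a (size, indices, values) description into a list with explicit zeros.

    Hypothetical illustration helper, not an MLlib function.
    """
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    dense = [0.0] * size
    for i, v in zip(indices, values):
        if not 0 <= i < size:
            raise IndexError("index %d is out of range for size %d" % (i, size))
        dense[i] = v
    return dense


def dict_to_lists(entries):
    """Turn the dictionary form {index: value} into parallel, index-sorted lists."""
    indices = sorted(entries)
    return indices, [entries[i] for i in indices]


# Both ways of describing the sparse vector <1.0, 0.0, 2.0, 0.0>
from_lists = sparse_to_dense(4, [0, 2], [1.0, 2.0])
idx, vals = dict_to_lists({0: 1.0, 2: 2.0})
from_dict = sparse_to_dense(4, idx, vals)
```

Both calls describe the same four-element vector, which is why the dictionary form and the two-list form are interchangeable when constructing a sparse vector.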
Example 11-5. Creating vectors in Scala
import org.apache.spark.mllib.linalg.Vectors
// Create the dense vector <1.0, 2.0, 3.0>; Vectors.dense takes values or an array
val denseVec1 = Vectors.dense(1.0, 2.0, 3.0)
val denseVec2 = Vectors.dense(Array(1.0, 2.0, 3.0))
// Create the sparse vector <1.0, 0.0, 2.0, 0.0>; Vectors.sparse takes the size of
// the vector (here 4) and the positions and values of nonzero entries
val sparseVec1 = Vectors.sparse(4, Array(0, 2), Array(1.0, 2.0))
Example 11-6. Creating vectors in Java
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;
// Create the dense vector <1.0, 2.0, 3.0>; Vectors.dense takes values or an array
Vector denseVec1 = Vectors.dense(1.0, 2.0, 3.0);
Vector denseVec2 = Vectors.dense(new double[] {1.0, 2.0, 3.0});
// Create the sparse vector <1.0, 0.0, 2.0, 0.0>; Vectors.sparse takes the size of
// the vector (here 4) and the positions and values of nonzero entries
Vector sparseVec1 = Vectors.sparse(4, new int[] {0, 2}, new double[] {1.0, 2.0});
Finally, in Java and Scala, MLlib's Vector classes are meant primarily for data
representation and do not provide arithmetic operations such as addition and
subtraction in the user API. (In Python, you can of course use NumPy to perform math
on dense vectors and pass those to MLlib.) This was done mainly to keep MLlib small,
because writing a complete linear algebra library is beyond the scope of the project.
If you want to do vector math in your programs, you can use a third-party library
such as Breeze in Scala or MTJ in Java, and then convert the results to MLlib vectors.
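As a minimal sketch of the Python route mentioned above, the arithmetic can be done on NumPy arrays and the result handed to MLlib afterward. The variable names are illustrative, and the PySpark import is wrapped in a try block (an assumption made so the snippet also runs where PySpark is not installed).

```python
import numpy as np

# Dense vectors as NumPy arrays; the arithmetic happens in NumPy, not in MLlib
a = np.array([1.0, 2.0, 3.0])
b = np.array([0.5, 0.5, 0.5])

total = a + b          # element-wise addition
diff = a - b           # element-wise subtraction
scaled = 2.0 * a       # scalar multiplication
dot = float(a.dot(b))  # dot product

# Hand the result to MLlib. NumPy arrays can be passed directly, or wrapped
# explicitly with Vectors.dense. Guarded so the sketch runs without PySpark.
try:
    from pyspark.mllib.linalg import Vectors
    mllib_vec = Vectors.dense(total)
except ImportError:
    mllib_vec = None
```

The same pattern applies in Scala or Java with a third-party library: compute with that library's vector type, then build the MLlib vector from the resulting values.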
Algorithms
In this section, we'll cover the key algorithms available in MLlib, as well as their input
and output types. We do not have space to explain each algorithm mathematically,
but focus instead on how to call and configure these algorithms.