/PATH/lfw/Aaron_Eckhart/Aaron_Eckhart_0001.jpg
Next, we will see how many files we are dealing with:
println(files.count)
Running this command produces a lot of noisy output in the Spark shell, because every file
path that is read is printed to the console. You can ignore this part; once the command has
completed, the end of the output should look something like this:
..., /PATH/lfw/Azra_Akin/Azra_Akin_0003.jpg:0+19927, /PATH/lfw/Azra_Akin/Azra_Akin_0004.jpg:0+16030
...
14/09/18 20:36:25 INFO SparkContext: Job finished: count at <console>:19, took 1.151955 s
1055
So, we can see that we have 1055 images to work with.
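As a sanity check that does not depend on Spark, the same count can be approximated with a few lines of plain Python. This is only a minimal sketch: the helper name count_images is hypothetical, /PATH/lfw is the placeholder used in the text, and the result will match the Spark count only if the directory contains exactly the files that Spark read.

```python
from pathlib import Path


def count_images(root, suffix=".jpg"):
    """Count the files under `root` (searched recursively) that end in `suffix`.

    The comparison is case-insensitive, so "x.JPG" is counted as well.
    """
    root = Path(root)
    return sum(
        1
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == suffix
    )


# Hypothetical usage; substitute the real directory for the placeholder:
#   print(count_images("/PATH/lfw"))
```

If the two numbers disagree, the most likely explanation is that the path pattern given to Spark selected a different set of files than the directory walk does.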
Visualizing the face data
Although there are a few tools available in Scala or Java to display images, this is one
area where Python and the matplotlib library shine. We will use Scala to process and
extract the images and to run our models, and IPython to display the actual images.
You can run a separate IPython Notebook by opening a new terminal window and
launching a new notebook:
>ipython notebook
Note
If you are using the IPython Notebook, you should first execute the following code snippet
(including the % character) to ensure that the images are displayed inline after each
notebook cell: %pylab inline
Alternatively, you can launch a plain IPython console without the web notebook, enabling
the pylab plotting functionality using the following command:
>ipython --pylab
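Once either environment is running, displaying a single face might look like the following minimal sketch. It assumes that matplotlib is installed (together with Pillow, which matplotlib uses to decode JPEG files). The helper name show_image is hypothetical, and the path in the usage comment is the placeholder shown earlier, not a real location.

```python
def show_image(path, ax=None):
    """Draw the image stored at `path` on a matplotlib Axes and return that Axes.

    matplotlib is imported inside the function so that merely defining the
    helper does not require a plotting backend to be available.
    """
    import matplotlib.image as mpimg
    import matplotlib.pyplot as plt

    img = mpimg.imread(path)   # returns the pixels as a NumPy array
    if ax is None:
        _, ax = plt.subplots()
    ax.imshow(img)             # render the pixel array
    ax.axis("off")             # hide the axis ticks around the picture
    return ax


# Hypothetical usage; substitute the real directory for /PATH:
#   show_image("/PATH/lfw/Aaron_Eckhart/Aaron_Eckhart_0001.jpg")
```

In pylab mode the imread and imshow functions are already available in the interactive namespace, so the explicit imports matter only when the helper is used from a plain Python script.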