Next, we will see how many files we are dealing with:
Running these commands produces a lot of noisy output in the Spark shell, because every
file path that is read is printed to the console. You can ignore this part; once the command
has completed, the output should look something like this:
..., /PATH/lfw/Azra_Akin/Azra_Akin_0003.jpg:0+19927, /PATH/
14/09/18 20:36:25 INFO SparkContext: Job finished: count at
<console>:19, took 1.151955 s
So, we can see that we have 1055 images to work with.
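As a cross-check outside Spark, the same count can be obtained by walking the image directory directly. The following is a minimal Python sketch, not part of the original walkthrough: it assumes the images are on the local filesystem, and the helper name count_images is hypothetical. The result matches the Spark count only if both look at the same set of files.

```python
from pathlib import Path


def count_images(root, pattern="*.jpg"):
    """Count regular files matching `pattern` anywhere under `root`."""
    # rglob searches recursively, so images nested in per-person
    # subdirectories (for example .../Azra_Akin/Azra_Akin_0003.jpg) are included.
    return sum(1 for p in Path(root).rglob(pattern) if p.is_file())
```

For example, count_images("/PATH/lfw") would count every .jpg file under that directory, where /PATH stands for wherever the dataset was extracted.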
Visualizing the face data
Although there are a few tools available in Scala or Java to display images, this is one
area where Python and the matplotlib library shine. We will use Scala to process and
extract the images and run our models, and IPython to display the actual images.
You can run a separate IPython Notebook by opening a new terminal window and
launching a new notebook:
>ipython notebook
Note that if you are using the IPython Notebook, you should first execute the following code
snippet (including the % character) to ensure that images are displayed inline after each
notebook cell: %pylab inline
Alternatively, you can launch a plain IPython console without the web notebook, enabling
the pylab plotting functionality using the following command:
>ipython --pylab
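Once either environment is running, displaying a single image takes only a few matplotlib calls. The sketch below is illustrative rather than the book's own code: the function names show_image and title_from_path are hypothetical, and it assumes that each image sits in a directory named after the person, as in the path shown in the earlier output. The pylab mode already exposes imread and imshow directly; the explicit import is used here so that the code also works in a plain Python session.

```python
from pathlib import Path


def title_from_path(path):
    """Build a caption from the parent directory name.

    Assumes the directory is named after the person, with underscores
    in place of spaces (for example .../Azra_Akin/Azra_Akin_0003.jpg).
    """
    return Path(path).parent.name.replace("_", " ")


def show_image(path):
    """Read one image file, display it with a caption, and return its shape."""
    # Imported here so that title_from_path can be used even where
    # matplotlib is not installed.
    import matplotlib.pyplot as plt

    img = plt.imread(path)  # reading JPEG files requires the Pillow package
    plt.imshow(img)
    plt.title(title_from_path(path))
    plt.axis("off")
    plt.show()
    return img.shape
```

Calling show_image on the full path of one of the image files should render that image, inline in the notebook or in a separate window from the console.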