Calibrating Metrics for the Recommender
A good next step is to use an analytics tool such as R to analyze and visualize the
tree and road data. This step calibrates and tests the data products built so far.
See the src/scripts/copa.R file, an R script that analyzes the tree and road data.
For example, Figure 8-4 shows a chart for the distribution of tree species in Palo Alto.
American sweetgum (Liquidambar styraciflua) is the most common tree.
Figure 8-4. Summary analysis for tree data
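A species distribution like the one summarized in Figure 8-4 could be computed along the following lines. This is a minimal sketch, not the contents of copa.R: the column name `species` and the handful of rows are made up for illustration, since the actual layout of the tree output is not shown here.

```r
library(ggplot2)

# Synthetic stand-in for the tree data. `species` is a hypothetical column
# name and these six rows are invented purely for illustration.
trees <- data.frame(species = c("Liquidambar styraciflua",
                                "Quercus agrifolia",
                                "Liquidambar styraciflua",
                                "Sequoia sempervirens",
                                "Liquidambar styraciflua",
                                "Quercus agrifolia"))

# Count trees per species and sort from most to least common.
counts <- as.data.frame(table(species = trees$species), responseName = "n")
counts <- counts[order(-counts$n), ]
most_common <- as.character(counts$species[1])

# Horizontal bar chart of the counts, with the most common species on top.
p <- ggplot(counts, aes(x = reorder(species, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(x = "species", y = "number of trees")
print(p)
```

Sorting the counts before plotting keeps the most common species easy to spot, which is the comparison the figure is meant to support.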
Also, there's a density plot/bar chart of estimated tree heights, most of which are in the
10- to 30-meter range. Palo Alto is known for many tall eucalyptus and sequoia trees
(the city name translates to “Tall Stick”), and these show up on the right side of the
density plot, which is great for lots of shade. Overall, the trees span a wide range
of estimated heights, which helps confirm that our height approximation is reasonable to use.
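library(ggplot2)

# folder holding the output of the workflow
dat_folder <- "~/src/concur/CoPA/out"

# load the tab-separated tree data; the literal string NULL marks missing values
d <- read.table(file = paste(dat_folder, "tree/part-00000", sep = "/"),
                sep = "\t", quote = "", na.strings = "NULL", header = FALSE,
                encoding = "UTF-8")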
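To illustrate the height density plot described above, here is a small self-contained sketch using ggplot2. It does not reproduce copa.R: the column name `tree_height` and the synthetic values are assumptions, chosen only so that most heights fall in the 10- to 30-meter band with a thinner tail of taller trees.

```r
library(ggplot2)

# Synthetic stand-in for the tree data. `tree_height` is a hypothetical
# column name and the values are generated, not taken from the real output.
set.seed(42)
trees <- data.frame(tree_height = c(runif(200, min = 10, max = 30),
                                    runif(20, min = 30, max = 60)))

# Share of trees whose estimated height falls in the 10-30 m band.
in_band <- mean(trees$tree_height >= 10 & trees$tree_height <= 30)

# Density plot of estimated heights; the taller trees form the right tail.
p <- ggplot(trees, aes(x = tree_height)) +
  geom_density() +
  labs(x = "estimated height (m)", y = "density")
print(p)
```

Checking a summary number such as `in_band` alongside the plot is one way to sanity-check that the estimated heights look plausible before using them downstream.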
 