in two columns: eruptions, which is the length of each eruption, and waiting, which is the
length of time to the next eruption. We'll only use the waiting column in this example:
faithful
 eruptions waiting
     3.600      79
     1.800      54
     3.333      74
 ...
The second method mentioned earlier uses geom_line() and tells it to use the "density"
statistical transformation. This is essentially the same as the first method, using
geom_density(), except that geom_density() draws the curve as a closed polygon.
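As a rough illustration of the difference between the two methods, the sketch below builds both plots from the same data. The object names p_line and p_poly and the fill settings are chosen here for illustration only; the fill is added so that the closed polygon is easy to see.

```r
library(ggplot2)

# Second method: an open line computed with the "density" stat
p_line <- ggplot(faithful, aes(x = waiting)) +
    geom_line(stat = "density")

# First method: geom_density() draws a closed polygon, so it can be filled
p_poly <- ggplot(faithful, aes(x = waiting)) +
    geom_density(fill = "blue", alpha = .2)

# Print either object to draw it
p_line
p_poly
```

Both layers should trace the same estimated curve; only the way the curve is drawn differs.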
As with geom_histogram(), if you just want to get a quick look at data that isn't in a data frame,
you can get the same result by passing in NULL for the data frame and giving ggplot() a vector
of values. This would have the same result as the first solution:
# Store the values in a simple vector
w <- faithful$waiting
ggplot(NULL, aes(x = w)) + geom_density()
A kernel density curve is an estimate of the population distribution, based on the sample data.
The amount of smoothing depends on the kernel bandwidth: the larger the bandwidth, the more
smoothing there is. The bandwidth can be set with the adjust parameter, which has a default
value of 1. Figure 6-8 shows what happens with a smaller and larger value of adjust:
ggplot(faithful, aes(x = waiting)) +
    geom_line(stat = "density", adjust = .25, colour = "red") +
    geom_line(stat = "density") +
    geom_line(stat = "density", adjust = 2, colour = "blue")
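To see what adjust does numerically, here is a minimal sketch using base R's density() function, on the assumption that it behaves like the estimator behind the "density" stat. The names d_default, d_narrow, and d_wide are made up for this example.

```r
# Store the values in a simple vector
w <- faithful$waiting

d_default <- density(w)                # adjust = 1, the default
d_narrow  <- density(w, adjust = .25)  # smaller bandwidth, less smoothing
d_wide    <- density(w, adjust = 2)    # larger bandwidth, more smoothing

# adjust acts as a multiplier on the default bandwidth
c(narrow = d_narrow$bw, default = d_default$bw, wide = d_wide$bw)
```

The bandwidth reported for adjust = 2 is twice the default one, and the bandwidth for adjust = .25 is a quarter of it.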