in two columns: eruptions, which is the length of each eruption, and waiting, which is the length of time to the next eruption. We'll only use the waiting column in this example:
faithful

 eruptions waiting
     3.600      79
     1.800      54
     3.333      74
 ...
The second method mentioned earlier uses geom_line() and tells it to use the "density" statistical transformation. This is essentially the same as the first method, using geom_density(), except that geom_density() draws the curve as a closed polygon.
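To make the difference between the two geoms concrete, here is a minimal sketch that builds both plots from the faithful data set. It assumes the ggplot2 package is installed. The names p_polygon, p_line, d_polygon, and d_line are illustrative only, and the fill and alpha settings are added just to make the closed polygon visible.

```r
library(ggplot2)

# Method 1: geom_density() draws the curve as a closed polygon,
# so it can take a fill colour
p_polygon <- ggplot(faithful, aes(x = waiting)) +
  geom_density(fill = "blue", alpha = .2)

# Method 2: geom_line() with the "density" stat draws only the curve
p_line <- ggplot(faithful, aes(x = waiting)) +
  geom_line(stat = "density")

# Both layers compute the same density estimate;
# layer_data() returns the values each layer will draw
d_polygon <- layer_data(p_polygon)
d_line <- layer_data(p_line)
```

Printing p_polygon or p_line at the console renders the corresponding plot.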
As with geom_histogram(), if you just want to get a quick look at data that isn't in a data frame, you can get the same result by passing in NULL for the data frame and giving ggplot() a vector of values. This would have the same result as the first solution:
# Store the values in a simple vector
w <- faithful$waiting

# Pass NULL as the data frame and map x directly to the vector
ggplot(NULL, aes(x = w)) +
  geom_density()
A kernel density curve is an estimate of the population distribution, based on the sample data. The amount of smoothing depends on the kernel bandwidth: the larger the bandwidth, the more smoothing there is. The bandwidth can be set with the adjust parameter, which has a default value of 1. Smaller and larger values are compared here:
ggplot(faithful, aes(x = waiting)) +
  geom_line(stat = "density", adjust = .25, colour = "red") +
  geom_line(stat = "density") +
  geom_line(stat = "density", adjust = 2, colour = "blue")
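The effect of adjust can also be checked numerically. The sketch below uses base R's density() function rather than ggplot2, on the assumption that the "density" stat applies adjust in the same multiplicative way. The variable names are illustrative.

```r
# Waiting times from the built-in faithful data set
w <- faithful$waiting

# density() multiplies the selected bandwidth by `adjust`
d_narrow  <- density(w, adjust = .25)
d_default <- density(w)              # adjust defaults to 1
d_wide    <- density(w, adjust = 2)

# The bandwidth actually used is stored in the `bw` element
bandwidths <- c(narrow = d_narrow$bw,
                default = d_default$bw,
                wide = d_wide$bw)
print(bandwidths)

# Ratios relative to the default bandwidth
ratios <- bandwidths / d_default$bw
print(ratios)
```

A smaller bandwidth follows the sample more closely and gives a bumpier curve, while a larger one smooths over local detail.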