 low age lwt race smoke ptl ht ui ftv  bwt
   0  19 182    2     0   0  0  1   0 2523
   0  33 155    3     0   0  0  0   3 2551
   0  20 105    1     1   0  0  0   1 2557
 ...
We looked at the relationship between smoke (smoking) and bwt (birth weight in grams). The
value of smoke is either 0 or 1, but because it's stored as a numeric vector, ggplot() doesn't know
that it should be treated as a categorical variable. To have ggplot() treat smoke as categorical,
we can either convert that column of the data frame to a factor, or wrap it as factor(smoke)
inside the aes() call. For these examples, we converted it to a factor in the data.
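As a minimal sketch of the conversion described above, the following uses a small stand-in data frame (the hypothetical name bw_sample) built only from the smoke and bwt values of the three rows shown in the listing. It is not the full data set; it only illustrates what changes when the column becomes a factor.

```r
# Stand-in data: smoke and bwt from the three rows shown in the listing.
# bw_sample and bw_factor are illustrative names, not part of the original text.
bw_sample <- data.frame(
  smoke = c(0, 0, 1),
  bwt   = c(2523, 2551, 2557)
)

# Option 1: convert the column itself, so every later plot sees a factor.
bw_factor <- bw_sample
bw_factor$smoke <- factor(bw_factor$smoke)

is.numeric(bw_sample$smoke)  # the untouched column is still numeric
is.factor(bw_factor$smoke)   # the converted column is a factor
levels(bw_factor$smoke)      # its levels are the strings "0" and "1"

# Option 2 (not run here; needs ggplot2 and the full data set):
# leave the data alone and convert on the fly inside aes(), for example
#   aes(x = bwt, fill = factor(smoke))
```

Converting the column once keeps every later plotting call short; wrapping it in factor() inside aes() leaves the data frame unchanged, which can be preferable when the numeric values are still needed elsewhere.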
Another method for visualizing the distributions is to use facets, as shown in Figure 6-12. We
can align the facets vertically or horizontally. Here we'll align them vertically so that it's easy to
compare the two distributions:
Figure 6-12. Left: density curves with facets; right: with different facet labels
ggplot(birthwt1, aes(x = bwt)) + geom_density() + facet_grid(smoke ~ .)
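For comparison, here is a sketch of both facet orientations side by side. It assumes that birthwt1 is a copy of the birthwt data set from the MASS package with smoke converted to a factor; the excerpt does not show how birthwt1 was created, so treat that setup, and the names p_vertical and p_horizontal, as illustrative.

```r
library(ggplot2)

# Assumption: birthwt1 is MASS::birthwt with smoke stored as a factor.
birthwt1 <- MASS::birthwt
birthwt1$smoke <- factor(birthwt1$smoke)

# Vertical alignment (as in the text): one row of panels per level of smoke.
p_vertical <- ggplot(birthwt1, aes(x = bwt)) +
  geom_density() +
  facet_grid(smoke ~ .)

# Horizontal alignment: one column of panels per level of smoke.
p_horizontal <- ggplot(birthwt1, aes(x = bwt)) +
  geom_density() +
  facet_grid(. ~ smoke)

# Print either object to draw it, e.g. print(p_vertical)
```

In the facet_grid() formula, the variable on the left of the ~ defines rows and the variable on the right defines columns; the dot means "no faceting in this direction". Stacking the panels vertically puts them on a shared x-axis, which is why it makes the two distributions easier to compare.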
One problem with the faceted graph is that the facet labels are just 0 and 1, and there's no label
indicating that those values are for smoke. To change the labels, we need to change the names
of the factor levels. First we'll take a look at the factor levels, then we'll assign new factor level
names, in the same order:
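A minimal base R sketch of those two steps follows. It uses a stand-in factor (the hypothetical name smoke_f) rather than the birthwt1 column, and the new labels "No Smoke" and "Smoke" are placeholders chosen for illustration.

```r
# Stand-in for the smoke column after conversion to a factor.
smoke_f <- factor(c(0, 0, 1))

# Step 1: look at the current level names and their order.
levels(smoke_f)

# Step 2: assign new names in the same order as the existing levels.
# "No Smoke" replaces "0" and "Smoke" replaces "1"; both labels are illustrative.
levels(smoke_f) <- c("No Smoke", "Smoke")

# The underlying observations are unchanged; only the labels differ.
as.character(smoke_f)
```

Because the assignment is positional, the order of the new names must match the order reported in step 1; swapping them would silently relabel smokers as non-smokers.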