Discussion
The tophitters2001 data set contains many columns, but we'll focus on just three of them for this example:
    tophit[, c("name", "lg", "avg")]
             name lg    avg
     Larry Walker NL 0.3501
    Ichiro Suzuki AL 0.3497
     Jason Giambi AL 0.3423
    ...
      Jeff Conine AL 0.3111
      Derek Jeter AL 0.3111
In Figure 3-27 the names are sorted alphabetically, which isn't very useful in this graph. Dot plots are often sorted by the value of the continuous variable on the horizontal axis.

Although the rows of tophit happen to be sorted by avg, that doesn't mean that the items will be ordered that way in the graph. By default, the items on the given axis are ordered in whatever way is appropriate for the data type. name is a character vector, so it's ordered alphabetically. If it were a factor, it would use the order defined in the factor levels. In this case, we want name to be sorted by a different variable, avg.
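The ordering rules above can be checked outside of a plot. The following is a minimal sketch in base R, using three names from the listing above; the variable names players and players_f are illustrative only and do not come from the original example. It assumes that the alphabetical ordering of a character vector on the axis matches what you get when the vector is converted to a factor with default levels.

```r
# Three names from the listing above, in row (avg) order
players <- c("Larry Walker", "Ichiro Suzuki", "Jason Giambi")

# With no levels given, factor() sorts the unique values,
# so the levels come out in alphabetical order
levels(factor(players))

# Supplying levels explicitly fixes the order; here it follows
# the order in which the names appear in the vector
players_f <- factor(players, levels = players)
levels(players_f)
```

The first call lists the names alphabetically, while the second keeps the row order, which is why a factor with deliberately chosen levels controls the order of items on the axis.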
To do this, we can use reorder(name, avg), which takes the name column, turns it into a factor, and sorts the factor levels by avg. To further improve the appearance, we'll use the theming system to remove the vertical grid lines and turn the horizontal grid lines into dashed lines (Figure 3-28):
    ggplot(tophit, aes(x = avg, y = reorder(name, avg))) +
      geom_point(size = 3) +  # Use a larger dot
      theme_bw() +
      theme(
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.major.y = element_line(colour = "grey60", linetype = "dashed")
      )
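To see what reorder() does to the factor levels on its own, here is a small sketch using three rows from the listing above, deliberately placed out of avg order. The vectors name and avg and the result name_by_avg are stand-ins built for this illustration, not the tophit columns themselves.

```r
# Three players from the listing above, not in avg order
name <- c("Jason Giambi", "Larry Walker", "Ichiro Suzuki")
avg  <- c(0.3423, 0.3501, 0.3497)

# reorder() returns a factor whose levels are sorted by avg, ascending
name_by_avg <- reorder(name, avg)
levels(name_by_avg)
# "Jason Giambi" "Ichiro Suzuki" "Larry Walker"

# Negating the sorting variable reverses the level order
levels(reorder(name, -avg))
# "Larry Walker" "Ichiro Suzuki" "Jason Giambi"
```

Because the first level is drawn at the bottom of a discrete y-axis, sorting the levels in ascending order of avg places the highest average at the top of the dot plot. When a name appears more than once, reorder() by default sorts on the mean of its values.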