See Creating a Balloon Plot for creating a balloon plot.
Dealing with Overplotting
Problem
You have many points and they obscure each other.
Solution
With large data sets, the points in a scatter plot may obscure each other and prevent the viewer
from accurately assessing the distribution of the data. This is called overplotting. If the amount
of overplotting is low, you may be able to alleviate it by using smaller points, or by using a
different shape (like shape 1, a hollow circle) through which other points can be seen. Figure 5-2
in Making a Basic Scatter Plot demonstrates both of these solutions.
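As a minimal sketch of those two low-overplotting fixes, the following uses the built-in mtcars data set. That data set and the names p, p_small, and p_hollow are assumptions chosen for illustration; they are not taken from Figure 5-2.

```r
library(ggplot2)

# Base plot on a small built-in data set (chosen here only for illustration)
p <- ggplot(mtcars, aes(x = wt, y = mpg))

# Fix 1: draw smaller points so neighbors overlap less
p_small <- p + geom_point(size = 1)

# Fix 2: use shape 1 (a hollow circle) so points underneath stay visible
p_hollow <- p + geom_point(shape = 1)

# Print either object to draw it
p_hollow
```

The two settings can also be combined in one call, as in geom_point(size = 1, shape = 1).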
If there's a high degree of overplotting, there are a number of possible solutions:
▪ Make the points semitransparent
▪ Bin the data into rectangles (better for quantitative analysis)
▪ Bin the data into hexagons
▪ Use box plots
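Each of the four options above might be sketched as follows. The alpha value, the bin count, the object names, and the choice of the built-in ChickWeight data for the box plot are assumptions made for illustration. Hexagonal binning also assumes that the hexbin package is installed.

```r
library(ggplot2)

# Scatter plot base: the diamonds data set ships with ggplot2
sp <- ggplot(diamonds, aes(x = carat, y = price))

# Semitransparent points: dense regions appear darker
sp_alpha <- sp + geom_point(alpha = 0.1)

# Rectangular bins: fill color encodes the number of points in each bin
sp_rect <- sp + stat_bin_2d(bins = 50)

# Hexagonal bins: drawing this plot requires the hexbin package
sp_hex <- sp + stat_bin_hex()

# Box plots: summarize y at each x value; group tells ggplot how to
# split a continuous x variable into separate boxes
bp <- ggplot(ChickWeight, aes(x = Time, y = weight)) +
  geom_boxplot(aes(group = Time))

# Print any of the objects to draw it
sp_alpha
```

Box plots suit data that is overplotted along one axis, where many observations share the same x value. The other three approaches apply when both variables are continuous.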
Discussion
The scatter plot in Figure 5-12 contains about 54,000 points. They are heavily overplotted,
making it impossible to get a sense of the relative density of points in different areas of the graph:
library(ggplot2)
# Base plot object, reused for the variations on this scatter plot
sp <- ggplot(diamonds, aes(x = carat, y = price))
sp + geom_point()