Computer support specialist is the only profession in this dataset where women
tended to make more than men.
The annotation explains how to read the scatter plot and what the data means.
Sure, many people know how to read a scatter plot and interpret relationships
between two variables, but many don't, and it doesn't hurt to clarify.
Distributions are another challenging concept. To read one, people have to
understand skew, mean, median, and variation, and recognize that a chart of a
distribution aggregates observations across a continuous value scale.
For example, it is common for people to interpret the value axis of a histogram
as time and the count or density on the vertical axis as a metric of interest. This
leads to confusion, so it is useful to explain the various facets of a distribution.
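To make those facets concrete, here is a minimal Python sketch that computes them for a small list of arrival delays. The delay values and the function names `summarize` and `bin_counts` are hypothetical, chosen only for illustration; they are not the airline data discussed in this chapter.

```python
import statistics
from collections import Counter

# Hypothetical arrival delays in minutes (negative = early); not real airline data.
delays = [-12, -8, -5, -5, -3, 0, 0, 2, 4, 7, 15, 42, 95]


def summarize(values):
    """Return the basic facets a reader needs to interpret a distribution."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    return {
        "mean": mean,
        "median": median,
        "stdev": statistics.stdev(values),  # sample standard deviation (variation)
        # A mean above the median hints at a long right tail (right skew).
        "right_skewed": mean > median,
    }


def bin_counts(values, width=10):
    """Aggregate observations into equal-width bins along the value axis."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


summary = summarize(delays)
bins = bin_counts(delays)
print(summary)
print(bins)
```

The bin starts run along the value axis (minutes of delay, not time), and the counts are what a histogram draws as bar heights.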
In Chapter 4, “Exploring Data Visually,” you saw distributions for flight arrival
delays. Figure 5-36 shows the distribution of delays for Southwest Airlines.
A negative delay means an early arrival, and a positive one means the plane
arrived late to the destination airport. A delay of zero means an on-time arrival.
To clarify, simply add those descriptions as annotation on the histogram, as
shown in Figure 5-37. Avoid jargon and explain in the context of the data.
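As a rough sketch of the same idea in code, the following Python example renders a text histogram and attaches a plain-language note to each bin. It only approximates the approach; it does not reproduce Figure 5-37. The delay values, the 10-minute bin width, and the helper names `describe_bin` and `annotated_histogram` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical arrival delays in minutes; not the Southwest data from the figure.
delays = [-12, -8, -5, -5, -3, 0, 0, 2, 4, 7, 15, 42, 95]


def describe_bin(start, width):
    """Plain-language note for a bin covering [start, start + width)."""
    end = start + width
    if end <= 0:
        return "arrived early"
    if start == 0:
        return "arrived on time or late"
    if start > 0:
        return "arrived late"
    return "mix of early and late arrivals"


def annotated_histogram(values, width=10):
    """Build one text row per non-empty bin: range, bar, and annotation."""
    counts = Counter((v // width) * width for v in values)
    lines = []
    for start in sorted(counts):
        label = f"{start:>4} to {start + width:>4} min"
        bar = "#" * counts[start]
        lines.append(f"{label} | {bar:<6} {describe_bin(start, width)}")
    return lines


chart = annotated_histogram(delays)
print("\n".join(chart))
```

The notes use the language of the data ("arrived early") rather than statistical terms, which is the point of the annotation.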
Note: Show people your visualizations to see
how they interpret results. If they're confused,
explain the data clearly.
In the end, you must consider what your audience might or
might not understand graphically and statistically, and
annotate based on that. Single variables, time series, and
spatial data are easier to understand visually because they
tend to be more intuitive than multiple variables or more
complex relationships.
FIGURE 5-36 Histogram showing distribution
FIGURE 5-37 Explanation of distribution