of a system, or demographic data on countries that includes multiple bits of
information on each.
Some visualization methods let you explore multivariate data in one view. That
is, all your data might fit onto a screen, and you can interpret relationships
between variables and explore trends in individual ones.
Often though, the relationships between variables aren't straightforward. There
isn't always a clear increasing or decreasing trend. In these cases, multiple
views using more straightforward charts and graphs can help a lot. As usual,
your approach depends on the data you have.
A FEW VARIABLES
With time series data, you look for how a variable changes when another
variable, time, does. Similarly, when you have two metrics about people,
places, or things, you might want to know how one metric changes as the
other does. Do cities with higher burglary rates also have higher homicide
rates? What is the relationship between housing prices and square footage?
Do people who drink more soda per day tend to weigh more?
You can visualize relationships similarly to how you look for them with time
series data. Whereas the dot plots in this chapter placed time on the
horizontal axis and a variable on the vertical axis, a scatter plot replaces time with
a different variable, so you have two variables plotted against each other, as
shown in Figure 4-36.
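To make the idea concrete, here is a minimal sketch that places each observation on a character grid, with one variable on the horizontal axis and the other on the vertical axis. The function name text_scatter and the sample values are made up for illustration; they are not the data behind the figure, and a charting library would normally do this job.

```python
def text_scatter(xs, ys, width=40, height=12):
    """Return a text grid with one '*' per (x, y) pair."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Fall back to a span of 1 so a constant variable does not divide by zero.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        # Row 0 is the top of the grid, so larger y values get smaller row indices.
        row = (height - 1) - round((y - y_min) / y_span * (height - 1))
        grid[row][col] = "*"
    return "\n".join("".join(line) for line in grid)


# Hypothetical values for illustration only, not real player statistics.
usage_pct = [12.0, 15.5, 18.0, 21.0, 24.5, 28.0, 31.0]
points_per_game = [4.1, 6.0, 9.2, 11.5, 15.8, 19.9, 24.3]

print(text_scatter(usage_pct, points_per_game))
```

The only change from a time series chart is what goes on the horizontal axis: a second variable instead of time.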
Each dot represents a player during the 2008-2009 NBA basketball season.
Usage percentage, an estimated percentage of possessions that a player is
involved in while on the court, is plotted on the horizontal axis, and points per
game is plotted on the vertical axis. As you might expect, those who spend
more time with the ball tend to score more points per game.
This statistical relationship between variables is called correlation. In a positive
correlation like this one, as one variable increases, the other usually does, too. In this example, the correlation
is strong and obvious in the chart, but the correlation strength can vary, as
shown in Figure 4-37.
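If you want a number to go with what you see in the chart, one common measure of correlation strength is the Pearson correlation coefficient, which ranges from -1 to 1. The text does not say which measure it has in mind, so treat this as one reasonable choice. The sketch below uses only the standard library; the helper name pearson_r and the sample values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only, not real player statistics.
usage_pct = [12.0, 15.5, 18.0, 21.0, 24.5, 28.0, 31.0]
points_per_game = [4.1, 6.0, 9.2, 11.5, 15.8, 19.9, 24.3]

r = pearson_r(usage_pct, points_per_game)
print(f"r = {r:.3f}")
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one. The coefficient only captures linear association, so it is still worth looking at the scatter plot itself.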
For a more defined view of how two variables are related, you can fit a line
through the points, as shown in Figure 4-38. You saw the same method used
with time series data in Figure 4-21. The increasing curve rounds off as points
per game approaches zero, but the line straightens out, showing a linear
relationship. (It'd be a different story if the line resembled a sine wave.)
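As a rough sketch of fitting a line through points, the code below computes an ordinary least-squares straight line. This is an assumption on my part: the fitted curve in the figure bends near zero, so it was probably produced by a smoother rather than by a straight-line fit. The helper name fit_line and the sample values are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when every x value is the same")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical values for illustration only, not real player statistics.
usage_pct = [12.0, 15.5, 18.0, 21.0, 24.5, 28.0, 31.0]
points_per_game = [4.1, 6.0, 9.2, 11.5, 15.8, 19.9, 24.3]

slope, intercept = fit_line(usage_pct, points_per_game)
print(f"points per game = {slope:.2f} * usage + {intercept:.2f}")
```

A straight line is the simplest summary of a linear relationship. If the points curve, as they would with something shaped like a sine wave, a straight-line fit would hide that pattern, and a local smoother would be a better choice.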