Figure 9-15. Correlation matrix plotting several contribution factors for SME
(Subject Matter Experts) in an IT company
Box Plots
Box plots are another example of how the volume of data can affect how a visual is shown.
A box plot is a graphical display of five statistics (the minimum, lower quartile, median,
upper quartile, and maximum) that summarizes the distribution of a set of data.
The lower quartile (25th percentile) is represented by the lower edge of the box,
and the upper quartile (75th percentile) is represented by the upper edge of the box.
The median (50th percentile) is represented by a central line that divides the box into
two sections.
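As a rough illustration of the five statistics described above, the following minimal Python sketch computes them for a small sample. The function name `five_number_summary` and the `team_sizes` values are hypothetical and are not taken from the figures in this chapter. The quartile method (`"inclusive"`) is also an assumption, since different tools interpolate quartiles in slightly different ways.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    # Assumption: method="inclusive" treats the sample as the whole population,
    # so quartiles are interpolated between the observed minimum and maximum.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical project team sizes, made up for illustration only.
team_sizes = [4, 6, 7, 8, 9, 10, 12, 14, 42]
summary = five_number_summary(team_sizes)
print(summary)
```

The second and fourth values of the result correspond to the lower and upper edges of the box, and the third value to the central line.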
Box plots are often used to identify the outliers in the data. Generally speaking,
outliers account for roughly 1 to 5 percent of a data set. With traditionally sized
data sets, viewing 1 to 5 percent of the data is not necessarily hard to do. However,
when you are working with massive amounts of data, that same 1 to 5 percent can amount
to so many points that viewing them is rather challenging.
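The text does not say how outliers are identified. One common convention, assumed here purely for illustration, flags any point lying more than 1.5 times the interquartile range beyond the lower or upper quartile. The sketch below applies that rule and reports what share of the data is flagged. The function name `tukey_outliers`, the multiplier `k`, and the sample values are all hypothetical.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    Assumption: the 1.5 * IQR rule is one common convention, not necessarily
    the rule used to draw the figures in this chapter.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical project team sizes, made up for illustration only.
team_sizes = [4, 6, 7, 8, 9, 10, 12, 14, 42]
outliers = tukey_outliers(team_sizes)
share = len(outliers) / len(team_sizes)
print(outliers, f"{share:.1%} of the data")
```

On a data set with hundreds of millions of rows, even a small share of flagged points is far too many to inspect one by one, which is the difficulty the paragraph above describes.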
Figure 9-16 shows a box plot in which most of the data points related to “team
proficiency index” and “project team size” are consistent, but there is one outlier: a
project team of more than forty resources that shows the highest team proficiency index.