If columns were previously hidden to reduce clutter, make them visible
again. The next step creates a subset containing only the outlying
observation, and that subset must include all columns.
Drag the bottom slider of the TotalBasesDiff column up until only the
outlying observation remains visible.
Right-click on the slider and make a subset from the filter, named
“ComputedOutlier”.
View the data for ComputedOutlier in a table (tabular presentation).
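Outside VisMiner, the same filter-then-subset step can be sketched in a few lines of Python. This is only an illustration: the rows, the player labels, and the threshold of 50 are invented, and the helper subset_at_or_above is a hypothetical name, not part of VisMiner.

```python
# Made-up rows for illustration; they are not taken from mlbBatters2011.csv.
rows = [
    {"Player": "Player A", "TotalBasesDiff": 0},
    {"Player": "Player B", "TotalBasesDiff": 0},
    {"Player": "Player C", "TotalBasesDiff": 75},
]


def subset_at_or_above(rows, column, lower_bound):
    """Keep complete rows (all columns) whose value in `column` is >= lower_bound."""
    return [dict(r) for r in rows if r[column] >= lower_bound]


# Raising the lower bound plays the role of dragging the bottom slider up.
computed_outlier = subset_at_or_above(rows, "TotalBasesDiff", 50)

# Tabular view of the subset.
for row in computed_outlier:
    print(row)
```

Because each kept row is copied whole, every column travels with the subset, which is why hidden columns need to be made visible before the subset is created.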
The player is Granderson, an outfielder for the Yankees. Comparing the data
with the original source shows that Granderson actually hit 41 home runs
rather than 14; the digits were transposed at data entry. To proceed, either
find and enter the correct values or remove the observation from the dataset.
To remove it, subtract the outlier dataset from the base dataset
(mlbBatters2011.csv).
Correcting values in a dataset is outside the scope of VisMiner. It is easily
done with any editor that handles CSV files, such as Microsoft Excel or
Windows Notepad.
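For readers who prefer a script to a manual edit, both options (correct the value, or subtract the outlier from the base dataset) can be sketched with Python's standard csv module. The column names Player and HR and the two filler rows are assumptions made for the illustration; only Granderson's 14 and 41 come from the text above.

```python
import csv
import io

# Stand-in for a few rows of the batting file. Column names are assumed,
# and "Player A" / "Player B" are invented filler rows.
RAW = "Player,HR\nPlayer A,20\nGranderson,14\nPlayer B,31\n"


def read_rows(text):
    """Parse CSV text into a list of dicts, one per observation."""
    return list(csv.DictReader(io.StringIO(text)))


def correct_value(rows, key_column, key, column, new_value):
    """Return copies of rows with `column` replaced where key_column == key."""
    return [
        dict(r, **{column: new_value}) if r[key_column] == key else dict(r)
        for r in rows
    ]


def remove_rows(base, outliers, key_column):
    """Dataset subtraction: base minus every row whose key appears in outliers."""
    drop = {r[key_column] for r in outliers}
    return [dict(r) for r in base if r[key_column] not in drop]


def to_csv(rows):
    """Serialize rows back to CSV text, keeping the original column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


base = read_rows(RAW)

# Option 1: enter the correct value (14 -> 41).
corrected = correct_value(base, "Player", "Granderson", "HR", "41")

# Option 2: subtract the outlier subset from the base dataset.
outlier = [r for r in base if r["Player"] == "Granderson"]
reduced = remove_rows(base, outlier, "Player")

print(to_csv(corrected))
print(to_csv(reduced))
```

Both helpers return new lists and leave the base rows untouched, so the original data stays available for comparison.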
Exercise 3.5
Using the dataset mlbBatters2011.csv, create a subset adding computed
columns for BattingAvgChk, BattingAvgDiff, SluggingChk and SluggingDiff.
Follow the same approach used in the tutorial for the TotalBases data.
In baseball statistics,
batting average = hits / at bats, and
slugging = total bases / at bats.
If computed correctly, the values for BattingAvgDiff and SluggingDiff are
non-zero. Does this indicate that there are errors in the data? Explain your
answer. (Hint: Look at the magnitude of the differences.)
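As a rough guide to what the four computed columns involve, here is a minimal Python sketch. The rows are invented, the source column names (Hits, AtBats, TotalBases, BattingAvg, Slugging) are assumptions about the file, and the stored averages are assumed to have been rounded to three decimal places when the data was prepared.

```python
# Made-up rows; stored averages are assumed to be rounded to three decimals.
rows = [
    {"Hits": 150, "AtBats": 500, "TotalBases": 240, "BattingAvg": 0.300, "Slugging": 0.480},
    {"Hits": 161, "AtBats": 583, "TotalBases": 291, "BattingAvg": 0.276, "Slugging": 0.499},
]


def add_checks(rows):
    """Return copies of rows with the four computed check/diff columns added."""
    checked = []
    for r in rows:
        new = dict(r)
        new["BattingAvgChk"] = r["Hits"] / r["AtBats"]
        new["BattingAvgDiff"] = r["BattingAvg"] - new["BattingAvgChk"]
        new["SluggingChk"] = r["TotalBases"] / r["AtBats"]
        new["SluggingDiff"] = r["Slugging"] - new["SluggingChk"]
        checked.append(new)
    return checked


checked = add_checks(rows)

# The magnitude of the largest difference is what the hint asks about.
largest_avg_diff = max(abs(r["BattingAvgDiff"]) for r in checked)
largest_slg_diff = max(abs(r["SluggingDiff"]) for r in checked)
print(largest_avg_diff, largest_slg_diff)
```

Comparing the largest differences with the precision of the stored values is one way to approach the question the exercise poses.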
Feasibility and consistency checks
The dataset Amarillo.csv contains data on homes for sale in Amarillo, Texas. It
was generated from on-line homes-for-sale listings by Amarillo realtors. The
data includes the latitude, longitude, and county name as entered by the realtors.
An interesting characteristic of Amarillo is that it is split between two counties:
 