Potter and Randall. The border between the two runs along latitude 35.18. To
the north is Potter County and to the south is Randall County.
View the Amarillo.csv dataset in a parallel plot.
To allow you to focus on the relationship between County and Latitude,
hide all but these two axes.
Drag the top Latitude range slider down until it reaches 35.17. Since there
is some leeway in slider placement at the 35.17 level, try to keep it as high
as possible while still at 35.17.
Homes below 35.17 degrees latitude are definitely in Randall County even
though four that are close to the dividing line are listed in Potter County.
Drag the top Latitude range slider back to the top, then drag the bottom
range slider up until it reaches 35.19.
Homes above this latitude are in Potter County even though 41 are listed in
Randall County.
There is definitely incorrect data in this dataset. Either the county is listed
incorrectly or the location coordinates are invalid. If location or county is
important in the planned data mining analysis, the errors must be isolated
and corrected or removed from the dataset.
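The same consistency check can be approximated outside the viewer. What
follows is a minimal sketch, not part of VisMiner. It assumes that the CSV
file has columns named County and Latitude (the axis names used above) and
that the county values are the strings "Potter" and "Randall". The function
names and the sample rows are hypothetical and made up for illustration; they
are not taken from Amarillo.csv.

```python
import csv

# Thresholds taken from the slider positions in the steps above.
SOUTH_LIMIT = 35.17  # homes below this latitude lie in Randall County
NORTH_LIMIT = 35.19  # homes above this latitude lie in Potter County


def find_county_mismatches(rows):
    """Return the rows whose listed county contradicts their latitude.

    Rows between the two limits are too close to the border to judge,
    so they are never flagged.
    """
    mismatches = []
    for row in rows:
        latitude = float(row["Latitude"])
        county = row["County"].strip()
        if latitude < SOUTH_LIMIT and county == "Potter":
            mismatches.append(row)
        elif latitude > NORTH_LIMIT and county == "Randall":
            mismatches.append(row)
    return mismatches


def load_rows(path):
    """Read a CSV file into a list of dictionaries keyed by column name."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Made-up rows for illustration only (assumed column names and values).
sample = [
    {"County": "Randall", "Latitude": "35.10"},  # consistent
    {"County": "Potter", "Latitude": "35.12"},   # too far south for Potter
    {"County": "Randall", "Latitude": "35.25"},  # too far north for Randall
    {"County": "Potter", "Latitude": "35.18"},   # border zone, not judged
]

flagged = find_county_mismatches(sample)
print(len(flagged))  # prints 2
```

To run the check on the real file, pass the result of
load_rows("Amarillo.csv") to find_county_mismatches, provided the column
names match the assumptions above.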
Note: Given the discrepancy in number between the Randall County homes
assigned to Potter County (4) and the Potter County homes assigned to Randall
County (41), we make the assumption that the county values are in error. This is
based on domain knowledge that Randall County is generally a more desirable
county for residential real estate. Realtors making the entries likely coded the
county as “Randall” in order to have the homes show up in searches for Randall
County homes.
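If one accepts the assumption in the note that the coordinates are right and
the county values are wrong, the correction itself could be sketched as
follows. This is only an illustration under that assumption, not the procedure
used in the text. The column names County and Latitude, the county strings,
and the function names are hypothetical. Homes between 35.17 and 35.19 are
left unchanged because they are too close to the border to reassign safely.

```python
def county_from_latitude(latitude, south_limit=35.17, north_limit=35.19):
    """Return the county implied by a latitude, or None near the border."""
    if latitude < south_limit:
        return "Randall"
    if latitude > north_limit:
        return "Potter"
    return None  # within the border band: no safe conclusion


def correct_counties(rows):
    """Return copies of the rows with County overwritten where the
    latitude clearly implies a different county."""
    corrected = []
    for row in rows:
        fixed = dict(row)  # copy so the input rows are not modified
        implied = county_from_latitude(float(row["Latitude"]))
        if implied is not None and fixed["County"] != implied:
            fixed["County"] = implied
        corrected.append(fixed)
    return corrected


# Made-up rows for illustration only.
listings = [
    {"County": "Randall", "Latitude": "35.25"},  # will become Potter
    {"County": "Potter", "Latitude": "35.18"},   # border band, unchanged
]
result = correct_counties(listings)
print([row["County"] for row in result])  # prints ['Potter', 'Potter']
```

Returning corrected copies rather than editing in place keeps the original
rows available for comparison, which helps when the changes need review.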
Data correction outside of VisMiner
In a previous example, which presented the handling of missing values and
introduced the location plot viewer, we explored the homes-for-sale data in the
Provo, Utah, metropolitan area (Homes.csv). That data had been downloaded
from the realtors.com website. The dataset as downloaded contained numerous
missing values, which were handled in the tutorial. Although not discussed in
that tutorial, the downloaded data also contained questionable locations.
These had been corrected before the data was used in the tutorial.
Let's now go back and look at that original data as found in
UtHomesAsDownloaded.csv. (Note: In order to focus on the location issues, the missing