90 Days” is missing because the customer has not made any purchases within the
time frame, then a value of zero would accurately represent the field.
However, assigning a value of zero to a missing temperature field would bias
any analysis results.
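The bias can be seen in a small sketch. The temperature readings below are invented for illustration, and None is assumed to mark a missing value; the sketch compares the mean of the observed readings with the mean obtained after filling the gaps with zero.

```python
from statistics import mean

# Hypothetical temperature readings (degrees F); None marks a missing value.
temperatures = [68.0, 71.0, None, 70.0, None, 69.0]


def mean_ignoring_missing(values):
    """Average only the values that were actually recorded."""
    observed = [v for v in values if v is not None]
    return mean(observed)


def mean_with_zero_fill(values):
    """Average after replacing every missing value with zero."""
    filled = [0.0 if v is None else v for v in values]
    return mean(filled)


observed_mean = mean_ignoring_missing(temperatures)
zero_filled_mean = mean_with_zero_fill(temperatures)

print(f"Mean of observed readings: {observed_mean:.1f}")
print(f"Mean after zero fill:      {zero_filled_mean:.1f}")
```

With these sample readings, zero fill pulls the mean well below every value that was actually observed. For a purchase count, by contrast, zero is a true value rather than a placeholder, so the same substitution would not distort the result.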
Missing values - an example
The dataset Homes.csv contains data on homes for sale in the Provo, Utah,
metropolitan area. The data is from the realtors.com website and contains values
entered by realtors when listing a home for sale.
In the VisMiner Control Center, open the Homes.csv dataset. In the icon
representing the dataset, notice the red triangle, indicating that the dataset
contains missing values and is not ready for analysis or visualization.
View the “Summary Statistics” for this dataset. See Figure 3.1.
Notice at the top of the summary that there are 3,582 rows in the dataset. There
are also 3,582 rows with missing values, so every single row in the dataset has
at least one missing value.
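Outside VisMiner, the same counts can be approximated with a short script. The following is a minimal sketch, not VisMiner's own method: it assumes a missing value appears as an empty cell, and the sample data and column names are hypothetical stand-ins rather than the actual contents of Homes.csv.

```python
import csv
import io

# Hypothetical stand-in for a file such as Homes.csv; the column names and
# values here are invented for illustration only.
SAMPLE_CSV = """price,bedrooms,year_built,lot_size
250000,3,1995,
310000,,2004,0.25
189000,2,,0.18
275000,4,2001,0.30
"""


def summarize_missing(csv_file):
    """Return (total rows, rows with a missing value, missing count per column).

    Assumption: a missing value appears as an empty (or whitespace-only) cell.
    """
    reader = csv.DictReader(csv_file)
    missing_per_column = {name: 0 for name in reader.fieldnames}
    total_rows = 0
    rows_with_missing = 0
    for row in reader:
        total_rows += 1
        row_has_missing = False
        for name in reader.fieldnames:
            value = row[name]
            # DictReader yields None for cells absent from a short row.
            if value is None or value.strip() == "":
                missing_per_column[name] += 1
                row_has_missing = True
        if row_has_missing:
            rows_with_missing += 1
    return total_rows, rows_with_missing, missing_per_column


total, incomplete, per_column = summarize_missing(io.StringIO(SAMPLE_CSV))
print(f"Rows: {total}, rows with missing values: {incomplete}")
for column, count in per_column.items():
    print(f"{column}: {count} missing")
```

To run the sketch against a real file, pass an open file object instead of the in-memory sample, for example `open("Homes.csv", newline="")`. If the two row counts are equal, as they are in the summary described above, no row in the file is complete.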
The summary statistics table has a “Missing” column that lists the number of
missing items for each of the dataset columns. For example, note that there are
Figure 3.1 Summary Statistics for Homes.csv
 