Figure 2.6 Iris Summary Statistics
You can sort the rows in the summary by clicking on a column header. For
example, to sort by mean:
Click on the “Mean” column header.
Click a second time to reverse the sort.
Select “Close” when you have finished viewing the summary statistics.
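For readers who want to reproduce this kind of summary outside VisMiner, the following is a minimal sketch using pandas, which is not part of the original walkthrough. The column names and values are hypothetical, invented only for illustration, and the text does not say which direction VisMiner sorts on the first click, so ascending-then-descending is an assumption.

```python
import pandas as pd

# Hypothetical measurements, invented for illustration only.
data = pd.DataFrame({
    "length": [5.0, 6.0, 7.0],
    "width": [3.0, 3.5, 4.0],
    "height": [1.0, 1.5, 2.0],
})

# One row per attribute, with a few common summary statistics as columns.
summary = data.describe().T[["count", "mean", "std", "min", "max"]]

# Roughly what clicking the "Mean" header does (assumed ascending first) ...
by_mean = summary.sort_values("mean")
# ... and what a second click does (reversed order).
by_mean_desc = summary.sort_values("mean", ascending=False)

print(by_mean_desc)
```

Transposing the output of `describe()` puts one attribute per row, which mirrors a summary table laid out with attributes as rows and statistics as columns.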
You have now completed the first two tasks in the “initial data exploration”
phase: determining the dataset format and identifying the attributes.
Exercise 2.1
The dataset OliveOil.csv contains measurements of different acid levels taken
from olive oil samples at various locations in Italy. In later chapters, this
dataset will be used to build a classification model that predicts a sample's
source location from its acid measurements. Use the VisMiner summary
statistics to answer the questions below.
a. How many rows are there in the dataset?
b. List the names of the eight acid measure attributes (columns) contained in
the dataset.
c. How are locations identified?
d. Which acid measure has the largest mean value?
e. Which acid measure has the largest standard deviation?
f. List the regions in Italy from which the samples were taken. How many
observations were taken from each region?
g. List the areas in Italy from which the samples were taken. How many
observations were taken from each area?
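The exercise is meant to be answered with VisMiner, but the same kinds of questions can be cross-checked in code. The sketch below uses pandas and a hypothetical helper, `explore`, that is not part of the book. The tiny sample table and its column names are invented, since the actual column names in OliveOil.csv are not given here. Note that if locations are stored as numeric codes, they would be counted among the numeric columns and should be excluded first.

```python
import pandas as pd

def explore(df, group_col):
    """Answer row-count, largest-mean, largest-std, and per-group-count questions."""
    # Only numeric columns have a meaningful mean and standard deviation.
    numeric = df.select_dtypes("number")
    return {
        "rows": len(df),
        "largest_mean": numeric.mean().idxmax(),
        "largest_std": numeric.std().idxmax(),
        "group_counts": df[group_col].value_counts().to_dict(),
    }

# Hypothetical stand-in data; with the real file one would instead load it,
# for example with pd.read_csv("OliveOil.csv"), and pass the real column name.
sample = pd.DataFrame({
    "group": ["a", "a", "b"],
    "acid_x": [10.0, 12.0, 14.0],
    "acid_y": [1.0, 5.0, 9.0],
})

result = explore(sample, "group")
print(result)
```

Passing the grouping column as a parameter lets the same helper serve both the region and the area questions once the real column names are known.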
 