b. List the three strongest direct correlations between acid levels. What is the
coefficient of correlation of each?
c. Which acid is most strongly correlated with Region? What is the eta
coefficient?
d. Which acid is most strongly correlated with Area? What is the eta
coefficient?
e. Create a derived set of the OliveOil data that contains acid measures only.
Name the derived set “Acids”.
The histogram
A histogram visually represents value distributions of a single attribute or
column.
Drag the Iris dataset up to an available display and drop it. A context menu
will open listing all of the available viewers for the dataset.
Select “Histogram”.
The distribution of the Variety column is shown in Figure 2.10a. By default,
Variety is the first column selected by the histogram. It looks rather boring. There
are three bars - one for each of the three varieties in the dataset. The bars are all of
equal height because the dataset contains 50 observations for each variety.
Using the “Column” drop-down, change the column selection to
“PetalLength”.
The PetalLength distribution is a little more interesting (Figure 2.10b). Notice
the gap between bars in the 2-3 centimeter range. Very clearly we see a
multimodal distribution. The observations on the left do not appear to have been
drawn from the same population as those on the right.
The histogram bars are defined by first dividing the column value range into a
predetermined number of equal-sized buckets. When numeric data is represented,
VisMiner uses 60 buckets by default.
Once the number of buckets is determined, each observation is assigned to the
bucket corresponding to its value, and the number of observations in the bucket
is encoded as the height of the bar. The bucket containing the most observations
is drawn full height and the others, based on their observation counts, are sized
relative to the tallest.
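The bucketing procedure described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not VisMiner's actual code: the function names bucket_counts and relative_heights are hypothetical, and the handling of the maximum value (placed in the last bucket) and of a column with a single distinct value are assumptions made for the sketch.

```python
def bucket_counts(values, n_buckets=60):
    """Count observations in n_buckets equal-width buckets spanning [min, max].

    The default of 60 mirrors the default described in the text; the edge
    handling below is an assumption made for this sketch.
    """
    if not values:
        raise ValueError("values must not be empty")
    lo, hi = min(values), max(values)
    counts = [0] * n_buckets
    if hi == lo:
        # All observations share one value: put them all in the first bucket.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / n_buckets
    for v in values:
        idx = int((v - lo) / width)
        # The maximum value would land one past the end; keep it in the last bucket.
        idx = min(idx, n_buckets - 1)
        counts[idx] += 1
    return counts


def relative_heights(counts):
    """Scale counts so the fullest bucket has height 1.0 and the rest are proportional."""
    tallest = max(counts)
    return [c / tallest for c in counts]


# Small made-up sample (not the Iris data), split into 3 buckets of width 2.
sample = [1.0, 1.2, 1.4, 1.5, 4.0, 5.0, 6.0, 7.0]
counts = bucket_counts(sample, n_buckets=3)
heights = relative_heights(counts)
print(counts)   # [4, 1, 3]
print(heights)  # [1.0, 0.25, 0.75]
```

Calling bucket_counts on the same sample with a different n_buckets value is a quick way to see how the bar heights depend on the bucket count.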
At times, depending on the range of each bucket, the highs and lows of
neighboring bars will vary significantly based on the chosen bucket count.
Slight adjustments in the bucket range can produce large changes in the heights