Figure . . Scatterplot of Sales vs. Total Assets with the seven outlying companies highlighted (the
lighter blob in the lower left corner)
values could be an accounting quirk, but should undoubtedly be discarded from the
dataset, although there are only five of them. The low Sales values are also worth
considering; what types of companies do these correspond to, and should they be
kept in the dataset? They may be very new companies or companies on their last
legs. There are companies with zero Sales and another with Sales that are
more than zero but less than . These data could in principle have been obtained by
zooming into a histogram for Sales and querying the cells, but for queries with precise
boundaries it is quicker to calculate the appropriate frequency table. Graphics are
better for more general, qualitative insights, while tables and statistical summaries
are better for exact details.
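As a rough illustration of the frequency-table approach, the following Python sketch counts cases in exact ranges of Sales. The function name, the `threshold` parameter, and the sample values are all assumptions made for this example; they are not taken from the dataset discussed here.

```python
def sales_frequency_table(sales, threshold):
    """Count cases in four exact ranges of Sales.

    `threshold` is a hypothetical cut-off chosen by the analyst.
    """
    table = {
        "negative": 0,
        "zero": 0,
        "between 0 and threshold": 0,
        "threshold or more": 0,
    }
    for value in sales:
        if value < 0:
            table["negative"] += 1
        elif value == 0:
            table["zero"] += 1
        elif value < threshold:
            table["between 0 and threshold"] += 1
        else:
            table["threshold or more"] += 1
    return table


# Made-up Sales figures, for illustration only (not the dataset in the text).
example_sales = [0.0, 0.0, 0.4, 12.5, 250.0, -3.0, 1800.0]
example_table = sales_frequency_table(example_sales, threshold=1.0)
for label, count in example_table.items():
    print(f"{label:>25}: {count}")
```

Because the boundaries are stated explicitly in the code, the counts are exact, which is hard to guarantee when reading cell heights off a zoomed histogram.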
The empirical distributions can be examined in many ways. When cases are of
different sizes or weights, it can be illuminating to look at weighted distributions.
For instance, Fig. . shows a histogram of CA.TA, the ratio of current assets to total
assets, on the left and a histogram of the same variable weighted by Total Assets on the
right. Companies with the highest current assets ratios clearly have low Total Assets.
Outliers and negative values are some of the data cleaning problems that can arise;
there may be others as well.
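The difference between an unweighted and a weighted histogram of a ratio such as CA.TA can be sketched in a few lines of Python. This is only a minimal illustration: the function, the choice of five equal-width bins on [0, 1], and the sample values are assumptions for the example, not the software or data behind the figure.

```python
def histogram(values, weights=None, bins=5, lo=0.0, hi=1.0):
    """Return per-bin totals over `bins` equal-width bins on [lo, hi].

    Unweighted: every case adds 1 to its bin.
    Weighted: every case adds its weight (e.g. Total Assets) instead.
    """
    if weights is None:
        weights = [1.0] * len(values)
    totals = [0.0] * bins
    width = (hi - lo) / bins
    for value, weight in zip(values, weights):
        if not lo <= value <= hi:
            continue  # out-of-range cases are skipped in this sketch
        # A value equal to `hi` is placed in the last bin.
        index = min(int((value - lo) / width), bins - 1)
        totals[index] += weight
    return totals


# Made-up values, for illustration only (not the dataset in the text).
ca_ta = [0.15, 0.35, 0.45, 0.90, 0.95]
total_assets = [5000.0, 3000.0, 2000.0, 50.0, 20.0]

unweighted = histogram(ca_ta)
weighted = histogram(ca_ta, weights=total_assets)
print("unweighted:", unweighted)
print("weighted:  ", weighted)
```

In this invented sample the top bin holds the most cases but almost none of the total weight, which is the kind of pattern the weighted histogram makes visible.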
Some statistical modeling approaches are hardly affected by individual gross
errors, and it may be that it matters little to the model fit whether these cases are
excluded, adjusted, or just included in the analysis anyway. Even when this is the case,
it is useful to know what kinds of errors or unusual values can occur. It is also an
opportunity to talk to people who know the context of the data well, and to get them
to provide more background information. Analyzing data blind without any
background information is just reckless. A surprising (and sometimes shocking) amount