7
Statistical Data Analysis with Incanter
In this chapter, we will cover the following recipes:
• Generating summary statistics with $rollup
• Working with changes in values
• Scaling variables to simplify variable relationships
• Working with time series data with Incanter Zoo
• Smoothing variables to decrease variation
• Validating sample statistics with bootstrapping
• Modeling linear relationships
• Modeling non-linear relationships
• Modeling multinomial Bayesian distributions
• Finding data errors with Benford's law
Introduction
So far, we've focused on data and process. We've seen how to get data and how to get it ready
to analyze. We've also looked at how to organize and partition our processing to keep things
simple and get the best performance.
We'll now look at how to leverage statistics to gain insights into our data. This is a subject that
is both broad and deep, and covering statistics in any meaningful way is far beyond the scope
of this chapter. For more information about some of the procedures and functions described
here, you should refer to a textbook, class, your local statistician, or another resource.
For instance, Coursera has an online statistics course
(https://www.coursera.org/course/stats1), and Harvard has a course on probability on iTunes
(https://itunes.apple.com/us/course/statistics-110-probability/id502492375).