What I Learned About Formatting
When I first learned statistics in high school, the data was always
provided in a nice, rectangular format. All I had to do was plug
some numbers into an Excel spreadsheet or my awesome graphing
calculator (which was the best way to look like you were working in
class, but actually playing Tetris). That's how it was all the way through
my undergraduate education. Because I was learning about techniques
and theorems for analyses, my teachers didn't spend any time on
working with raw, unprocessed data. The data always seemed to be in
just the right format.
This is perfectly understandable, given time constraints and such, but
in graduate school, I realized that data in the real world never seems to
be in the format that you need. There are missing values, inconsistent
labels, typos, and values without any context. Often the data is spread
across several tables, but you need everything in one, joined across a
value, like a name or a unique id number.
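To make the joining step concrete, here is a minimal Python sketch of combining two tables on a shared id. Everything in it is invented for illustration: the two small tables, the column names (`id`, `name`, `score`), and the helper functions `read_rows` and `join_on_id` are assumptions, not something from a particular dataset or library. It keeps every row from the first table and marks missing or empty values as `None` so the gaps stay visible.

```python
import csv
import io

# Two hypothetical tables that share an "id" column (made-up data).
PEOPLE_CSV = """id,name
1,Ana
2,Ben
3,Cho
"""

SCORES_CSV = """id,score
1,88
3,
4,75
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one dict per row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_id(left_rows, right_rows, key="id"):
    """Left join: keep every left row and attach matching right-hand fields.

    Rows with no match, and fields that are empty strings, get None.
    """
    # Index the right-hand table by its key for quick lookups.
    right_by_key = {row[key]: row for row in right_rows}
    right_fields = [
        field
        for field in (right_rows[0].keys() if right_rows else [])
        if field != key
    ]
    joined = []
    for row in left_rows:
        match = right_by_key.get(row[key], {})
        merged = dict(row)
        for field in right_fields:
            value = match.get(field)
            merged[field] = value if value else None
        joined.append(merged)
    return joined


people = read_rows(PEOPLE_CSV)
scores = read_rows(SCORES_CSV)
table = join_on_id(people, scores)
for row in table:
    print(row)
```

Note that the score row with id 4 has no matching person and is dropped, which is the usual behavior of a left join; whether that is what you want depends on the data.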
This was also true when I started to work with visualization. It became
increasingly important because I wanted to do more with the data I had.
Nowadays, it's not out of the ordinary that I spend just as much time
getting data in the format that I need as I do putting the visual part of
a data graphic together. Sometimes I spend more time getting all my
data in place. This might seem strange at first, but you'll find that the
design of your data graphics comes much easier when you have your
data neatly organized, just like it was back in that introductory statistics
course in high school.
The sections that follow describe various data formats, the tools
available to deal with them, and finally, some programming that uses
the same logic you used to scrape data in the previous example.
Data Formats
Most people are used to working with data in Excel. This is fine if you're
going to do everything from analyses to visualization in the program,
but if you want to step beyond that, you need to familiarize yourself with
other data formats. The point of these formats is to make your data