or a public FTP site, and the user can download using a command-line FTP
program in a terminal, or perhaps using a graphical client like WS_FTP. In
addition, many organizations maintaining large datasets may require the
completion of an online form that provides some details about the researcher
and indicates which specific variables are being requested.
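A scripted download can stand in for the command-line FTP session described above. The following is a minimal sketch using Python's standard ftplib module; the function name fetch_file, the host ftp.example.org, the directory /pub/data, and the file name survey.zip are all hypothetical placeholders, not details of any particular data site.

```python
from ftplib import FTP
from pathlib import Path


def fetch_file(ftp, remote_name, local_path):
    """Download remote_name over an open FTP connection in binary mode.

    Returns the number of bytes written to local_path.
    """
    local_path = Path(local_path)
    with local_path.open("wb") as fh:
        # RETR streams the file; each received block is passed to fh.write.
        ftp.retrbinary(f"RETR {remote_name}", fh.write)
    return local_path.stat().st_size


# Example call (hypothetical host and paths; requires network access):
# with FTP("ftp.example.org") as ftp:
#     ftp.login()               # anonymous login
#     ftp.cwd("/pub/data")
#     fetch_file(ftp, "survey.zip", "survey.zip")
```

Comparing the returned byte count with the size the site lists for the file is one simple way to notice a download that stopped early.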
Large datasets are often made available for download as compressed archives.
There are several possible formats, such as zip or gzip. Windows XP has a
built-in expander for zip archives; gzip files and older versions of Windows
may need a program like WinZip or 7-Zip. On the Macintosh, StuffIt Expander
will extract data, or you can use the Unix tools available in the command
terminal, such as unzip or gunzip. Keep in mind that even if the file is
compressed, it's much easier to download data onto a computer with high-speed
internet access. Slow connections tend to drop before a download is complete.
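Extraction can also be scripted rather than done with a desktop expander. The sketch below uses Python's standard zipfile and gzip modules; the function name extract_archive and the choice to treat any unreadable file as an error are assumptions made for illustration. A zip file cut short by a dropped connection typically fails the zipfile.is_zipfile check, so the function reports it instead of extracting partial data.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a zip or gzip file into dest_dir and return the file names."""
    archive_path = Path(archive_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            # testzip() re-reads every member and returns the first bad name.
            bad = zf.testzip()
            if bad is not None:
                raise ValueError(f"corrupt member in archive: {bad}")
            zf.extractall(dest_dir)
            return sorted(zf.namelist())

    if archive_path.suffix == ".gz":
        # gzip holds a single file; drop the .gz suffix for the output name.
        target = dest_dir / archive_path.stem
        with gzip.open(archive_path, "rb") as src, target.open("wb") as dst:
            shutil.copyfileobj(src, dst)  # raises EOFError if truncated
        return [target.name]

    raise ValueError(f"not a readable zip or gzip file: {archive_path}")
```

Calling extract_archive("survey.zip", "survey_data") with a hypothetical archive name would place the extracted files in a survey_data folder next to the script.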
Before leaving a data site, double-check that your data came with a codebook.
Without a codebook your data are essentially worthless, because you cannot
interpret the numbers. Typically, a codebook is generated by the server. If a
codebook doesn't come with the data, you may need to search the site or look
for another data source.
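To illustrate why the codebook matters, the sketch below decodes numeric values into labels. The variable names, codes, and labels are invented for this example and do not come from any real codebook; the function name decode_rows is likewise hypothetical. Values with no codebook entry are left unchanged so that gaps stay visible.

```python
import csv
import io

# Hypothetical codebook: variable name -> {code: label}
CODEBOOK = {
    "sex": {"1": "Male", "2": "Female", "9": "No answer"},
    "region": {"1": "Northeast", "2": "Midwest", "3": "South", "4": "West"},
}


def decode_rows(csv_text, codebook):
    """Replace coded values with labels wherever the codebook defines them."""
    reader = csv.DictReader(io.StringIO(csv_text))
    decoded = []
    for row in reader:
        decoded.append({
            # Fall back to the raw value if the variable or code is unknown.
            var: codebook.get(var, {}).get(value, value)
            for var, value in row.items()
        })
    return decoded


raw = "id,sex,region\n101,2,3\n102,1,9\n"
rows = decode_rows(raw, CODEBOOK)
# rows[0] is {"id": "101", "sex": "Female", "region": "South"}
# rows[1] keeps region "9" because this codebook has no entry for it.
```

Without the CODEBOOK dictionary, the raw file offers no way to tell that a 2 in the sex column means "Female" in this made-up scheme.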
Tips 'n Tricks
Downloading data
- Finding data on a site can be difficult. Not all sites are intuitive or well organized.
- Be prepared to complete online forms with personal information as well as the specifics of what you're requesting.
- Double-check the dataset size to make sure your host server won't limit your download.
- Be sure you receive a codebook with your data.
- Use a computer with a high-speed connection to make it easier to download and to decrease the chance of the connection terminating before completion.
- Open datasets in a spreadsheet first to clean the data before analysis and to reformat them for a variety of statistical applications.
*Adapted from Neustadt, Robinson & Kestnbaum, 2002.
Data storage
Data storage is as much a research responsibility as it is a technical question. In
some respects, the nature of the data is going to drive storage requirements. For
example, medical data and medical-related data have very strict mandated