vehicle GPS traces, retail transactions — all of these contribute to the growing mountain
of data.
The volume of data being made publicly available increases every year, too.
Organizations no longer have to merely manage their own data; success in the future will be
dictated to a large extent by their ability to extract value from other organizations' data.
Initiatives such as Public Data Sets on Amazon Web Services and Infochimps.org exist to
foster the “information commons,” where data can be freely (or for a modest price) shared
for anyone to download and analyze. Mashups between different information sources
make for unexpected and hitherto unimaginable applications.
Take, for example, the Astrometry.net project, which watches the Astrometry group on
Flickr for new photos of the night sky. It analyzes each image and identifies which part of
the sky it is from, as well as any interesting celestial bodies, such as stars or galaxies. This
project shows the kinds of things that are possible when data (in this case, tagged
photographic images) is made available and used for something (image analysis) that was not
anticipated by the creator.
It has been said that “more data usually beats better algorithms,” which is to say that for
some problems (such as recommending movies or music based on past preferences),
however fiendish your algorithms, often they can be beaten simply by having more data
(and a less sophisticated algorithm). [5]
The good news is that big data is here. The bad news is that we are struggling to store and
analyze it.