Data!
We live in the data age. It's not easy to measure the total volume of data stored electronically,
but an IDC estimate put the size of the “digital universe” at 4.4 zettabytes in 2013 and
is forecasting a tenfold growth by 2020 to 44 zettabytes. [3] A zettabyte is 10^21 bytes, or
equivalently one thousand exabytes, one million petabytes, or one billion terabytes. That's
more than one disk drive for every person in the world.
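The unit arithmetic above can be checked with a few lines of Python. This is only a sketch: the `UNITS` table and the `convert` helper are names made up for illustration, and the world population figure of roughly 7 billion is my own assumption, not a number from the IDC estimate.

```python
# Decimal (SI) storage units expressed in bytes.
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def convert(value, from_unit, to_unit):
    """Convert a quantity between the decimal storage units in UNITS."""
    return value * UNITS[from_unit] / UNITS[to_unit]


# One zettabyte expressed in the smaller units named in the text.
zb_in_eb = convert(1, "zettabyte", "exabyte")
zb_in_pb = convert(1, "zettabyte", "petabyte")
zb_in_tb = convert(1, "zettabyte", "terabyte")

# ASSUMPTION (not from the text): roughly 7 billion people in 2013.
WORLD_POPULATION = 7_000_000_000

# Share of the 2013 "digital universe" per person, in gigabytes.
per_person_gb = convert(4.4, "zettabyte", "gigabyte") / WORLD_POPULATION

print(f"1 ZB = {zb_in_eb:,.0f} EB = {zb_in_pb:,.0f} PB = {zb_in_tb:,.0f} TB")
print(f"4.4 ZB is about {per_person_gb:.0f} GB per person")
```

Under that population assumption, each person's share comes to several hundred gigabytes, which is in the range of a single consumer disk drive of the period.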
This flood of data is coming from many sources. Consider the following: [4]
▪ The New York Stock Exchange generates about 4 to 5 terabytes of data per day.
▪ Facebook hosts more than 240 billion photos, growing at 7 petabytes per month.
▪ Ancestry.com, the genealogy site, stores around 10 petabytes of data.
▪ The Internet Archive stores around 18.5 petabytes of data.
▪ The Large Hadron Collider near Geneva, Switzerland, produces about 30 petabytes
of data per year.
So there's a lot of data out there. But you are probably wondering how it affects you. Most
of the data is locked up in the largest web properties (like search engines) or in scientific or
financial institutions, isn't it? Does the advent of big data affect smaller organizations or
individuals?
I argue that it does. Take photos, for example. My wife's grandfather was an avid photographer
and took photographs throughout his adult life. His entire corpus of medium-format,
slide, and 35mm film, when scanned in at high resolution, occupies around 10 gigabytes.
Compare this to the digital photos my family took in 2008, which take up about 5
gigabytes of space. My family is producing photographic data at 35 times the rate my
wife's grandfather did, and the rate is increasing every year as it becomes easier to take
more and more photos.
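The 35-times figure compares a yearly rate with a lifetime total, so it only works out for a particular length of "adult life". The sketch below back-solves for that length. The three input numbers come from the paragraph above; the variable names are illustrative, and the resulting span of years is an inference from those numbers, not something the text states.

```python
# Figures taken from the text.
grandfather_total_gb = 10   # entire scanned film archive
family_2008_gb = 5          # digital photos taken in one year (2008)
rate_ratio = 35             # stated ratio of the two yearly rates

# ratio = family_rate / (grandfather_total / years)
#   =>  years = ratio * grandfather_total / family_rate
implied_years = rate_ratio * grandfather_total_gb / family_2008_gb

# Yearly rate implied for the film archive, in gigabytes per year.
grandfather_rate_gb_per_year = grandfather_total_gb / implied_years

print(f"Implied photographic career: {implied_years:.0f} years")
print(f"Film archive rate: {grandfather_rate_gb_per_year:.3f} GB per year")
print(f"Ratio check: {family_2008_gb / grandfather_rate_gb_per_year:.0f}x")
```

The stated ratio is therefore consistent with the film archive having been built up over about seven decades.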
More generally, the digital streams that individuals are producing are growing apace.
Microsoft Research's MyLifeBits project gives a glimpse of the archiving of personal
information that may become commonplace in the near future. MyLifeBits was an experiment
where an individual's interactions — phone calls, emails, documents — were captured
electronically and stored for later access. The data gathered included a photo taken every
minute, which resulted in an overall data volume of 1 gigabyte per month. When storage
costs come down enough to make it feasible to store continuous audio and video, the data
volume for a future MyLifeBits service will be many times that.
The trend is for every individual's data footprint to grow, but perhaps more significantly,
the amount of data generated by machines as a part of the Internet of Things will be even
greater than that generated by people. Machine logs, RFID readers, sensor networks,