Brought to You by the Letter V:
How We Define Big Data
To keep things simple, we typically define Big Data using four Vs; namely,
volume, variety, velocity, and veracity. We added the veracity characteristic
recently in response to the quality and source issues our clients began facing
with their Big Data initiatives. Some analysts include other V-based descriptors,
such as variability and visibility, but we'll leave those out of this discussion.
No Question About It: Data Volumes Are on the Rise
Volume is the obvious Big Data trait. At the start of this chapter we rhymed
off all kinds of voluminous statistics that do two things: go out of date the
moment they are quoted and grow bigger! We can all relate to the cost of
home storage: we can remember geeking out and bragging to our friends
about the new 1TB drive we bought for $500. It's now about $60, and in a
couple of years a consumer version will fit on your fingernail.
The thing about Big Data and data volumes is that the language has
changed. Aggregation that used to be measured in petabytes (PB) is now
referenced by a term that sounds as if it's from a Star Wars movie: zettabytes
(ZB). A zettabyte is a trillion gigabytes (GB), or a billion terabytes!
Since we've already given you some great examples of the volume of data
in the previous section, we'll keep this section short and conclude by
referencing the world's aggregate digital data growth rate. In 2009, the world
had about 0.8ZB of data; in 2010, we crossed the 1ZB marker, and at the end of
2011 that number was estimated to be 1.8ZB (we think 80 percent is quite the
significant growth rate). Six or seven years from now, the number is
estimated (and note that any future estimates in this topic are out of date the
moment we saved the draft, and on the low side for that matter) to be around
35ZB, equivalent to about four trillion 8GB iPods! That number is
astonishing considering it's a low-sided estimate. Just as astounding are the
challenges and opportunities that are associated with this amount of data.
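
The arithmetic behind these figures is easy to check. The following is a minimal sketch, assuming decimal (SI) units in which 1GB is 10^9 bytes, as the "trillion gigabytes" figure implies. The helper names growth_percent and ipods_for are our own, chosen purely for illustration.

```python
# Decimal (SI) storage units, expressed in bytes (an assumption; the text
# does not say whether decimal or binary units are meant).
GB = 10**9
TB = 10**12
ZB = 10**21

# World data volume estimates quoted in the text, in zettabytes.
estimates_zb = {2009: 0.8, 2010: 1.0, 2011: 1.8}


def growth_percent(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


def ipods_for(zettabytes, ipod_gb=8):
    """Number of iPods of the given capacity needed to hold the data."""
    return zettabytes * ZB / (ipod_gb * GB)


gb_per_zb = ZB // GB   # gigabytes in one zettabyte
tb_per_zb = ZB // TB   # terabytes in one zettabyte
growth_2010_2011 = growth_percent(estimates_zb[2010], estimates_zb[2011])
ipods_35zb = ipods_for(35)

print(f"1 ZB = {gb_per_zb:,} GB = {tb_per_zb:,} TB")
print(f"2010 to 2011 growth: {growth_2010_2011:.0f}%")
print(f"35 ZB is about {ipods_35zb:.2e} 8GB iPods")
```

Under these assumptions, 35ZB works out to roughly 4.4 trillion 8GB devices, which is consistent with the "about four trillion" figure quoted above.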
Variety Is the Spice of Life
The variety characteristic of Big Data is really about trying to capture all of the
data that pertains to our decision-making process. Making sense out of
unstructured data, such as opinion and intent musings on Facebook, or
analyzing images, isn't something that comes naturally for computers. However, this
 