21 Big Data Computing Applications
The rapid growth of the Internet and World Wide Web has led to vast amounts
of information available online. In addition, business and government
organizations create large amounts of both structured and unstructured
information that needs to be processed, analyzed, and linked. The amount
of information stored in digital form in 2007 was estimated at 281
exabytes, with an overall compound growth rate of 57% and information
in organizations growing at an even faster rate. It is also estimated that
95% of all current information exists in unstructured form, which carries
greater data processing requirements than structured information. Storing,
managing, accessing, and processing this vast amount of data is both a
fundamental need and an immense challenge if we are to search, analyze,
mine, and visualize these data as information.
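To give a feel for what a 57% compound growth rate implies, the following is a minimal sketch that compounds the 2007 estimate of 281 exabytes forward a few years. It assumes the rate is annual and stays constant, which the text does not state; the function name project_volume and the five-year horizon are illustrative choices only, and the results are a back-of-the-envelope projection, not measured figures.

```python
def project_volume(base_exabytes: float, growth_rate: float, years: int) -> float:
    """Compound a starting data volume forward by a fixed growth rate per year."""
    return base_exabytes * (1.0 + growth_rate) ** years


BASE_YEAR = 2007
BASE_EXABYTES = 281.0  # estimate quoted in the text
GROWTH_RATE = 0.57     # compound growth rate quoted in the text (assumed annual)

if __name__ == "__main__":
    # Hypothetical five-year horizon, chosen only for illustration.
    for offset in range(6):
        volume = project_volume(BASE_EXABYTES, GROWTH_RATE, offset)
        print(f"{BASE_YEAR + offset}: {volume:,.0f} EB")
```

Under these assumptions the stored volume more than doubles every two years, since 1.57 squared is roughly 2.46.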
The Web is believed to have well over a trillion Web pages, of which at
least 50 billion have been catalogued and indexed by search engines such
as Google, making them searchable by all of us. This massive Web content
spans well over 100 million domains (i.e., locations where we point our
browsers, such as http://www.wikipedia.org). These are themselves grow-
ing at a rate of more than 20,000 net domain additions daily. Facebook and
Twitter each have over 900 million users, who between them generate over
300 million posts a day (roughly 250 million tweets and over 60 million
Facebook updates). Added to this are the over 10,000 credit-card payments
made per second, the well over 30 billion point-of-sale transactions per year
(via dial-up POS devices), and the over 6 billion mobile phones, of
which almost 1 billion are smartphones, many of them GPS-enabled, that
access the Internet for e-commerce, tweets, and Facebook updates. Last
but not least, there are the images and videos on YouTube and other sites,
which by themselves outstrip all of these put together in terms of the
sheer volume of data they represent.
21.1 Big Data
This deluge of data, along with emerging techniques and technologies used
to handle it, is commonly referred to today as big data. Such big data are both
valuable and challenging, because of their sheer volume. So much so that the
volume of data being created in the current 5 years from 2010 to 2015 will