[Figure: the 5 Vs of Big Data. Volume: Terabytes, Records/Arch, Transactions, Tables, Files. Velocity: Batch, Real/near-time, Processes, Streams. Variety: Structured, Unstructured, Multi-factor, Probabilistic. Value: Statistical, Events, Correlations, Hypothetical. Veracity: Trustworthiness, Authenticity, Origin, Reputation, Availability, Accountability.]
FIGURE 2.1
Five Vs of big data.
and tools currently used. In e-science, the growth in data volume is driven
by advancements in both scientific instruments and SDI. In many areas, the
trend is to include data collections from all observed events, activities,
and sensors, which has become possible and is important for social activities
and social sciences.
Big Data volume includes such features as size, scale, amount, and dimension
for tera- and exascale data that record either data-rich processes or data
collected from many transactions and stored in individual files or databases.
All of these data need to be accessible, searchable, processable, and manageable.
Two examples from e-science illustrate different data characteristics
and different processing requirements:
• The Large Hadron Collider (LHC) [11, 12] produces on average 5 PB
(petabytes) of data a month, generated in a number of short
collisions, each of which is a unique event. The collected data are
filtered, stored, and extensively searched for single events that may
confirm a scientific hypothesis.
• The LOFAR (Low-Frequency Array) [13] is a radio telescope that
collects about 5 PB every hour; however, the data are processed by a
correlator, and only correlated data are stored.
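To put the two figures above on a common scale, the following minimal sketch converts them to average sustained data rates. It assumes decimal petabytes (1 PB = 10^15 bytes) and a 30-day month; neither assumption comes from the text, and the function and variable names are illustrative only. The LOFAR figure is the raw collection rate before correlation.

```python
# Assumptions (not from the source): 1 PB = 10**15 bytes, 1 month = 30 days.
PB = 10**15
SECONDS_PER_HOUR = 3600
SECONDS_PER_MONTH = 30 * 24 * SECONDS_PER_HOUR


def sustained_rate(bytes_total: float, seconds: float) -> float:
    """Return the average data rate in bytes per second."""
    return bytes_total / seconds


lhc_rate = sustained_rate(5 * PB, SECONDS_PER_MONTH)   # 5 PB per month
lofar_rate = sustained_rate(5 * PB, SECONDS_PER_HOUR)  # 5 PB per hour

print(f"LHC:   {lhc_rate / 1e9:.2f} GB/s")      # about 1.93 GB/s
print(f"LOFAR: {lofar_rate / 1e12:.2f} TB/s")   # about 1.39 TB/s
print(f"LOFAR/LHC ratio: {lofar_rate / lhc_rate:.0f}")  # 720
```

Under these assumptions the raw LOFAR stream is roughly 720 times faster than the average LHC output, which suggests why LOFAR reduces its data in a correlator before storage, while LHC data are filtered, stored, and searched afterward.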
In industry, global services providers such as Google [14], Facebook [15],
and Twitter [16] are producing, analyzing, and storing data in huge amounts