as regular activity/production services. Although some of their tools and
processes are proprietary, they prove the feasibility of solving big
data problems at a global scale and significantly advance the development
of open source big data tools.
2.2.3.2 Velocity
Big data are often generated at high speed, including data generated by
arrays of sensors or multiple events; these data need to be processed in
real time or near real time, in a batch, or as streams (e.g., for visualization).
As an example, the LHC ATLAS detector [12] uses about 80 readout channels
and collects up to 1 PB of unfiltered data per second, which are reduced to
approximately 100 MB per second. These rates correspond to up to 40 million
collision events per second.
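The scale of this reduction is easier to appreciate as arithmetic. The
following is a minimal sketch that uses only the figures quoted above; the
decimal unit definitions (1 PB = 10^15 bytes, 1 MB = 10^6 bytes) and the
assumption of continuous, round-the-clock recording are ours, not the
source's, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the data rates quoted in the text.
# Assumption: decimal (SI) units, 1 PB = 10**15 bytes, 1 MB = 10**6 bytes.
PB = 10**15
MB = 10**6
TB = 10**12

raw_rate = 1 * PB         # unfiltered data, bytes per second
stored_rate = 100 * MB    # data remaining after reduction, bytes per second

# How many bytes are discarded for every byte that is kept.
reduction_factor = raw_rate // stored_rate

# Assumption: the reduced stream runs continuously for 24 hours.
seconds_per_day = 24 * 60 * 60
stored_per_day = stored_rate * seconds_per_day

print(f"reduction factor: {reduction_factor:,}")
print(f"stored per day:   {stored_per_day / TB:.2f} TB")
```

Under these assumptions, the reduced stream alone would still amount to
several terabytes per day, which illustrates why velocity drives both the
filtering and the storage design.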
Industry also provides numerous examples in which data registration,
processing, or visualization pose similar challenges.
2.2.3.3 Variety
Variety deals with the complexity of big data and of the information and
semantic models behind these data. As a result, data are collected as
structured, unstructured, semistructured, and mixed data. Data variety
imposes new requirements for data storage and database design, which should
adapt dynamically to the data format and, in particular, scale up and down.
Biodiversity research [17] provides a good example of data variety, which
results from collecting and processing information from a wide range of
sources and relating the collected information to species population,
genomic data, climate, satellite information, and more. Another example is
urban environment monitoring (also called “smart cities” [18]), which
requires operating, monitoring, and evolving numerous processes, individuals,
and associations.
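One way to picture storage that adapts to the data format is a common
envelope that tags each incoming item with its degree of structure before
it is stored. The sketch below is only an illustration under our own
assumptions: the Record type, the classify function, the three category
labels, and the sample inputs are all hypothetical and do not come from
the cited projects.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class Record:
    kind: str     # "structured", "semistructured", or "unstructured"
    payload: Any  # the data itself, parsed where possible


def classify(raw: Any) -> Record:
    """Wrap an incoming item in a common envelope (hypothetical scheme)."""
    if isinstance(raw, dict):
        # Already split into named fields, e.g. a database row.
        return Record("structured", raw)
    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            # Not parseable as JSON: treat it as free text.
            return Record("unstructured", raw)
        if isinstance(parsed, (dict, list)):
            # Self-describing text with nested fields.
            return Record("semistructured", parsed)
    # Anything else (bytes, bare numbers, and so on) is kept as-is.
    return Record("unstructured", raw)


# Made-up sample inputs, for illustration only.
samples = [
    {"species": "Ardea cinerea", "count": 2},
    '{"sensor": "t1", "readings": [3.1, 3.4]}',
    "Two herons seen near the river at dawn",
]
kinds = [classify(s).kind for s in samples]
print(kinds)
```

A real system would need a far richer notion of schema than this
three-way split, but the envelope shows how mixed inputs can share one
storage path while keeping their original form.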
Adopting data technologies in traditionally non-computer-oriented areas
such as psychology and behavior research, history, and archeology will
generate especially rich data sets.
2.2.3.4 Value
Value is an important feature of data, defined by the added value that the
collected data can bring to the intended process, activity, or predictive
analysis/hypothesis. Data value depends on the events or processes the data
represent, for example whether they are stochastic, probabilistic, regular,
or random. Depending on this, requirements may be imposed to collect all
data, to store the data for a longer period (in case of some possible event
of interest), and so on. In this respect, data value is closely related to
data volume and variety. The stock exchange financial data provide a good