3.1 Definition of Big Data
Recently, not only business people but also researchers have been focusing on Big data,
which is defined by three Vs [31][32]:
Volume: large amounts of data
Variety: different forms of data, including traditional databases, images,
documents, and complex records
Velocity: data content that changes constantly through the absorption of
complementary data collections and through streaming data from multiple sources
These 3V definitions take the viewpoint of infrastructure, such as High Performance
Computing and parallel distributed processing. Research from this viewpoint is
essentially finished, because big ICT companies such as Google, Amazon, and Facebook
already operate these infrastructures as real systems. What we have to consider
instead is Big data infrastructure as a social problem. For example, Big data
infrastructures need a great deal of electric power. One of the important problems
is how to operate them without a tragedy such as Fukushima. However, each big
company is working on this problem. For example, Facebook plans to run a Big data
infrastructure on 100% wind-generated power by 2016 [33]. This is impossible in
Japan, at least, where the liberalization of electric power is insufficient.
Nevertheless, people continue to use the keyword "Big data". We computer scientists
should take this as a sign that something is needed. What are these needs? The
keyword "Small data" [34] gives us a hint.
Small data is defined in [34] as follows: Small data connect people with timely,
meaningful insights (derived from big data and/or "local" sources), organized and
packaged - often visually - to be accessible, understandable, and actionable for
everyday tasks. This definition resembles the advantages claimed in sales talk for
Big data. We consider that these needs amount to the question of how to analyze
non-schema data. In this context, volume, and the 3Vs in general, are not the issue:
the data may be large or small, and either may be sufficient.
To analyze data of the kind described above appropriately, a system has to map the
real world correctly into the cyber world. In the next section, I show the mapping
from the real world to the cyber world.
3.2 Mapping from Real World to Cyber World
One of the important elements of Big data analytics, including small data cases
such as [34], is a correct mapping from the real world into the cyber world. Fig. 1
shows the relationship between the real world and the cyber world. Sensors capture
the real-world situation as discrete data. However, the real world is continuous.
To analyze it correctly in the cyber world, a system has to work with continuous
values. Therefore, fitting or interpreting is very important.
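As a minimal sketch of this step, the following Python code recovers a continuous
estimate from discrete sensor samples. The text does not name a fitting method, so
linear interpolation is assumed here purely for illustration; the function name
interpolate and the temperature readings are hypothetical.

```python
from bisect import bisect_right


def interpolate(samples, t):
    """Estimate the value at time t from discrete (time, value) samples.

    Assumption: linear interpolation between neighbouring samples, and
    `samples` sorted by strictly increasing time.
    """
    times = [time for time, _ in samples]
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the sampled interval")
    i = bisect_right(times, t)      # index of the first sample later than t
    if i == len(times):             # t coincides with the last sample
        return samples[-1][1]
    (t0, v0), (t1, v1) = samples[i - 1], samples[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Hypothetical sensor readings: (seconds, degrees Celsius)
readings = [(0.0, 20.0), (10.0, 22.0), (20.0, 21.0)]
estimate = interpolate(readings, 5.0)   # a value between two samples
```

The sensor delivers values only at 0, 10, and 20 seconds, yet the function can be
queried at any instant in between, which is the sense in which the cyber world has
to treat the data as continuous. Other fitting methods, such as splines or
regression, could be substituted without changing the idea.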
The idea is simple. For example, CDs and a CD player can be used to listen to
music. CDs hold discrete sound data sampled from the real world. The music cannot
be heard without digital-to-analogue conversion by the CD player. This example
 