Big Data Analysis Fields
Data analysis research can be divided into six key technical fields: structured
data analysis, text data analysis, website data analysis, multimedia data analysis,
network data analysis, and mobile data analysis. This classification emphasizes
data characteristics, although some of the fields share similar technologies.
Because data analysis has a broad scope that is difficult to cover comprehensively,
the following discussion focuses on the key problems and technologies in each field.
Structured Data Analysis
Business applications and scientific research can generate massive volumes of
structured data, whose management and analysis rely on mature commercial
technologies such as RDBMS, data warehouses, OLAP, and BPM (Business Process
Management) [ 4 ]. Analysis of such data is mainly based on data mining and
statistical analysis, both of which have been well studied over the past 30 years.
Data analysis is still a very active research field, and new application demands
drive the development of new methods. Statistical machine learning, based on
exact mathematical models and powerful algorithms, has been applied to anomaly
detection [ 5 ] and energy control [ 6 ]. By exploiting data characteristics,
temporal and spatial mining can extract knowledge structures, models, and patterns
hidden in high-speed data streams and sensor data [ 7 ]. Driven by privacy concerns
in e-commerce, e-government, and health care applications, privacy-preserving data
mining is an emerging research field [ 8 ]. Over the past decade, aided by the
growing availability of event data and by new process discovery and conformance
checking techniques, process mining has become a new research field, especially
for process analysis based on event data [ 9 ].
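As a rough illustration of the statistical approach to anomaly detection mentioned above, the following minimal sketch flags values that lie far from the sample mean. It is not the method of the cited works; the function name zscore_anomalies, the threshold value, and the sample readings are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def zscore_anomalies(values, threshold=2.5):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold.

    Illustrative only: a plain z-score test assumes roughly normal data
    and is itself sensitive to the outliers it is trying to find.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]


# Hypothetical sensor readings with one obvious spike.
readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 25.0, 10.0, 9.7, 10.3]
print(zscore_anomalies(readings))
```

With small samples the largest possible z-score is bounded by (n - 1) / sqrt(n), so a threshold of 3 could never fire on ten points; this is why the sketch defaults to 2.5. Practical systems usually prefer more robust statistics or learned models.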
Text Data Analysis
Text is the most common format for storing information, e.g., email communication,
business documents, web pages, and social media. Therefore, text analysis is
considered to have greater commercial potential than structured data mining.
Generally, text analysis, also called text mining, is the process of extracting
useful information and knowledge from unstructured text. Text mining is an
interdisciplinary problem involving information retrieval, machine learning,
statistics, computational linguistics, and, in particular, data mining. Most text
mining systems are based on text representation and natural language processing
(NLP), with more focus on the