Big Data, Analytics, and Data Science Life Cycle - Getting Started with Greenplum for Big Data Analytics


We are talking about four discrete properties of data that require special tools, processes, and procedures to handle:

• Increased volumes (to the degree of petabytes, and so on)

• Increased availability/accessibility of data (more real time)

• Increased formats (different types of data)

• Increased messiness (noisy)

A paradigm shift is under way because we now have the technology to bring all of this together and analyze it.

Multi-structured data

In this section, we will discuss various data formats in the context of Big Data. Data is categorized into three main formats:

• Structured: Data that is represented in a strict format is called structured data; data stored in a relational database is the typical example. Structured data is organized in semantic chunks called entities. These entities are grouped, and relations can be defined. Each entity has fixed features called attributes. These attributes have a fixed data type, predefined length, constraints, default value definitions, and so on. One important characteristic of structured data is that all entities of the same group have the same attributes, format, and length, and follow the same order. Relational database management systems can hold this kind of data.

• Semi-structured: For some applications, data is collected in an ad hoc manner, and how it will be stored or processed is unknown at that stage. Though the data has a structure, it sometimes doesn't comply with the structure that the application expects. Here, different entities can have different structures, with no predefined structure. This kind of data is said to be semi-structured; examples include scientific data and bibliographic data. Graph data structures can hold this kind of data. Some characteristics of semi-structured data are listed as follows:
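
As an aside before that list, the contrast between the two formats described so far can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Customer` entity, its fields, and the sample bibliographic records are hypothetical and do not come from Greenplum or from any particular dataset.

```python
from dataclasses import dataclass, fields

# Structured: every entity in the group has the same attributes,
# in the same order, each with a fixed type and an optional default.
@dataclass
class Customer:  # hypothetical entity, for illustration only
    customer_id: int
    name: str
    country: str = "US"  # a default value definition

structured_rows = [
    Customer(1, "Ada"),
    Customer(2, "Grace", "UK"),
]

# Semi-structured: each record carries its own structure; attributes
# may be missing, extra, or nested differently from record to record.
# (Hypothetical bibliographic records.)
semi_structured_records = [
    {"title": "Paper A", "authors": ["X", "Y"], "year": 2012},
    {"title": "Book B", "publisher": {"name": "P", "city": "Q"}},
]

# Collect the distinct "shapes" (ordered attribute names) in each group.
structured_shapes = {
    tuple(f.name for f in fields(row)) for row in structured_rows
}
# For the dictionaries, a shape is the set of top-level keys.
semi_shapes = {frozenset(record) for record in semi_structured_records}

print(len(structured_shapes))  # number of distinct shapes among the rows
print(len(semi_shapes))        # number of distinct shapes among the records
```

All `Customer` rows share a single shape, whereas each of the two sample records has its own set of attributes, which is the property the definitions above describe.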
