Big Data Sources. Big data types include web and social
media, machine-to-machine, big transaction data, biometrics,
and human-generated data. This data may be structured,
semi-structured, or unstructured.
Big Data Ingestion. Big data ingestion technologies fall into a few
different categories:
1. Bulk data movement. Bulk data movement includes
   technologies such as ETL that extract data from one or
   more data sources, transform the data, and load the data
   into a target database.
2. Data replication. Replication technologies like change
   data capture can capture big data, such as utility smart
   meter readings, in near real time with minimal impact on
   the performance of the source system.
3. Data virtualization. Data virtualization is also known as
   data federation. Data virtualization allows an application
   to issue SQL queries against a virtual view of data held in
   heterogeneous sources such as relational databases,
   XML documents, and mainframe systems.
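
The extract, transform, and load steps described under bulk data movement can be sketched as follows. This is a minimal illustration, not a production pipeline or any particular vendor's tool: the CSV sample, the readings table, and the function names extract, transform, load, and run_etl are all hypothetical, and an in-memory SQLite database stands in for the target database.

```python
import csv
import io
import sqlite3

# Hypothetical source data: smart meter readings exported as CSV text.
RAW_CSV = """meter_id,reading_kwh,read_at
m-001,12.5,2024-01-01T00:00:00
m-002,,2024-01-01T00:00:00
m-001,13.0,2024-01-01T01:00:00
"""


def extract(csv_text):
    """Extract: parse the source into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: drop rows with a missing reading and convert kWh to Wh."""
    cleaned = []
    for row in rows:
        if not row["reading_kwh"]:
            continue  # skip incomplete readings
        watt_hours = float(row["reading_kwh"]) * 1000.0
        cleaned.append((row["meter_id"], watt_hours, row["read_at"]))
    return cleaned


def load(conn, records):
    """Load: write the transformed records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(meter_id TEXT, reading_wh REAL, read_at TEXT)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn, csv_text=RAW_CSV):
    """Run all three steps and return the number of rows in the target."""
    load(conn, transform(extract(csv_text)))
    return conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]


if __name__ == "__main__":
    target = sqlite3.connect(":memory:")  # stand-in for the target database
    print(run_etl(target))
```

Keeping the three steps as separate functions mirrors the description above: each stage can be swapped out (a different source, a different cleansing rule, a different target) without touching the others.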
Hadoop Distributions. Hadoop consists of a large number of
technologies, each with its own release schedule.
A number of vendors have created their own commercial
distributions of Apache Hadoop that have undergone release
testing and bundle product support and training. Most
enterprises that have deployed Hadoop for commercial use
have selected one of these distributions, such as Cloudera,
MapR, or Hortonworks.
Databases. Enterprises have the ability to select from multiple
database approaches:
1. NoSQL (“not only SQL”) databases are a category of
   database management systems that do not use SQL as
   their primary query language. These databases may
   not require fixed table schemas and often do not support
   join operations. They are typically optimized for
   highly scalable read-write operations rather than for
   consistency. NoSQL databases include a vast array of
   offerings such as Apache HBase, Apache Cassandra,
   MongoDB, Apache CouchDB, Couchbase, Riak, and
   Amazon DynamoDB. DataStax offers an enterprise
   edition of Cassandra that includes a Hadoop distribution
   and replaces HDFS with the Cassandra File System (CFS).
 