archival. Amazon recommends Glacier for data that is infrequently accessed and for which
retrieval is not time critical.
OpenStack Swift
OpenStack is an open-source initiative to produce a ubiquitous cloud computing platform for
public and private clouds that is simple to implement, massively scalable, and rich in
features. OpenStack was founded by Rackspace Hosting and NASA. The aims of the
open-source initiative are to steer cloud computing toward common standards
and to prevent proprietary/vendor lock-in for cloud customers.
OpenStack supports both object and block storage options. Like Amazon's storage
offerings, it provides a fully distributed storage platform that is accessible via an API and
can be directly integrated into applications or used for backup and archival. Moreover,
redundancy and scalability are ensured through the use of clusters of standardized servers.
The OpenStack object storage system (Swift) is mainly a distributed storage system for
static data such as VM snapshots, backups, and archives. It is closely akin to Amazon's S3
storage technology. Objects are written to multiple drives across the data center. OpenStack
software takes care of data replication and integrity across the cluster. This means that in
the event of a drive failure, the OpenStack software is responsible for replicating data to a
healthy drive (self-healing).
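The replication and self-healing behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Swift's actual implementation: the class `ToyCluster`, the replica count of 3, and the hash-based drive ranking are all hypothetical choices made for illustration.

```python
import hashlib

REPLICAS = 3  # assumed replica count for this sketch


class ToyCluster:
    """Toy model of replicated object storage (illustrative, not Swift code)."""

    def __init__(self, drive_names):
        # Each drive is modeled as a dict mapping object name -> bytes.
        self.drives = {name: {} for name in drive_names}

    def _ranked_drives(self, obj_name):
        # Rank the live drives deterministically per object by hashing
        # the (drive, object) pair.
        def score(drive):
            return hashlib.md5(f"{drive}/{obj_name}".encode()).hexdigest()
        return sorted(self.drives, key=score)

    def put(self, obj_name, data):
        # Write the object to the first REPLICAS drives in its ranking.
        for drive in self._ranked_drives(obj_name)[:REPLICAS]:
            self.drives[drive][obj_name] = data

    def holders(self, obj_name):
        # Names of the live drives that currently hold a copy.
        return sorted(d for d, objs in self.drives.items() if obj_name in objs)

    def fail_drive(self, drive):
        # Drop the failed drive, then restore the replica count of
        # every object it held.
        lost = self.drives.pop(drive)
        for obj_name in lost:
            self._heal(obj_name)

    def _heal(self, obj_name):
        survivors = self.holders(obj_name)
        if not survivors:
            return  # every replica was lost; nothing to copy from
        data = self.drives[survivors[0]][obj_name]
        for drive in self._ranked_drives(obj_name):
            if len(self.holders(obj_name)) >= REPLICAS:
                break
            self.drives[drive].setdefault(obj_name, data)
```

For example, after `cluster = ToyCluster(["d1", "d2", "d3", "d4", "d5"])` and `cluster.put("snapshot.img", b"data")`, failing any drive listed in `cluster.holders("snapshot.img")` leaves the object on three healthy drives again.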
Features
The Swift object store provides the following features:
- Objects of up to 5 GB in size can be stored.
- A proxy server handles all incoming requests and ties the other servers together. For
  each request, it looks up the location of the account, container, or object and then
  routes the request accordingly. Failures are also handled by the proxy server.
- A container server is responsible for keeping track of listings of objects. It does not
  know where objects are located, only which objects are in a specific container.
- An account server is similar to a container server but keeps track of listings of
  containers rather than objects.
- An authorization server authenticates users and authorizes access to the cloud storage.
- Swift crawls through the stored data and replaces a bad file or object with a correct
  replica (self-healing).
- Object statistics are made available through tracking containers.
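The division of work among the proxy, container, and account servers can be sketched as follows. This is a simplified stand-in for the lookup step, not Swift's real ring: the names `ToyProxy`, `partition_for`, `drive_for`, the drive list, and the partition count are assumptions made for illustration only.

```python
import hashlib

PART_POWER = 4  # assumed: 2**4 = 16 partitions
DRIVES = ["drive-a", "drive-b", "drive-c", "drive-d"]  # hypothetical drives


def partition_for(account, container=None, obj=None):
    """Map an account/container/object path to a partition by hashing it."""
    path = "/" + "/".join(p for p in (account, container, obj) if p)
    digest = hashlib.md5(path.encode()).digest()
    # Keep the top PART_POWER bits of the first four bytes of the hash.
    return int.from_bytes(digest[:4], "big") >> (32 - PART_POWER)


def drive_for(partition):
    # Toy placement rule: spread partitions round-robin over the drives.
    return DRIVES[partition % len(DRIVES)]


class ToyProxy:
    """Routes requests and updates listings (illustrative only)."""

    def __init__(self):
        self.accounts = {}    # account -> set of container names
        self.containers = {}  # (account, container) -> set of object names
        self.objects = {}     # drive -> {(account, container, obj): data}

    def put_object(self, account, container, obj, data):
        # Proxy role: look up where the object belongs and route the write.
        drive = drive_for(partition_for(account, container, obj))
        self.objects.setdefault(drive, {})[(account, container, obj)] = data
        # Container role: record which objects a container holds.
        self.containers.setdefault((account, container), set()).add(obj)
        # Account role: record which containers an account holds.
        self.accounts.setdefault(account, set()).add(container)
        return drive

    def list_containers(self, account):
        return sorted(self.accounts.get(account, ()))

    def list_objects(self, account, container):
        return sorted(self.containers.get((account, container), ()))
```

Note that the listings store only names: `list_objects` can say what is in a container without knowing which drive holds each object, which mirrors the split between container servers and object placement described in the list above.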
Hadoop Distributed File System (HDFS)
The Hadoop Distributed File System (HDFS) is part of the Apache Hadoop Core Project:
http://hadoop.apache.org/core/
http://hadoop.apache.org/docs/r0.18.3/hdfs_design.html