Information Technology Reference
In-Depth Information
third-party solutions for document-based storage such as MongoDB, Aerospike, and so on.
These document-based “databases” implement the core characteristics of object-based
storage and can be either deployed on your own cloud instance or consumed as a hosted
solution offered by several startups. The company behind MongoDB, for example, offers a
hosted MongoDB service as well.
Let's get started with our example of Amazon S3 storage. S3 is a persistent data storage
offering from Amazon. It is one of the earliest and perhaps most popular implementations
of an object storage service on a public cloud.
Object Life Cycle on Amazon S3
The Amazon S3 RESTful API, which has client libraries for multiple languages like
JavaScript, Java, C#, and PHP, can be used to manage the complete life cycle of an object.
Objects are put into “buckets,” which are a kind of super object and need to have unique
names. No two users on the Amazon public cloud can have buckets with the same name,
because all buckets share a single flat address space. Object names, by contrast, only
need to be unique within their bucket, so two objects can have the same name in two
different buckets.
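The naming rules above can be illustrated with a small in-memory model. This is only a sketch: the `ObjectStore` class, its method names, and the bucket names are hypothetical and are not part of the S3 API. It simply shows bucket names being unique across all users while object names are scoped to a bucket.

```python
class BucketNameTaken(Exception):
    """Raised when a bucket name is already in use by any user."""


class ObjectStore:
    """Toy model of a flat bucket namespace (hypothetical, not the S3 API)."""

    def __init__(self):
        # A single dictionary for all users: bucket names are globally unique.
        self._buckets = {}

    def create_bucket(self, owner, name):
        if name in self._buckets:
            raise BucketNameTaken(name)
        self._buckets[name] = {"owner": owner, "objects": {}}

    def put(self, bucket, key, data):
        # Object names only need to be unique inside their own bucket.
        self._buckets[bucket]["objects"][key] = data

    def get(self, bucket, key):
        return self._buckets[bucket]["objects"][key]


store = ObjectStore()
store.create_bucket("alice", "alice-photos")
store.create_bucket("bob", "bob-photos")

# The same object name can exist in two different buckets.
store.put("alice-photos", "cat.jpg", b"alice's cat")
store.put("bob-photos", "cat.jpg", b"bob's cat")
```

In this model, a second call such as `store.create_bucket("bob", "alice-photos")` raises `BucketNameTaken`, mirroring the rule that no two users can hold a bucket with the same name.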
Object creation, read, and deletion operations can be performed through the S3
REST API. You can store an unlimited number of objects or data elements, at least
theoretically, with constant access time, which indicates true scalability of the platform.
The size of a single object is bounded, though generously: an object can be as small as
1 byte and as large as 5 terabytes. This limit is imposed by Amazon and may vary with
different public and private cloud providers. Because S3 is a managed service from Amazon,
you do not have access to fine-grained configuration such as setting the maximum size of an
object, but you can still define object storage policies and apply security-based
restrictions, as we'll discuss in the next sections.
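As a rough illustration of the create, read, and delete cycle and of the size bounds quoted above, here is a minimal in-memory sketch. The `Bucket` class and `validate_size` helper are hypothetical names, not S3 client calls, and reading "5 terabytes" as 5 × 1024⁴ bytes is an assumption. A Python dictionary stands in for the store because its lookups take constant time on average.

```python
MIN_OBJECT_SIZE = 1              # bytes, per the limits quoted in the text
MAX_OBJECT_SIZE = 5 * 1024 ** 4  # "5 terabytes", assumed here to mean 5 TiB


def validate_size(size_in_bytes):
    """Reject sizes outside the provider-imposed bounds."""
    if not MIN_OBJECT_SIZE <= size_in_bytes <= MAX_OBJECT_SIZE:
        raise ValueError(f"object size {size_in_bytes} is out of bounds")
    return size_in_bytes


class Bucket:
    """Toy bucket supporting the basic object life cycle (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self._objects = {}  # key -> bytes; average constant-time access

    def put(self, key, data):
        validate_size(len(data))
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        del self._objects[key]


bucket = Bucket("example-bucket")
bucket.put("notes.txt", b"hello")   # create
content = bucket.get("notes.txt")   # read
bucket.delete("notes.txt")          # delete
```

A real client would issue HTTP requests against the service instead of touching a local dictionary, but the sequence of operations is the same.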
Metadata
Instead of a centralized lookup table that stores metadata for every file or data element
in the storage system, the metadata of every object is embedded right with the object in
the flat address space. As a result, access times are not affected as the number of data
elements or blobs grows. Data organization is truly decentralized.
The way metadata is defined is also flat. Key-value pairs are used to define characteristics
of an object. The object storage backend does not discriminate between different types of
data since they are all treated as objects.
Associating metadata with objects stored on Amazon S3 or any other object storage
system enables application developers and engineers to define properties of objects.
Object storage systems generally do not cap the number of key-value pairs you can define
for a single object, although a provider may cap their total size (Amazon S3, for example,
limits user-defined metadata to 2 KB per object). More key-value pairs also mean larger
metadata and possibly increased latency when objects are accessed over the network or
the Internet.
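The idea of flat key-value metadata that travels with the object can be sketched as follows. The `StoredObject` class and `metadata_size` helper are hypothetical and only model the concept; they do not reflect how any particular provider encodes metadata.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object carries its own flat key-value metadata (hypothetical model)."""
    data: bytes
    metadata: dict = field(default_factory=dict)


def metadata_size(obj):
    """Approximate metadata size as the UTF-8 byte length of all keys and values."""
    return sum(
        len(key.encode("utf-8")) + len(str(value).encode("utf-8"))
        for key, value in obj.metadata.items()
    )


photo = StoredObject(data=b"...image bytes...")
photo.metadata["content-type"] = "image/jpeg"
size_with_one_pair = metadata_size(photo)

# Each additional pair increases the metadata that accompanies the object.
photo.metadata["camera"] = "example-model"
size_with_two_pairs = metadata_size(photo)
```

Because the metadata is stored with the object rather than in a central table, every extra pair adds to what is stored and transferred alongside the object, which is the latency trade-off described above.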
Some metadata is created by the provider at the time of object creation and maintained
throughout the life cycle of the object. This normally includes the creation/update date and
size of the object. Additional metadata may include a flag indicating whether you want the
data within the object to be encrypted before it's stored on the physical medium. Another